711 B
711 B
| type | status | tags | ||
|---|---|---|---|---|
| resource | mixed |
|
🧬 LLMs — Working Notes
Practical knowledge of large language models in production.
In my stack (verified)
- LiteLLM as the gateway abstraction (provider-agnostic) — used in Tia
- MiniMax (minimax-2.7) as a production model behind LiteLLM
- pgvector for embeddings/retrieval in Tia
Topics to capture
- Gateway pattern (LiteLLM) — why abstract the provider
- Cost/latency: p95 tracking (Tia's
/admin/aiusespercentile_cont) - Embeddings + pgvector retrieval recipe
- Model selection trade-offs
Related
Agent Architectures · Prompt Engineering · Tia - Architecture · MOC - AI