---
type: resource
status: mixed
tags: [ai, llm]
---

# 🧬 LLMs: Working Notes

> Practical knowledge of large language models in production.

## In my stack (verified)

- **LiteLLM** as the gateway abstraction (provider-agnostic), used in [[Tia]]
- **MiniMax** (minimax-2.7) as a production model behind LiteLLM
- **pgvector** for embeddings/retrieval in [[Tia]]

## Topics to capture

- [ ] Gateway pattern (LiteLLM): why abstract the provider
- [ ] Cost/latency: p95 tracking (Tia's `/admin/ai` uses `percentile_cont`)
- [ ] Embeddings + pgvector retrieval recipe
- [ ] Model selection trade-offs

## Related

[[Agent Architectures]] · [[Prompt Engineering]] · [[Tia - Architecture]] · [[MOC - AI]]
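
## Sketch: what `percentile_cont` computes for p95

A note to self on the p95 item above: PostgreSQL's `percentile_cont` is the continuous percentile, which interpolates linearly between the two nearest ordered values instead of picking an existing row. The Python below is only a model of that arithmetic, not the query behind `/admin/ai`; the function name mirrors the SQL aggregate for readability, and `latencies_ms` is made-up sample data.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile with linear interpolation.

    Models the arithmetic of PostgreSQL's percentile_cont aggregate.
    Illustrative only: this is not code from Tia.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    if not values:
        # The SQL aggregate yields NULL when there are no input rows.
        return None

    ordered = sorted(values)
    # Fractional position in the sorted list (0-based).
    pos = fraction * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    # Interpolate between the two neighbouring values.
    return ordered[lo] + (pos - lo) * (ordered[hi] - ordered[lo])


# Hypothetical request latencies in milliseconds.
latencies_ms = [120, 95, 310, 150, 101]
p95 = percentile_cont(latencies_ms, 0.95)
print(f"p95 is roughly {p95:.1f} ms")
```

With five samples the p95 position falls between the fourth and fifth sorted values, so the result (about 278 ms here) is not a latency that any single request had. Worth remembering when reading the dashboard at low traffic, where one slow call dominates the number.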