Obsidian_vault/50 - Resources/AI/LLMs.md
2026-06-07 14:00:01 +00:00

---
type: resource
status: mixed
tags: [ai, llm]
---
# 🧬 LLMs — Working Notes
> Practical knowledge of large language models in production.
## In my stack (verified)
- **LiteLLM** as the gateway abstraction (provider-agnostic) — used in [[Tia]]
- **MiniMax** (minimax-2.7) as a production model behind LiteLLM
- **pgvector** for embeddings/retrieval in [[Tia]]
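
A minimal sketch of the gateway idea behind the LiteLLM bullet above: callers use one function and a `provider/model` string, and the gateway decides which backend handles the request. This is not LiteLLM's code or [[Tia]]'s; `Gateway`, `echo_provider`, and `shout_provider` are hypothetical names made up for illustration, and the providers are stand-ins for real vendor SDK calls.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Message = Dict[str, str]
# A provider is any callable taking (model, messages) and returning reply text.
Provider = Callable[[str, List[Message]], str]


@dataclass
class Gateway:
    """Toy provider-agnostic gateway (hypothetical, for illustration only)."""

    providers: Dict[str, Provider]

    def complete(self, model: str, messages: List[Message]) -> str:
        # Split "provider/model" at the first slash and route on the prefix.
        provider_name, _, model_name = model.partition("/")
        if provider_name not in self.providers or not model_name:
            raise ValueError(f"unknown or malformed model id: {model!r}")
        return self.providers[provider_name](model_name, messages)


def echo_provider(model: str, messages: List[Message]) -> str:
    # Stand-in for a real vendor SDK call: echoes the last message.
    return f"[{model}] {messages[-1]['content']}"


def shout_provider(model: str, messages: List[Message]) -> str:
    # A second stand-in backend with different behavior.
    return messages[-1]["content"].upper()


gateway = Gateway(providers={"echo": echo_provider, "shout": shout_provider})
```

Swapping backends is then a one-string change at the call site (`"echo/m1"` to `"shout/m1"`), which is the main reason to abstract the provider.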
## Topics to capture
- [ ] Gateway pattern (LiteLLM) — why abstract the provider
- [ ] Cost/latency: p95 latency tracking (Tia's `/admin/ai` uses the SQL `percentile_cont` aggregate)
- [ ] Embeddings + pgvector retrieval recipe
- [ ] Model selection trade-offs
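
For the p95 item above, a rough sketch of what `percentile_cont` computes: in SQL it is an ordered-set aggregate, e.g. `percentile_cont(0.95) WITHIN GROUP (ORDER BY latency_ms)`, that linearly interpolates between the two nearest sorted values. The column name `latency_ms` and the sample data are assumptions, not taken from Tia. One deliberate difference: the SQL aggregate returns NULL for an empty set, while this sketch raises.

```python
import math
from typing import Sequence


def percentile_cont(values: Sequence[float], fraction: float) -> float:
    """Continuous percentile with linear interpolation between neighbors."""
    if not values:
        raise ValueError("values must not be empty")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    # Fractional zero-based position of the requested percentile.
    pos = fraction * (len(ordered) - 1)
    lo = math.floor(pos)
    hi = math.ceil(pos)
    if lo == hi:
        return float(ordered[lo])
    # Interpolate between the two surrounding values.
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


# Hypothetical request latencies in milliseconds: 100, 200, ..., 1000.
latencies_ms = list(range(100, 1100, 100))
p95 = percentile_cont(latencies_ms, 0.95)
```

With these ten samples the p95 lands between the two slowest requests (900 and 1000 ms), so a single outlier moves it less than it would move the max.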
## Related
[[Agent Architectures]] · [[Prompt Engineering]] · [[Tia - Architecture]] · [[MOC - AI]]