---
type: resource
status: mixed
tags: [ai, llm]
---
# 🧬 LLMs — Working Notes
> Practical knowledge of large language models in production.

## In my stack (verified)
- **LiteLLM** as the gateway abstraction (provider-agnostic) — used in [[Tia]]
- **MiniMax** (minimax-2.7) as a production model behind LiteLLM
- **pgvector** for embeddings/retrieval in [[Tia]]
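A minimal in-memory sketch of what cosine-distance retrieval does, to keep the idea concrete until the real recipe is written up. pgvector's `<=>` operator returns cosine distance; everything else here (the `docs` dictionary, the note ids, the 2-dimensional vectors, the `top_k` helper) is made up for illustration and is not taken from [[Tia]].

```python
import math


def cosine_distance(a, b):
    """Cosine distance (1 - cosine similarity) between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_a * norm_b)


def top_k(query, docs, k):
    """Return the ids of the k stored vectors closest to the query."""
    scored = [(cosine_distance(query, vec), doc_id) for doc_id, vec in docs.items()]
    scored.sort()  # smallest distance first
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical toy embeddings; real ones have hundreds of dimensions.
docs = {
    "note-a": [1.0, 0.0],
    "note-b": [0.0, 1.0],
    "note-c": [0.9, 0.1],
}
nearest = top_k([1.0, 0.0], docs, 2)
print(nearest)
```

In the database the same ranking would typically come from an `ORDER BY` on the distance operator with a `LIMIT`, so the sort happens next to the data instead of in application code.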

## Topics to capture
- [ ] Gateway pattern (LiteLLM) — why abstract the provider
- [ ] Cost/latency: p95 latency tracking (Tia's `/admin/ai` computes it with SQL `percentile_cont`)
- [ ] Embeddings + pgvector retrieval recipe
- [ ] Model selection trade-offs
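
For the p95 item, a rough Python sketch of the interpolation that a continuous percentile such as `percentile_cont` performs: sort the values, find the fractional rank, and interpolate linearly between the two neighbouring values. The function below is a stand-in written for this note, and the latency samples are invented, not taken from Tia's `/admin/ai`.

```python
import math


def percentile_cont(values, fraction):
    """Continuous percentile: linear interpolation between neighbouring sorted values."""
    if not values:
        raise ValueError("values must be non-empty")
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    ordered = sorted(values)
    rank = fraction * (len(ordered) - 1)  # zero-based fractional position
    lower = math.floor(rank)
    upper = math.ceil(rank)
    weight = rank - lower  # how far the rank sits between the two neighbours
    return ordered[lower] + (ordered[upper] - ordered[lower]) * weight


# Hypothetical request latencies in milliseconds.
latencies_ms = [120, 95, 310, 150, 101, 980, 133, 140, 99, 160]
p95 = percentile_cont(latencies_ms, 0.95)
median = percentile_cont(latencies_ms, 0.5)
print(f"p50={median:.1f} ms, p95={p95:.1f} ms")
```

With only ten samples the p95 is pulled strongly toward the single slowest request, which is a reason to track it over a reasonably large window.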

## Related
[[Agent Architectures]] · [[Prompt Engineering]] · [[Tia - Architecture]] · [[MOC - AI]]