21 lines
711 B
Markdown
21 lines
711 B
Markdown
---
|
|
type: resource
|
|
status: mixed
|
|
tags: [ai, llm]
|
|
---
|
|
# 🧬 LLMs — Working Notes
|
|
> Practical knowledge of large language models in production.
|
|
|
|
## In my stack (verified)
|
|
- **LiteLLM** as the gateway abstraction (provider-agnostic) — used in [[Tia]]
|
|
- **MiniMax** (minimax-2.7) as a production model behind LiteLLM
|
|
- **pgvector** for embeddings/retrieval in [[Tia]]
|
|
|
|
## Topics to capture
|
|
- [ ] Gateway pattern (LiteLLM) — why abstract the provider
|
|
- [ ] Cost/latency: p95 tracking (Tia's `/admin/ai` uses `percentile_cont`)
|
|
- [ ] Embeddings + pgvector retrieval recipe
|
|
- [ ] Model selection trade-offs
|
|
|
|
## Related
|
|
[[Agent Architectures]] · [[Prompt Engineering]] · [[Tia - Architecture]] · [[MOC - AI]]
|