Notes
Notes on retrieval, evals, observability, and the engineering that starts once the demo is the easy part.
A practical blueprint for legal QA, shaped in part by work around the Agentic RAG Legal Challenge: document identity, hybrid retrieval, structured answers, page-level grounding, telemetry, and evals.
How to reduce hallucinations in LLM systems with better retrieval, abstention, verification, evals, and guardrails.
LLM evals for continuous delivery: turn production failures into automated tests, grade traces with task-specific graders, and block bad releases with eval-driven gates.