Notes

Notes on AI systems: retrieval, evals, observability, and the engineering that starts after the demo.

3 posts

Mar 10, 2026

How to Build Legal Answering Systems That Can Be Trusted

A practical blueprint for legal QA, shaped in part by work around the Agentic RAG Legal Challenge. It covers document identity, hybrid retrieval, structured answers, page-level grounding, telemetry, and evals.

Feb 18, 2026

5 min read

Preventing Hallucinations in LLM Systems

How to reduce hallucinations in LLM systems with better retrieval, abstention, verification, evals, and guardrails.

Feb 3, 2026

6 min read

How to Run LLM Evals in Production

LLM evals for continuous delivery: turn production failures into automated tests, grade traces with task-specific graders, and block bad releases with eval-driven gates.