Notes
Notes on AI systems, retrieval, and the work that starts after the demo.
Practical writing on retrieval, evals, observability, constraints, and the engineering work that starts when the demo is no longer the hard part.
Working Under Repeated Alarms
A short note from Israel on what repeated alarms do to attention, engineering judgment, and team habits, and on which working practices make interruption easier to absorb.
How to Build Legal Answering Systems That Can Be Trusted
A practical blueprint for legal QA, shaped in part by work around the Agentic RAG Legal Challenge: document identity, hybrid retrieval, structured answers, page-level grounding, telemetry, and evals.
LLM Product Safety Without Theater
A practical guide to LLM product safety: prompt injection, excessive agency, unsafe outputs, evals, and sober boundaries.
Interface Design for Serious Products
A practical memo on calm authority, visible product care, restrained motion, and why trustworthy interfaces feel expensive.
Getting AI-Assisted Development to Green Without Breaking the Code
Repair loops, small diffs, test trust, and how to get CI back to green without trashing the codebase.
Building Agentic AI Systems That Hold Up
Practical guidance on tool contracts, context engineering, evals, approvals, and telemetry.
Which Query Transformation Techniques Actually Help RAG?
Query rewrite, decomposition, step-back prompting, HyDE, fusion, and when each one is worth the extra latency.
Preventing Hallucinations in LLM Systems
Reducing hallucinations through better retrieval, abstention, verification, evals, and guardrails.
Most RAG Failures Start in the Documents
Chunking, titles, metadata, parent-child structure, reranking, and corpus QA for RAG systems.
Spec-Driven Development: The Workflow I Actually Use
How I use a lightweight spec-driven workflow in real projects, what SDDRush automates, and where Kotef fits if you want a stronger agent layer.
How to Run LLM Evals in Production
Gold sets, graders, trace checks, online signals, and release gates for LLM systems in production.
Prompt Engineering: From Phrasing to Policy
Prompt design now means response formats, examples, tools, and eval loops, not incantations.
BI Storytelling That Actually Moves Decisions
How to make BI pages support decisions through narrative, visual hierarchy, and trust.