April 8, 20269 min read

A 12-point RAG-in-production checklist

What we run through before a retrieval-augmented feature goes anywhere near a customer. Evals, guardrails, cost ceilings, and the sharp edges nobody talks about.

Most retrieval-augmented features are easy to demo and hard to operate. The first version works because the document corpus is small and the prompts are loose. The hundredth version fails because the corpus drifted, the prompts got tuned to a moment in time, and nobody owns the eval pipeline.

We use a 12-point checklist on every RAG feature before it leaves staging. It's intentionally boring: ground-truth dataset, eval harness, retrieval recall floor, generation quality floor, cost ceiling per request, latency budget, refusal rate, fallback path, audit log shape, redaction boundary, escalation path, and rollback plan.

The point of the list isn't to slow things down. It's to force the team to write down the conditions under which the feature should be turned off — before anyone has the political reason not to.

AIEngineering

All insights

Ready when you are

Want this kind of thinking on your project?

A 30-minute call is enough to see if we're a fit.

Book a discovery call See our process