Accepting an internal-docs RAG package: a practical handoff checklist

You received a RAG solution bundle: Lambda sources, scripts, possibly S3 or vector-index exports, and a suggested reading order for the docs. Before you promise a go-live date, walk through a short acceptance checklist. The tables below list typical questions to ask; adapt region names, account IDs, and service choices to your environment. Acceptance walkthrough (activity) ...

April 7, 2026 · 2 min · Me

From messy folders to vectors: an ingestion mindset for policy RAG

Retrieval quality in an internal policy RAG is rarely improved by swapping the chat model first. It is usually capped by how documents enter the system: file types, chunk boundaries, stable identifiers, and a repeatable path from source object to vector index. In practice the pieces are wired together the same way again and again: batch jobs or Lambdas, object storage for artifacts, and a managed vector service. ...

April 7, 2026 · 2 min · Me
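The excerpt above stresses stable identifiers and a repeatable path from source object to vector index. A minimal sketch of that idea, assuming fixed-size character chunking and a hash-based chunk ID (the function names, sizes, and key format here are illustrative, not from the post):

```python
import hashlib


def chunk_text(text, size=800, overlap=100):
    """Fixed-size character chunks with overlap.

    Real pipelines often split on headings or paragraphs instead;
    the point is that each chunk carries its start offset.
    """
    chunks, start = [], 0
    step = size - overlap
    while start < len(text):
        chunks.append((start, text[start:start + size]))
        start += step
    return chunks


def chunk_id(source_key, start):
    # Stable identifier: the same source object + offset yields the same
    # ID on every re-ingest, so an updated document overwrites its old
    # vectors in the index instead of duplicating them.
    return hashlib.sha256(f"{source_key}:{start}".encode()).hexdigest()[:16]
```

With this scheme, re-running ingestion after a document edit upserts by ID rather than appending, which keeps the vector index from accumulating stale duplicates.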

LLM-as-judge for RAG: what to score, what to distrust

LLM-as-judge adds scale when human reviewers cannot read every RAG interaction. A common pattern: after the answer path returns, enqueue the question, answer, and retrieved context; a worker Lambda runs a judge prompt; results land in a database for analytics.

What judges are good for:

| Use | Reason |
| --- | --- |
| Trend monitoring | Average scores or failure flags shifting after a deploy |
| Sampling for humans | Pull low-scoring rows for manual review |
| Regression alarms | Chunk size, top-k, or model changes moving the distribution |

Judges are cheap sensors, not auditors. ...

April 7, 2026 · 2 min · Me
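The excerpt above describes the worker side of the loop: dequeue a payload, run a judge prompt, write a row for analytics. A minimal sketch of that worker body, assuming the payload shape, prompt text, score field, and threshold shown here (all illustrative; the queue and table wiring are left out):

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class JudgeRecord:
    """One evaluation row: what the worker writes to the results table."""
    question: str
    answer: str
    faithfulness: float  # judge score in [0, 1]
    flagged: bool        # cheap sensor: route low scores to human review


def build_judge_prompt(question, answer, context):
    # The judge grades against the same context the answerer saw,
    # so a low score means "unsupported by this context", not "wrong".
    return (
        "You are grading a RAG answer.\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer: {answer}\n"
        'Reply with JSON: {"faithfulness": <0..1>}'
    )


def handle_message(body, judge_fn, threshold=0.6):
    """Worker body: parse the queued payload, call the judge, emit a record."""
    payload = json.loads(body)
    raw = judge_fn(build_judge_prompt(
        payload["question"], payload["answer"], payload["context"]))
    score = float(json.loads(raw)["faithfulness"])
    return asdict(JudgeRecord(
        question=payload["question"],
        answer=payload["answer"],
        faithfulness=score,
        flagged=score < threshold,
    ))
```

Because `judge_fn` is injected, the same handler can be exercised in tests with a stub and in production with a real model call; the flag, not the raw score, is what feeds the human-sampling and alarm uses in the table.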

Internal-docs RAG: chat ingress, vector search, and an async judge loop

This note describes an internal RAG pattern for policy and handbook-style documents: employees ask questions in a familiar chat surface, the backend retrieves by semantic similarity, and a separate evaluation path scores answers for quality and traceability. The layout maps cleanly to typical AWS building blocks (API Gateway, Lambdas, object storage, a vector index, DynamoDB, and a queue). ...

April 4, 2026 · 4 min · Me
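The excerpt above separates a synchronous answer path (retrieve by similarity, build a prompt) from a fire-and-forget evaluation path. A minimal end-to-end sketch of that split, using brute-force cosine top-k over an in-memory list as a stand-in for the managed vector service and a plain list as a stand-in for the queue (all names and shapes here are illustrative):

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def retrieve(query_vec, index, top_k=3):
    """Stand-in for the vector service: brute-force cosine top-k."""
    ranked = sorted(index, key=lambda item: cosine(query_vec, item["vector"]),
                    reverse=True)
    return ranked[:top_k]


def answer_path(question, embed, index, llm, eval_queue):
    """Synchronous answer path; evaluation is enqueued, never awaited."""
    hits = retrieve(embed(question), index)
    context = "\n---\n".join(h["text"] for h in hits)
    answer = llm(f"Answer from the context only.\n{context}\n\nQ: {question}")
    # The async judge loop picks this payload up later.
    eval_queue.append({"question": question, "answer": answer,
                       "context": context})
    return answer
```

Keeping the judge off the answer path means evaluation latency and failures never block the employee-facing chat response, which is the traceability-without-latency trade the post describes.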