RAG on Technical Blog

Accepting an internal-docs RAG package: a practical handoff checklist

Tue, 07 Apr 2026 00:00:00 +0000

You have received a RAG solution bundle: Lambda sources, scripts, possibly S3 or vector-index exports, and a reading order for the docs. Before you promise a go-live date, walk through a short acceptance checklist. The tables below list typical questions to ask; rename regions, accounts, and services to match yours.

Acceptance walkthrough (activity)

From messy folders to vectors: an ingestion mindset for policy RAG

Tue, 07 Apr 2026 00:00:00 +0000

Retrieval quality in an internal policy RAG is rarely fixed by swapping the chat model first. It is usually capped by how documents enter the system: file types, chunk boundaries, stable identifiers, and a repeatable path from source object to vector index. In practice the same wiring shows up again and again: batch jobs or Lambdas for processing, object storage for artifacts, and a managed vector service for the index.
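To make "chunk boundaries" and "stable identifiers" concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the function names (`chunk_text`, `chunk_id`, `build_records`), the fixed character window, and the ID format are hypothetical, not taken from any particular ingestion package. The point is only that the same source object should always produce the same chunk IDs, so a re-run overwrites vectors instead of duplicating them.

```python
import hashlib


def chunk_text(text: str, max_chars: int = 400, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows (a deliberately simple boundary rule)."""
    if max_chars <= overlap:
        raise ValueError("max_chars must be larger than overlap")
    step = max_chars - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_chars])
        if start + max_chars >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def chunk_id(source_key: str, index: int, chunk: str) -> str:
    """Derive a stable ID from the source key, chunk position, and chunk content."""
    payload = f"{source_key}\n{index}\n{chunk}".encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    return f"{source_key}#{index}#{digest[:12]}"


def build_records(source_key: str, text: str) -> list[dict]:
    """Produce one record per chunk, ready to be embedded and upserted (hypothetical shape)."""
    return [
        {"id": chunk_id(source_key, i, c), "source": source_key, "chunk_index": i, "text": c}
        for i, c in enumerate(chunk_text(text))
    ]
```

Because the ID includes a content hash, an edited document yields new IDs for the chunks that changed, which makes stale vectors easy to spot. A real pipeline would likely split on headings or paragraphs rather than raw character counts.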

LLM-as-judge for RAG: what to score, what to distrust

Tue, 07 Apr 2026 00:00:00 +0000

LLM-as-judge adds scale when human reviewers cannot read every RAG interaction. A common pattern: after the answer path returns, place the question, answer, and retrieved context on a queue; a worker Lambda runs a judge prompt; results land in a database for analytics.
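The queue-and-worker pattern above can be sketched as follows. This is an illustration under stated assumptions, not a reference implementation: the message fields, the prompt wording, the 1-to-5 score scale, and the names `build_judge_message` and `handle_message` are all hypothetical, and the actual model call is passed in as a plain callable (`call_judge`) so the sketch stays independent of any SDK.

```python
import json
from typing import Callable

# Hypothetical judge prompt; doubled braces are literal braces after str.format().
JUDGE_PROMPT = (
    "Rate how well the answer is supported by the context.\n"
    "Question: {question}\nAnswer: {answer}\nContext:\n{context}\n"
    'Reply with JSON only: {{"score": <1-5>, "reason": "<short>"}}'
)


def build_judge_message(request_id: str, question: str, answer: str, contexts: list[str]) -> str:
    """Serialize one RAG interaction as the queue message body."""
    return json.dumps(
        {"request_id": request_id, "question": question, "answer": answer, "contexts": contexts}
    )


def handle_message(body: str, call_judge: Callable[[str], str]) -> dict:
    """Worker step: build the prompt, call the judge, and return a row for the results table."""
    msg = json.loads(body)
    prompt = JUDGE_PROMPT.format(
        question=msg["question"],
        answer=msg["answer"],
        context="\n---\n".join(msg["contexts"]),
    )
    raw = call_judge(prompt)
    try:
        parsed = json.loads(raw)
        score = int(parsed["score"])
        if not 1 <= score <= 5:
            raise ValueError("score out of range")
        reason, status = str(parsed.get("reason", "")), "ok"
    except (ValueError, KeyError, TypeError):
        # Keep malformed judge output visible instead of silently dropping the row.
        score, reason, status = None, raw[:200], "parse_error"
    return {"request_id": msg["request_id"], "score": score, "reason": reason, "status": status}
```

Recording a `parse_error` status rather than discarding bad output keeps judge failures countable, which matters when the judge itself is one of the things you should distrust.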

What judges are good for

Use	Reason
Trend monitoring	Average scores or failure flags shifting after a deploy
Sampling for humans	Pull low-scoring rows for manual review
Regression alarms	Changes to chunk size, top-k, or model that shift the score distribution

Judges are cheap sensors, not auditors.
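As a rough illustration of the three uses in the table, the sketch below compares mean judge scores before and after a deploy and pulls the lowest-scoring rows for human review. The row shape (a dict with a `score` key), the thresholds, and the function names are assumptions chosen for the example; a mean-difference check is only one simple way to notice a shift.

```python
from statistics import mean


def mean_score(rows: list[dict]):
    """Average the numeric scores, ignoring rows where the judge produced no score."""
    scored = [r["score"] for r in rows if r.get("score") is not None]
    return mean(scored) if scored else None


def regression_alarm(before: list[dict], after: list[dict], max_drop: float = 0.5) -> bool:
    """Flag a deploy when the mean score falls by more than max_drop (hypothetical threshold)."""
    b, a = mean_score(before), mean_score(after)
    if b is None or a is None:
        return False  # not enough data to say anything
    return (b - a) > max_drop


def sample_for_review(rows: list[dict], threshold: int = 2, limit: int = 20) -> list[dict]:
    """Return the lowest-scoring rows, worst first, for manual review."""
    low = [r for r in rows if r.get("score") is not None and r["score"] <= threshold]
    return sorted(low, key=lambda r: r["score"])[:limit]
```

An alarm here is a prompt to look, not a verdict: the sampled rows are what a human actually reads before deciding whether the deploy made things worse.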

Internal-docs RAG: chat ingress, vector search, and an async judge loop

Sat, 04 Apr 2026 00:00:00 +0000

This note describes an internal RAG pattern for policy and handbook-style documents: employees ask questions in a familiar chat surface, the backend retrieves by semantic similarity, and a separate evaluation path scores answers for quality and traceability. The layout maps cleanly to typical AWS building blocks (API Gateway, Lambdas, object storage, a vector index, DynamoDB, and a queue).
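For readers new to "retrieves by semantic similarity", here is a toy in-memory sketch of the retrieval step using cosine similarity. It assumes embeddings already exist as plain lists of floats and that the index is a list of dicts with `id` and `vector` keys; both are hypothetical stand-ins for whatever the embedding model and managed vector index actually provide.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: list[float], index: list[dict], k: int = 3) -> list[tuple[float, str]]:
    """Score every indexed chunk against the query and return the k best (score, id) pairs."""
    scored = [(cosine(query_vec, item["vector"]), item["id"]) for item in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]
```

A managed vector service does the same ranking with approximate search over far more vectors; the returned IDs are what ties an answer back to its source chunks for traceability.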