Pattern: Retrieval-Augmented Generation (RAG)

Category: Memory
Source: FOR-0012
Status: Documented

When to Use

Use this pattern when an agent needs to answer questions or produce content grounded in a specific knowledge base rather than relying solely on its training data. It is particularly useful for reducing hallucination and keeping responses factually grounded in source material.

How It Works

  • A knowledge base is prepared: documents are chunked, indexed, and stored with embeddings or metadata
  • When a query arrives, it is used to search the knowledge base for relevant chunks
  • The top matching chunks are retrieved and injected into the agent's context
  • The agent generates a response grounded in the retrieved content
  • Optionally, the response is validated against the source material
  • If retrieval quality is poor (no good matches), the agent signals low confidence or falls back
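
The steps above can be sketched in a few lines of Python. This is a minimal illustration and not a production design: bag-of-words counts stand in for real embeddings, an in-memory list stands in for a vector store, and the names (`chunk`, `build_index`, `retrieve`), the `min_score` threshold, and the sample documents are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def chunk(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]

def vectorize(text):
    """Term counts; a simple stand-in for an embedding model (assumption)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

def build_index(documents):
    """documents maps a source name to its text; returns (source, chunk, vector) entries."""
    return [
        (source, piece, vectorize(piece))
        for source, text in documents.items()
        for piece in chunk(text)
    ]

def retrieve(index, query, top_k=2, min_score=0.2):
    """Return up to top_k (score, source, chunk) hits scoring at least min_score."""
    qv = vectorize(query)
    scored = sorted(((cosine(qv, vec), source, piece) for source, piece, vec in index), reverse=True)
    return [hit for hit in scored[:top_k] if hit[0] >= min_score]

# Invented sample documents, for illustration only.
documents = {
    "handbook.txt": "Employees accrue fifteen vacation days per year after two years of service.",
    "benefits.txt": "The dental plan covers two cleanings per year.",
}
index = build_index(documents)

hits = retrieve(index, "vacation days after two years")
if hits:
    # These chunks would be injected into the agent's context.
    for score, source, piece in hits:
        print(f"{source}: {piece}")
else:
    # No chunk cleared the threshold: signal low confidence instead of guessing.
    print("low confidence: no relevant passages found")
```

An empty result from `retrieve` is the signal for the low-confidence path in the last step. Where to set the threshold depends on the similarity measure and the corpus, so the value 0.2 here is arbitrary.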

Example

Consider a digital talent that answers HR policy questions for a small business. The company's employee handbook, benefits documents, and policy memos are chunked and indexed. When an employee asks "How many vacation days do I get after 2 years?", the system retrieves the relevant policy section, and the agent answers from the actual document, citing the source.
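
One way to make the citation step in this example concrete is to number the retrieved passages in the prompt so the agent can refer to them. The sketch below assumes retrieval has already happened. The function name `build_prompt`, the prompt wording, the source label, and the placeholder passage text are all hypothetical and not taken from any real handbook or system.

```python
def build_prompt(question, passages):
    """Assemble a grounded prompt from (source, text) pairs chosen by retrieval.

    Returns None when there are no passages, so the caller can fall back
    instead of letting the agent answer without grounding.
    """
    if not passages:
        return None
    context = "\n".join(
        f"[{i}] ({source}) {text}"
        for i, (source, text) in enumerate(passages, start=1)
    )
    return (
        "Answer using only the numbered passages below and cite them like [1].\n"
        "If the passages do not contain the answer, say so.\n\n"
        f"{context}\n\n"
        f"Question: {question}"
    )

# Placeholder passage: the real text would come from the indexed handbook.
passages = [("employee-handbook", "PLACEHOLDER: vacation accrual policy text goes here.")]
prompt = build_prompt("How many vacation days do I get after 2 years?", passages)
print(prompt)
```

Keeping the source label next to each passage lets the agent cite where an answer came from, which is the transparency benefit this pattern offers.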

Tradeoffs

Pro | Con
Grounds responses in real data, reducing hallucination | Retrieval quality depends on chunking and indexing strategy
Knowledge base can be updated without retraining | Adds latency for the retrieval step
Agent can cite sources for transparency | Poor retrieval leads to irrelevant or missing context
Works with any document format once indexed | Setup cost for indexing and embedding infrastructure

Factory Usage

  • Implicit RAG: Factory agents read role.md, agent.md, and reference files before acting. This is a manual form of RAG: the agent retrieves relevant documents from the repo to ground its work.
  • Knowledge Manager role (Kai): Manages the knowledge base and indexing, which is the infrastructure side of RAG for the factory.
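
The implicit RAG bullet above could look roughly like the following in code. This is only a sketch: it assumes role.md and agent.md sit in a single directory, which the source does not state, and the function name `load_grounding` and the section-header format are hypothetical.

```python
from pathlib import Path

def load_grounding(repo_root, filenames=("role.md", "agent.md")):
    """Read grounding files into one context string.

    Returns (context, missing), where missing lists the files that were not
    found, so the caller can treat incomplete grounding as low confidence.
    """
    root = Path(repo_root)
    sections, missing = [], []
    for name in filenames:
        path = root / name
        if path.is_file():
            # Label each section with its file name so the agent can cite it.
            sections.append(f"--- {name} ---\n{path.read_text(encoding='utf-8')}")
        else:
            missing.append(name)
    return "\n\n".join(sections), missing
```

Calling `load_grounding(".")` from a repository root would return whatever it found plus the names of any missing files. Reference files could be passed through the `filenames` parameter.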