How it works
The user query is converted to an embedding, which is used for a nearest-neighbor search in a vector database. The top-k chunks are retrieved and inserted into the prompt as context, and the LLM generates an answer grounded in those chunks. The pipeline is often combined with a reranker for higher precision.
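To make the pipeline concrete, here is a minimal sketch in Python. It is illustrative only: embed and generate are toy stand-ins for a real embedding model and LLM call, and the vector database is replaced by an in-memory NumPy matrix searched by cosine similarity. A reranker, if used, would re-score the retrieved candidates before prompt assembly.

```python
import zlib
import numpy as np

def embed(text: str) -> np.ndarray:
    # Toy stand-in for a real embedding model: a deterministic
    # pseudo-random vector seeded by the text. Swap in your
    # provider's embedding call in practice.
    rng = np.random.default_rng(zlib.crc32(text.encode()))
    return rng.standard_normal(64)

def generate(prompt: str) -> str:
    # Toy stand-in for the LLM call; a real system would send the
    # prompt to a model and return its completion.
    return f"[LLM completion for prompt of {len(prompt)} chars]"

def top_k(query: str, chunks: list[str], index: np.ndarray, k: int = 3) -> list[str]:
    # Nearest-neighbor search: cosine similarity of the query embedding
    # against unit-normalized chunk embeddings.
    q = embed(query)
    scores = index @ (q / np.linalg.norm(q))
    best = np.argsort(scores)[::-1][:k]   # indices of the k most similar chunks
    return [chunks[i] for i in best]

def answer(query: str, chunks: list[str], index: np.ndarray) -> str:
    # Insert the retrieved chunks into the prompt as context, then generate.
    context = "\n\n".join(top_k(query, chunks, index))
    prompt = (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
    return generate(prompt)
```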
Example
An internal Q&A agent for a SaaS company indexes the help center docs, employee handbook, and product specs as embeddings. When an employee asks "What's our refund policy?", the agent retrieves the relevant policy chunks and grounds its answer in them, citing sources.
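Continuing the sketch above, the agent's indexing and query flow might look like the following. The document names and contents are invented for the example; prefixing each chunk with its source path is one simple way to let the model cite sources.

```python
# Illustrative only: document names and contents are invented.
docs = {
    "help_center/refunds.md": "Refunds are issued within 30 days of purchase.",
    "handbook/pto.md": "Employees accrue 1.5 PTO days per month.",
    "specs/billing.md": "Billing runs on the first business day of the month.",
}
# Prefix each chunk with its source path so the model can cite it.
chunks = [f"[source: {path}]\n{text}" for path, text in docs.items()]
index = np.stack([embed(c) for c in chunks])
index /= np.linalg.norm(index, axis=1, keepdims=True)  # unit-normalize rows

print(answer("What's our refund policy?", chunks, index))
```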
