RAG
- Important dates:
- Application engineering inflection point (OpenAI DevDay, Retrieval capability announcement): 2023-11-06
- Academic starting point (RAG paper arXiv v1): 2020-05-22
- Note: The date in this entry uses the application-engineering inflection point, while the academic origin is listed in the body.
What It Is
RAG (Retrieval-Augmented Generation) is an architecture that combines an external retriever with a generator model.
The core idea from the original paper is that generation should not depend only on model parameters. A retriever fetches relevant content from an external corpus, and the generator produces responses grounded in that retrieved evidence, combining parametric and non-parametric memory.
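The retrieve-then-generate flow described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the paper's implementation: term-overlap cosine scoring stands in for a learned dense retriever, `generate` is a hypothetical stub where a real generator model would be called, and the corpus, function names, and prompt layout are all invented for the example.

```python
import math
import re
from collections import Counter

# Toy external corpus (non-parametric memory). Contents are illustrative.
CORPUS = [
    "The RAG paper first appeared on arXiv on 2020-05-22.",
    "OpenAI DevDay took place on 2023-11-06.",
    "Cosine similarity compares the angle between two vectors.",
]

def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query, corpus, k=2):
    """Return up to k passages with non-zero similarity to the query."""
    q = Counter(tokenize(query))
    scored = [(cosine(q, Counter(tokenize(p))), p) for p in corpus]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [p for score, p in scored[:k] if score > 0]

def build_prompt(query, passages):
    """Place retrieved evidence ahead of the question so the answer can cite it."""
    evidence = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(passages))
    return f"Evidence:\n{evidence}\n\nQuestion: {query}\nAnswer using the evidence above."

def generate(prompt):
    """Hypothetical stub for the generator (parametric memory)."""
    return "MODEL OUTPUT FOR PROMPT:\n" + prompt

def rag_answer(query, corpus=CORPUS, k=2):
    """Retrieve evidence, then generate a response conditioned on it."""
    return generate(build_prompt(query, retrieve(query, corpus, k)))
```

Calling `rag_answer("When did the RAG paper appear on arXiv?")` builds a prompt whose first evidence line is the arXiv passage. Replacing `generate` with a real model call would be the only change needed on the generation side.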
What Step It Moved AI Application Engineering From and To
It moved knowledge-intensive tasks from "answer using internal model memory only" to "answer using updateable external knowledge sources."
In practice, this means teams can change system knowledge by updating indexes and document stores, rather than retraining for every knowledge change. This directly shaped enterprise knowledge assistants and retrieval-heavy agent designs.
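A small sketch of that point: the answering function below never changes, yet its output follows whatever the document store currently holds. `DocumentStore`, `answer`, the document IDs, and the refund-policy text are hypothetical names and data invented for illustration, and keyword overlap is only a stand-in for a production index.

```python
import re

def _tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

class DocumentStore:
    """In-memory store keyed by document ID (hypothetical, for illustration)."""

    def __init__(self):
        self._docs = {}  # doc_id -> text

    def upsert(self, doc_id, text):
        """Add a document or overwrite the existing version."""
        self._docs[doc_id] = text

    def delete(self, doc_id):
        """Remove a document if it exists."""
        self._docs.pop(doc_id, None)

    def search(self, query, k=1):
        """Return up to k (doc_id, text) pairs ranked by token overlap."""
        q = _tokens(query)
        scored = []
        for doc_id, text in self._docs.items():
            overlap = len(q & _tokens(text))
            if overlap:
                scored.append((overlap, doc_id, text))
        scored.sort(key=lambda item: item[0], reverse=True)
        return [(doc_id, text) for _, doc_id, text in scored[:k]]

def answer(query, store):
    """Frozen 'generator' stub: restates the top document and cites its ID."""
    hits = store.search(query)
    if not hits:
        return "No supporting document found."
    doc_id, text = hits[0]
    return f"{text} [source: {doc_id}]"

store = DocumentStore()
store.upsert("policy-v1", "Refund window is 14 days.")
before = answer("What is the refund window?", store)

# A knowledge change is a store update; answer() itself is untouched.
store.upsert("policy-v1", "Refund window is 30 days.")
after = answer("What is the refund window?", store)
```

Here `before` cites the 14-day policy and `after` cites the 30-day policy, with the source ID attached in both cases, which is the traceability benefit in miniature.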
What Stage It Is In Now
I currently mark RAG as mainstream.
Retrieval augmentation has moved from a research concept to a default engineering option in scenarios requiring factuality, source traceability, and updateable knowledge.
What It Might Replace
It can partly replace two patterns: knowledge QA that stuffs everything into one long prompt, and the high-cost path of pushing all new knowledge through fine-tuning.
In many business settings, retrieval and index updates are more controllable and easier to trace than frequent retraining.
What Might Replace It
RAG is more likely to be absorbed into more tightly integrated systems that combine long-term memory, retrieval, tool execution, and feedback learning than to disappear.
In other words, RAG will likely evolve into a standard subsystem in broader knowledge-execution stacks with routing, caching, reranking, tool calls, and verification.
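To make that composition concrete, here is a deliberately small sketch in which retrieval is one stage among routing, caching, reranking, a tool call, and verification. Every function name, the corpus, and the routing rule are assumptions made for the example. The generator is a stub that echoes the top passage, so the verification check is trivially satisfied here and would only do real work with an actual model.

```python
import re
from functools import lru_cache

# Illustrative corpus; a tuple so it is safe to share with the cached function.
CORPUS = (
    "RAG combines a retriever with a generator model.",
    "Index updates change system knowledge without retraining.",
    "Reranking reorders retrieved candidates before generation.",
)

ARITHMETIC = re.compile(r"\s*(\d+)\s*([-+*])\s*(\d+)\s*")

def tokens(text):
    """Lowercased alphanumeric tokens as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def route(query):
    """Routing: arithmetic goes to a tool, everything else to retrieval."""
    return "tool" if ARITHMETIC.fullmatch(query) else "retrieval"

def calculator(query):
    """Tool call: evaluate one binary integer expression such as '2 + 3'."""
    a, op, b = ARITHMETIC.fullmatch(query).groups()
    a, b = int(a), int(b)
    return str({"+": a + b, "-": a - b, "*": a * b}[op])

@lru_cache(maxsize=128)
def retrieve(query, k=3):
    """Caching: repeated queries skip the corpus scan."""
    q = tokens(query)
    return tuple(doc for doc in CORPUS if q & tokens(doc))[:k]

def rerank(query, candidates):
    """Reranking: order candidates by Jaccard overlap with the query."""
    q = tokens(query)

    def jaccard(doc):
        d = tokens(doc)
        return len(q & d) / len(q | d)

    return sorted(candidates, key=jaccard, reverse=True)

def verify(draft, evidence):
    """Verification: accept a draft only if it appears verbatim in the evidence."""
    return any(draft in doc for doc in evidence)

def run(query):
    """Route the query, then retrieve, rerank, draft, and verify."""
    if route(query) == "tool":
        return calculator(query)
    ranked = rerank(query, retrieve(query))
    draft = ranked[0] if ranked else ""  # stub generator: echo the top passage
    return draft if draft and verify(draft, ranked) else "insufficient evidence"
```

In this sketch `run("2 + 3")` goes to the calculator, while `run("What does reranking do?")` goes through retrieval and returns the reranking passage. A query with no matching passage falls back to "insufficient evidence" instead of an ungrounded answer.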