Julian Henry

Back-of-the-Envelope: Retrieval Augmented Generation (RAG)

24 Aug 2024

What is RAG?

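Retrieval-augmented generation (RAG) forces LLMs to cite their sources: the model answers from documents retrieved for the query rather than from its parameters alone.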

How it’s achieved

At a high level, a RAG architecture retrieves documents relevant to a query, concatenates them to the query, and returns the most likely output sequence given the query and those documents.

[Image: architecture]
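The flow described above can be sketched in a few lines. This is a toy illustration only: `overlap_score`, `retrieve`, `build_prompt`, and `rag_answer` are hypothetical names, word overlap stands in for a real retriever, and `generate` is whatever callable wraps your model.

```python
from collections import Counter


def overlap_score(query: str, doc: str) -> int:
    """Toy relevance score: number of word tokens shared by query and doc."""
    q = Counter(query.lower().split())
    d = Counter(doc.lower().split())
    return sum((q & d).values())


def retrieve(query: str, corpus: list[str], k: int = 2) -> list[str]:
    """Return the k highest-scoring documents (ties keep corpus order)."""
    ranked = sorted(corpus, key=lambda doc: overlap_score(query, doc), reverse=True)
    return ranked[:k]


def build_prompt(query: str, docs: list[str]) -> str:
    """Concatenate the retrieved documents to the query."""
    context = "\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(docs))
    return f"{context}\n\nQuestion: {query}\nAnswer:"


def rag_answer(query: str, corpus: list[str], generate, k: int = 2):
    """Retrieve, concatenate, generate. Returns the answer and the documents used."""
    docs = retrieve(query, corpus, k)
    return generate(build_prompt(query, docs)), docs
```

Returning the documents alongside the answer is what makes the "cite your sources" part possible: the caller can show exactly which passages the generator saw.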

First, we have the encoding. The paper discusses both RAG-Sequence and RAG-Token; the former is simpler to understand.

[Image: rag-sequence]
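For reference, here is how the two variants are usually written, assuming "the paper" is the original RAG paper (Lewis et al., 2020) and using its notation: $x$ is the query, $z$ a retrieved document, $y$ the output of length $N$, $p_\eta$ the retriever, and $p_\theta$ the generator. RAG-Sequence uses one document for the whole output and marginalizes at the end; RAG-Token marginalizes at every token.

```latex
% RAG-Sequence: the same document z conditions every token of y
p_{\text{RAG-Sequence}}(y \mid x) \approx
  \sum_{z \in \text{top-}k(p_\eta(\cdot \mid x))}
  p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1})

% RAG-Token: each token may draw on a different document
p_{\text{RAG-Token}}(y \mid x) \approx
  \prod_{i=1}^{N} \sum_{z \in \text{top-}k(p_\eta(\cdot \mid x))}
  p_\eta(z \mid x) \, p_\theta(y_i \mid x, z, y_{1:i-1})
```

The only difference is the order of the sum and the product, which is why RAG-Sequence is the easier one to reason about.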

Next, the seq2seq generator ingests the query together with the retrieved documents.

[Image: generator]

In addition, there is a decoding process that turns the generator's probabilities into an output sequence.

[Image: decode]
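As a rough sketch of what decoding can look like for RAG-Sequence: run beam search once per retrieved document, then score each candidate across documents, weighting by the retriever's probability for that document. The version below follows what the RAG paper calls fast decoding, where a candidate that did not show up in a document's beam contributes zero for that document. The function name `fast_decode` and the numbers in the example are made up for illustration.

```python
from collections import defaultdict


def fast_decode(doc_priors, beams):
    """Pick the hypothesis with the highest marginal score.

    doc_priors[i] is p(z_i | x) from the retriever.
    beams[i] maps each hypothesis found by beam search on document i
    to its likelihood p(y | x, z_i). Hypotheses absent from a beam are
    treated as having likelihood 0 for that document.
    """
    scores = defaultdict(float)
    for prior, beam in zip(doc_priors, beams):
        for hypothesis, likelihood in beam.items():
            scores[hypothesis] += prior * likelihood
    best = max(scores, key=scores.get)
    return best, dict(scores)


# Two retrieved documents, each with its own beam of candidates.
best, scores = fast_decode(
    doc_priors=[0.6, 0.4],
    beams=[
        {"Paris": 0.5, "Lyon": 0.3},
        {"Paris": 0.2, "Marseille": 0.7},
    ],
)
```

Here "Paris" wins with a score of about 0.38 (0.6 × 0.5 + 0.4 × 0.2), even though "Marseille" is the top candidate for the second document on its own. The more exact alternative runs an extra forward pass for every (hypothesis, document) pair that is missing instead of assuming zero.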

Finally, a retriever produces the relevant documents.

[Image: retriever-dpr]
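The image label points at DPR (Dense Passage Retrieval), which scores a document by the inner product of its embedding with the query embedding, so that p(z | x) is proportional to exp(d(z) · q(x)). A minimal sketch of that scoring step, assuming the embeddings already exist (in DPR they come from two BERT encoders; here they are hand-written toy vectors, and `retrieve_top_k` is a hypothetical helper):

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def retrieve_top_k(query_vec, doc_vecs, k):
    """Return [(doc_index, probability)] for the k largest inner products.

    Probabilities are a softmax over the top-k scores only, so they sum to 1.
    Assumes k >= 1 and at least one document.
    """
    scored = sorted(
        ((dot(query_vec, d), i) for i, d in enumerate(doc_vecs)),
        reverse=True,
    )[:k]
    top = scored[0][0]  # subtract the max score for numerical stability
    weights = [math.exp(score - top) for score, _ in scored]
    total = sum(weights)
    return [(i, w / total) for (_, i), w in zip(scored, weights)]


# Toy 2-d embeddings: document 0 is most aligned with the query.
hits = retrieve_top_k([1.0, 0.0], [[2.0, 0.0], [0.0, 3.0], [1.0, 1.0]], k=2)
```

A real system would not sort the whole corpus like this; it would use an approximate maximum inner product search index, but the scoring rule is the same.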

Further Reading