What is RAG?

Retrieval-augmented generation (RAG) makes LLMs ground their answers in retrieved documents, in effect forcing them to cite their sources.

How it’s achieved

At a high level, a RAG architecture retrieves documents relevant to a query, concatenates them to the query, and returns the most likely sequence given the query and the retrieved documents.

Firstly, we have the encoding. While the paper discusses both RAG-Sequence and RAG-Token, the former is simpler to understand.
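
For reference, the two variants differ only in where they marginalise over the retrieved documents. The notation below follows the RAG paper rather than this post: x is the query, y the output of length N, z a retrieved document, p_η the retriever and p_θ the generator.

$$
p_{\text{RAG-Sequence}}(y \mid x) \approx \sum_{z \in \text{top-}k(p_\eta(\cdot \mid x))} p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1})
$$

$$
p_{\text{RAG-Token}}(y \mid x) \approx \prod_{i=1}^{N} \sum_{z \in \text{top-}k(p_\eta(\cdot \mid x))} p_\eta(z \mid x) \, p_\theta(y_i \mid x, z, y_{1:i-1})
$$

RAG-Sequence conditions the whole output on one document at a time and sums over documents at the end, while RAG-Token sums over documents at every token, so different tokens can lean on different documents.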

Next, the seq2seq generator ingests the query together with the retrieved documents.
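
As a rough illustration of the ingest step, here is a minimal sketch that builds one generator input per retrieved passage by prepending the passage to the query. The function name `build_generator_inputs` and the `" // "` separator are hypothetical choices made for this example, not something the post or the paper fixes.

```python
from typing import List


def build_generator_inputs(query: str, passages: List[str], sep: str = " // ") -> List[str]:
    """Return one input string per passage: passage, separator, then the query.

    The separator is an illustrative assumption; any marker the generator
    was trained with would do.
    """
    return [f"{passage}{sep}{query}" for passage in passages]


# Toy example: two retrieved passages for a single query.
inputs = build_generator_inputs(
    "who wrote hamlet?",
    [
        "Hamlet is a tragedy written by William Shakespeare.",
        "A hamlet is a small human settlement.",
    ],
)
for text in inputs:
    print(text)
```

Keeping one input per passage, rather than one long string with every passage, lets the generator be conditioned on a single document at a time, which is what marginalising over documents requires.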

In addition, there is a decoding process.
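
To make the decoding step a little more concrete, the sketch below scores a single fixed candidate output under both RAG variants. It is not a decoder: a real system would search over candidates (for example with beam search), which is omitted here. The function names and the toy probabilities are assumptions made for illustration.

```python
import math
from typing import List


def rag_sequence_score(doc_probs: List[float], token_probs: List[List[float]]) -> float:
    """Sum over documents of p(z|x) times the product of per-token probabilities.

    token_probs[j][i] is the generator's probability of token i of the
    candidate when conditioned on document j.
    """
    return sum(p_z * math.prod(tokens) for p_z, tokens in zip(doc_probs, token_probs))


def rag_token_score(doc_probs: List[float], token_probs: List[List[float]]) -> float:
    """Product over tokens of the document-weighted average token probability."""
    n_tokens = len(token_probs[0])
    return math.prod(
        sum(p_z * tokens[i] for p_z, tokens in zip(doc_probs, token_probs))
        for i in range(n_tokens)
    )


# Toy numbers: two equally likely documents, a two-token candidate.
# Document 0 supports the first token, document 1 supports the second.
doc_probs = [0.5, 0.5]
token_probs = [[0.8, 0.2], [0.2, 0.8]]

print(rag_sequence_score(doc_probs, token_probs))
print(rag_token_score(doc_probs, token_probs))
```

With these toy numbers the RAG-Token score comes out higher than the RAG-Sequence score, because the candidate draws its first token from one document and its second from the other, which only per-token marginalisation can reward.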

Finally, a retriever (DPR in the paper) produces the relevant documents.
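
A dense retriever of this kind ranks documents by the inner product between a query embedding and each document embedding. The sketch below shows that ranking on hand-written vectors. The vectors stand in for encoder outputs, the `retrieve` function is hypothetical, and normalising the scores over only the top-k documents is a simplifying assumption of this example.

```python
import math
from typing import Dict, List, Tuple


def dot(a: List[float], b: List[float]) -> float:
    """Inner product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def retrieve(query_vec: List[float], doc_vecs: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Return the top-k document ids by inner product, with softmax weights over those k."""
    scored = sorted(
        ((dot(query_vec, vec), doc_id) for doc_id, vec in doc_vecs.items()),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    if not scored:
        return []
    max_score = scored[0][0]  # subtract the max for numerical stability
    weights = [math.exp(score - max_score) for score, _ in scored]
    total = sum(weights)
    return [(doc_id, w / total) for (_, doc_id), w in zip(scored, weights)]


# Toy embeddings standing in for the outputs of a document encoder.
doc_vecs = {"a": [1.0, 0.0], "b": [0.0, 1.0], "c": [0.7, 0.7]}
for doc_id, prob in retrieve([1.0, 0.2], doc_vecs, k=2):
    print(doc_id, round(prob, 3))
```

At realistic scale the exhaustive sort would be replaced by an approximate maximum inner product search index; the linear scan here is only meant to show what is being computed.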

Jules Henry

Back-of-the-Envelope: Retrieval Augmented Generation (RAG)
