What is RAG?
Retrieval-augmented generation (RAG) grounds an LLM’s output in retrieved documents, which lets it point to the sources behind an answer.
How it’s achieved
At a high level, a RAG architecture retrieves documents relevant to a query, concatenates them to the query, and returns the most likely sequence given the query and those documents.
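The flow described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the paper's implementation: the bag-of-words `embed` function stands in for a learned dense encoder, the `SEPARATOR` string and all function names are made up for this example, and `toy_generator_prob` is a hypothetical placeholder for a real seq2seq model.

```python
import math
from collections import Counter

SEPARATOR = " // "  # assumed delimiter between a document and the query


def embed(text):
    """Toy bag-of-words encoder; a real system would use a learned dense encoder."""
    return Counter(text.lower().split())


def inner_product(a, b):
    """Inner product of two sparse count vectors."""
    return sum(count * b[token] for token, count in a.items())


def retrieve(query, docs, k=2):
    """Return the top-k documents with softmax-normalised retrieval probabilities."""
    q = embed(query)
    scored = sorted(
        ((inner_product(q, embed(d)), d) for d in docs),
        key=lambda pair: pair[0],
        reverse=True,
    )[:k]
    total = sum(math.exp(s) for s, _ in scored)
    return [(d, math.exp(s) / total) for s, d in scored]


def build_inputs(query, retrieved):
    """Concatenate each retrieved document with the query, one generator input per document."""
    return [doc + SEPARATOR + query for doc, _ in retrieved]


def marginal_likelihood(candidate, query, docs, generator_prob, k=2):
    """Weight the generator's probability of `candidate` by each document's retrieval probability."""
    return sum(
        p_doc * generator_prob(candidate, doc + SEPARATOR + query)
        for doc, p_doc in retrieve(query, docs, k)
    )


def toy_generator_prob(candidate, generator_input):
    """Hypothetical stand-in for a seq2seq model: 1.0 if the answer appears in the input."""
    return 1.0 if candidate in generator_input.split() else 0.0


docs = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "cats sleep a lot",
]
query = "capital of france"

retrieved = retrieve(query, docs)
print(build_inputs(query, retrieved)[0])
print(marginal_likelihood("paris", query, docs, toy_generator_prob))
```

The point of the sketch is the shape of the computation: each retrieved document yields its own generator input, and the final score for a candidate output is a sum over documents weighted by how relevant the retriever judged each one to be.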

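First, we have the encoding: the query and the documents are embedded so that relevant documents can be matched to the query. While the paper discusses both RAG-Sequence and RAG-Token, the former is simpler to understand. RAG-Sequence conditions the whole output on the same retrieved document, whereas RAG-Token can draw on a different document for each token.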
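For reference, the two variants are usually written as follows. The notation is an assumption of this sketch: x is the query, z a retrieved document, y the output of length N, p_η the retriever, p_θ the generator, and q(x) and d(z) the query and document embeddings.

```latex
\begin{align}
p_\eta(z \mid x) &\propto \exp\!\big(\mathbf{d}(z)^\top \mathbf{q}(x)\big) \\
p_{\text{RAG-Sequence}}(y \mid x) &\approx \sum_{z \in \text{top-}k(p_\eta(\cdot \mid x))} p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1}) \\
p_{\text{RAG-Token}}(y \mid x) &\approx \prod_{i=1}^{N} \sum_{z \in \text{top-}k(p_\eta(\cdot \mid x))} p_\eta(z \mid x)\, p_\theta(y_i \mid x, z, y_{1:i-1})
\end{align}
```

The only difference is the order of the sum and the product: RAG-Sequence marginalizes over documents once for the whole sequence, while RAG-Token marginalizes at every token.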

Next, the seq2seq generator ingests the query together with the retrieved documents.

Then there is a decoding process, in which the generator builds candidate output sequences token by token.

Finally, the model returns the most likely sequence, conditioned on the relevant documents the retriever supplied.
