What is RAG?
Retrieval-augmented generation (RAG) grounds an LLM’s output in documents retrieved at query time, so that an answer can be traced back to its sources.
How it’s achieved
At a high level, a RAG architecture concatenates relevant documents to a query and returns the most likely sequence given the query, marginalized over the retrieved documents.
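As a rough formalization, the RAG-Sequence variant can be written as a sum over the top-k retrieved documents. The symbols here are notation chosen for this sketch, not taken from the text above: x is the query, y the output sequence of length N, z a retrieved document, p_eta the retriever's probability of a document and p_theta the generator's probability of a token.

```latex
p(y \mid x) \approx \sum_{z \in \text{top-}k} p_\eta(z \mid x) \prod_{i=1}^{N} p_\theta(y_i \mid x, z, y_{1:i-1})
```

The product is the generator's probability of the whole sequence given one document, and the sum weights that probability by how relevant the retriever judged the document to be.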
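First comes encoding: the query and the candidate documents are embedded as vectors so that relevant documents can be found by similarity. While the paper discusses both RAG-Sequence and RAG-Token, the former is simpler to understand.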
Next, the seq2seq generator ingests the query concatenated with each retrieved document.
Then comes decoding: the generator scores candidate output sequences for each of those inputs.
Finally, the retriever’s document scores weight those candidates, and the most likely sequence is returned.
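The steps above can be sketched as a toy, RAG-Sequence-style pipeline. This is a minimal illustration under loud assumptions, not the paper's implementation: the encoder is a bag-of-words counter rather than a learned dense encoder, the generator is a hand-written stand-in rather than a seq2seq model, and the corpus, the function names (encode, retrieve, generator_probs, rag_sequence) and the 9-to-1 weighting are all hypothetical.

```python
import math
from collections import Counter

# Toy corpus, made up for illustration.
DOCS = [
    "paris is the capital of france",
    "berlin is the capital of germany",
    "the eiffel tower is in paris",
]

def encode(text):
    """Stand-in encoder: word counts instead of a learned dense vector."""
    return Counter(text.lower().split())

def score(query_vec, doc_vec):
    """Inner product between two sparse count vectors."""
    return sum(count * doc_vec[word] for word, count in query_vec.items())

def retrieve(query, docs, k=2):
    """Return the top-k documents with softmax-normalized probabilities."""
    q = encode(query)
    ranked = sorted(docs, key=lambda d: score(q, encode(d)), reverse=True)[:k]
    weights = [math.exp(score(q, encode(d))) for d in ranked]
    total = sum(weights)
    return [(d, w / total) for d, w in zip(ranked, weights)]

def generator_probs(candidates, query, doc):
    """Stand-in generator: a distribution over candidate answers given the
    document concatenated with the query. A real system would use a seq2seq
    model here; this one simply favors candidates that appear in the input."""
    context = set(f"{doc} {query}".split())
    raw = [9.0 if c in context else 1.0 for c in candidates]
    total = sum(raw)
    return {c: r / total for c, r in zip(candidates, raw)}

def rag_sequence(query, candidates, docs, k=2):
    """Marginalize the generator's probabilities over the retrieved documents."""
    marginal = {c: 0.0 for c in candidates}
    for doc, p_doc in retrieve(query, docs, k):
        probs = generator_probs(candidates, query, doc)
        for c in candidates:
            marginal[c] += p_doc * probs[c]
    return marginal

if __name__ == "__main__":
    result = rag_sequence("what is the capital of france", ["paris", "berlin"], DOCS)
    print(max(result, key=result.get), result)
```

Each candidate's final score is its generator probability under every retrieved document, weighted by that document's retrieval probability, which is why a document the retriever ranks highly has more say in the answer.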