TL;DR
Retrieval Augmented Generation (RAG) is a way to make LLMs like GPT-4 more accurate and personalized to your specific data.
- LLMs are powerful as hell, but they’re also generic: they’re trained on basically the entire public internet, not on your data
- RAG gets you responses tailored to your data by retrieving relevant pieces of it and embedding them in your model prompts
- RAG relies on the model’s context window, which is how much data it can take in a single prompt
- Today’s RAG pipelines are pretty complex and rely on embedding models and vector databases
Alongside old school fine-tuning, RAG is becoming the standard way to get better, more personalized results out of state-of-the-art LLMs.
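To make the pipeline concrete, here’s a minimal sketch of the retrieve-then-prompt loop. Everything here is a stand-in: the bag-of-words "embedding" substitutes for a real embedding model, and a plain list substitutes for a vector database; the docs and query are made up for illustration.

```python
import math
import re
from collections import Counter

def embed(text: str) -> Counter:
    # Toy bag-of-words "embedding" -- a stand-in for a real embedding model
    # (e.g. sentence-transformers or a hosted embeddings API).
    return Counter(re.findall(r"\w+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

# Your data, chunked and "embedded" up front -- the list stands in for a vector DB.
docs = [
    "Our refund policy allows returns within 30 days.",
    "Support is available Monday through Friday, 9am to 5pm.",
]
index = [(doc, embed(doc)) for doc in docs]

def build_prompt(question: str) -> str:
    # Retrieve the chunk most similar to the question...
    best_doc, _ = max(index, key=lambda pair: cosine(pair[1], embed(question)))
    # ...and embed it in the prompt, kept small enough to fit the context window.
    return f"Context: {best_doc}\n\nQuestion: {question}\nAnswer using only the context above."

print(build_prompt("What is the refund policy?"))
```

The prompt that comes out is what you’d actually send to the LLM; real pipelines do the same thing, just with learned embeddings, a proper vector store, and usually several retrieved chunks instead of one.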