RAG Intro
Ever wondered how some AI systems seem to know exactly what you need? They might be using a technique called retrieval-augmented generation (RAG). It's like giving the AI a little research assistant. When you ask a question, the system first looks up relevant information in a knowledge base or another external source. Then it combines what it found with your question to build a more detailed prompt. Finally, a language model uses that prompt to generate a response that's both informed and tailored to your needs.
RAG definition
In its most basic form, a RAG application performs three steps:
Retrieval: The user’s request is used to query some outside source of information. This might mean querying a vector store, conducting a keyword search over some text, or querying a SQL database. The goal of the retrieval step is to obtain supporting data that will help the LLM provide a useful response.
Augmentation: The supporting data from the retrieval step is combined with the user’s request, often using a template with additional formatting and instructions to the LLM, to create a prompt.
Generation: The resulting prompt is passed to the LLM, and the LLM generates a response to the user’s request.
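To make the three steps concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not a reference implementation: the in-memory DOCUMENTS list, the keyword-overlap scoring, the prompt template wording, and the stub_llm function are all hypothetical stand-ins. A real application would likely swap in a vector store or database for retrieval and an actual LLM client for generation.

```python
import re
from typing import Callable, List, Set

# Hypothetical in-memory corpus standing in for the outside source of information.
DOCUMENTS = [
    "The warranty covers manufacturing defects for two years.",
    "Returns are accepted within 30 days of delivery.",
    "Shipping is free on orders over 50 dollars.",
]

# Assumed template wording; real templates vary by application.
PROMPT_TEMPLATE = (
    "Answer the question using only the context below.\n\n"
    "Context:\n{context}\n\n"
    "Question: {question}\n"
)


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of alphanumeric terms."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Retrieval: rank documents by how many terms they share with the query."""
    query_terms = tokenize(query)
    scored = [(len(query_terms & tokenize(doc)), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]  # drop documents with no overlap
    scored.sort(key=lambda pair: pair[0], reverse=True)  # stable sort keeps ties in corpus order
    return [doc for _, doc in scored[:k]]


def augment(query: str, passages: List[str]) -> str:
    """Augmentation: combine the supporting data and the request into one prompt."""
    context = "\n".join(f"- {passage}" for passage in passages)
    return PROMPT_TEMPLATE.format(context=context, question=query)


def generate(prompt: str, llm: Callable[[str], str]) -> str:
    """Generation: pass the prompt to the LLM and return its response."""
    return llm(prompt)


def stub_llm(prompt: str) -> str:
    """Placeholder for a real model call; it only reports the prompt size."""
    return f"[stub response to a {len(prompt)}-character prompt]"


def rag_answer(query: str, documents: List[str],
               llm: Callable[[str], str], k: int = 2) -> str:
    """Run retrieval, augmentation, and generation in order."""
    passages = retrieve(query, documents, k)
    prompt = augment(query, passages)
    return generate(prompt, llm)


if __name__ == "__main__":
    print(rag_answer("How long is the warranty?", DOCUMENTS, stub_llm))
```

Passing the LLM in as a plain callable keeps the sketch independent of any particular provider, so the retrieval and augmentation steps can be inspected on their own before a real model is wired in.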