Co-Author: Muhammad Alif Ramadhan & Muhamad Adamy Rayeuk
Retrieval-based methods search a database of existing text to find information relevant to a given query.
Generative models produce new content from user input, drawing on large-scale pre-training and deep learning. By combining the two and leveraging external knowledge sources, RAG can provide more comprehensive and better-grounded responses.
RAG has three key components: an embedding model, a database, and a method for retrieving relevant context.
Key Components of RAG
Let’s start with embeddings. Embeddings are one of the key building blocks of Large Language Models (LLMs): continuous vector representations of words or tokens that capture their semantic meaning in a high-dimensional space.
For example, the picture above shows words such as man and woman: they are not the same word, but their meanings are related, so their vector representations sit close together in the vector space. This proximity is what makes semantic retrieval possible.
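The idea of "closeness" above can be made concrete with cosine similarity. The sketch below uses tiny hand-made vectors purely for illustration; real embeddings have hundreds of dimensions and are learned by a model, not chosen by hand.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means identical direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Hand-made toy vectors (real embeddings are learned, high-dimensional).
vectors = {
    "man":    [0.9, 0.8, 0.1],
    "woman":  [0.8, 0.9, 0.1],
    "banana": [0.1, 0.2, 0.9],
}

print(cosine_similarity(vectors["man"], vectors["woman"]))   # close to 1.0
print(cosine_similarity(vectors["man"], vectors["banana"]))  # much lower
```

Words with related meanings score near 1.0, unrelated ones much lower; a semantic search simply ranks stored vectors by this score.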
Embedding Model
We use embedding models so that LLMs can comprehend and reason over high-dimensional data. An embedding model is an algorithm trained to encode information into dense representations in a multidimensional space.
In other words, embedding models produce fixed-length vector representations of text that capture semantic meaning, enabling tasks like similarity comparison.
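In production you would call a learned model (for example through an embeddings API or an open-source library); the toy stand-in below only demonstrates the "fixed-length vector" property, not real semantics.

```python
import hashlib

EMBEDDING_DIM = 8  # real models typically use 384, 768, or more dimensions

def toy_embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: maps any text to a
    fixed-length vector. Unlike a learned model, it captures no
    semantics -- it only illustrates the fixed-length property."""
    digest = hashlib.sha256(text.encode("utf-8")).digest()
    return [b / 255.0 for b in digest[:EMBEDDING_DIM]]

short_vec = toy_embed("hi")
long_vec = toy_embed("a much longer sentence about vector databases")
assert len(short_vec) == len(long_vec) == EMBEDDING_DIM
```

Whatever the input length, the output vector has the same dimensionality, which is what lets a database index and compare texts of any size uniformly.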
This vector data needs to be stored somewhere, right? This is where the vector database comes to the rescue.
Database
According to Oracle, a database is an organized collection of structured information, or data, typically stored electronically in a computer system. In our setup, we use both a vector database and a NoSQL database to store and retrieve data.
Vector Database
A vector database stores data as mathematical representations (vectors). Vector databases make it easier for machine learning models to remember previous inputs, allowing machine learning to power search, recommendation, and text-generation use cases.
Data can be matched by similarity metrics instead of exact matches, making it possible for a model to understand data contextually.
NoSQL Database
While vector databases are optimized for the storage and retrieval of vector data, NoSQL databases are optimized for the storage and retrieval of unstructured data.
Unstructured data lacks a strict format or schema, making it challenging for conventional databases to manage. Yet, this unstructured data holds immense potential for AI, machine learning, and modern search engines.
In our use case, we incorporate a wide range of unstructured text data into our knowledge base. This includes documents such as reports, invoices, records, emails, and outputs from various productivity applications.
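The appeal of a document-oriented NoSQL store is that records need not share a schema. The dict-based sketch below is only an illustration of that idea; a real deployment would use a system such as MongoDB or Elasticsearch.

```python
# Toy schemaless store: each document is a dict and may have different
# fields, just as reports, invoices, and emails have different shapes.
knowledge_base = [
    {"id": 1, "type": "report",  "title": "Security Review", "body": "All clear."},
    {"id": 2, "type": "invoice", "vendor": "Acme", "amount": 1200},
    {"id": 3, "type": "email",   "subject": "Follow-up", "body": "See notes."},
]

def find(collection, **criteria):
    """Return documents whose fields match all the given criteria."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in criteria.items())]

emails = find(knowledge_base, type="email")
print(emails)  # only the documents whose type field is "email"
```

Each document carries only the fields it needs, and queries filter on whatever fields happen to be present, with no table schema to migrate when a new document shape arrives.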
Retrieving Relevant Context
The next challenge is retrieving relevant context in response to user queries.
Now that the system has enough knowledge, we want it to answer our questions based on the knowledge we provided.
First, we search our NoSQL database for the top x documents most relevant to the user’s prompt.
Then, we convert both the user’s prompt and the top x documents into vector representations.
We use these vectors to search our vector database, filtering the results down to the most relevant context.
After acquiring the relevant context, we’ll incorporate it, along with predefined rules and the user’s specific prompt, into the LLM’s input.
The LLM will then process this information and generate a tailored response.
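The whole pipeline above can be sketched end to end. Everything here is a toy stand-in: the keyword pre-filter plays the role of the NoSQL search, a bag-of-words vector plays the role of the embedding model, and the final prompt is what would be handed to an LLM.

```python
import math

# Toy corpus; in practice these come from the NoSQL knowledge base.
documents = {
    "doc1": "Cats are small domesticated felines kept as pets.",
    "doc2": "Dogs are loyal domesticated animals kept as pets.",
    "doc3": "Invoices must be paid within thirty days of receipt.",
}

def keyword_filter(query, docs, top_x=2):
    """Step 1: NoSQL-style pre-filter -- rank documents by shared words."""
    q_words = set(query.lower().split())
    ranked = sorted(docs, key=lambda d: -len(q_words & set(docs[d].lower().split())))
    return ranked[:top_x]

VOCAB = sorted({w for text in documents.values() for w in text.lower().split()})

def embed(text):
    """Step 2: toy bag-of-words embedding (fixed length = vocabulary size)."""
    words = text.lower().split()
    return [float(words.count(w)) for w in VOCAB]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na, nb = math.sqrt(sum(x * x for x in a)), math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def retrieve_context(query, top_x=2, k=1):
    candidates = keyword_filter(query, documents, top_x)    # step 1
    q_vec = embed(query)                                    # step 2
    ranked = sorted(candidates,
                    key=lambda d: -cosine(q_vec, embed(documents[d])))
    return [documents[d] for d in ranked[:k]]               # step 3

def build_prompt(query, context):
    """Step 4: combine predefined rules, retrieved context, and the prompt."""
    rules = "Answer using only the context below."
    return f"{rules}\n\nContext:\n" + "\n".join(context) + f"\n\nQuestion: {query}"

question = "What are cats kept as?"
prompt = build_prompt(question, retrieve_context(question))
# `prompt` would now be sent to the LLM, which generates the final answer.
```

The pre-filter keeps the vector search cheap by narrowing the candidate set first; only the surviving documents are embedded and re-ranked before the prompt is assembled.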
This article was also published in Bahasa Indonesia in the 4th edition of ITSEC Buzz Magazine. Check it out!
Hope this helps. See you in the next article! 👋