In this tutorial, we will create a custom chatbot using Retrieval Augmented Generation (RAG). This approach grounds the chatbot in data from external sources to reduce incorrect or "hallucinated" answers: if the provided context doesn't contain the necessary information, the model can simply respond with "I don't know."

To achieve this, we begin by breaking our data into small chunks and inserting their embeddings into a vector database. We then compute the embedding of the question and query the index for the top five closest chunks, giving us the most relevant information. Finally, we combine the question and the retrieved context into a single large prompt and present it to the model. For the embeddings we use text-embedding-ada-002, and for completions we use gpt-3.5-turbo.
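The retrieval step can be sketched in a few lines of plain Python. This is a simplified, in-memory stand-in for the vector database: it assumes the chunk embeddings have already been computed (in the tutorial, by text-embedding-ada-002) and does a linear scan with cosine similarity, which a real vector index would replace. The toy two-dimensional embeddings below are illustrative only.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query_emb, index, k=5):
    # index: list of (chunk_text, embedding) pairs.
    # Returns the k chunk texts closest to the query embedding.
    ranked = sorted(index, key=lambda item: cosine(query_emb, item[1]), reverse=True)
    return [text for text, _ in ranked[:k]]

# Toy 2-D embeddings; real ones would be 1536-dimensional ada-002 vectors.
index = [
    ("chunk about cats", [1.0, 0.0]),
    ("chunk about dogs", [0.9, 0.1]),
    ("chunk about cars", [0.0, 1.0]),
]
print(top_k([1.0, 0.05], index, k=2))
```

The only change needed for the real pipeline is swapping the toy vectors for API-generated embeddings and the linear scan for a vector-database query.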

Here is the outline:

1- Install dependencies and create an index

2- Download and chunk the data

3- Generate embeddings

4- Query and run the prompt

5- Outro
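Step 4 above, combining the question and the retrieved chunks into one prompt, can be sketched as a small helper. The exact instruction wording is an assumption, but it reflects the "I don't know" behavior described earlier:

```python
def build_prompt(question, chunks):
    # Join the retrieved chunks into a single context block,
    # then instruct the model to answer only from that context.
    context = "\n\n---\n\n".join(chunks)
    return (
        "Answer the question based on the context below. If the question "
        "cannot be answered from the context, say \"I don't know\".\n\n"
        f"Context: {context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt(
    "Which model generates the embeddings?",
    ["The embeddings are generated with text-embedding-ada-002."],
)
print(prompt)
```

This string is then sent to gpt-3.5-turbo as the completion request.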

You can find the full tutorial and code in the notebook here.