Let’s continue from our previous article, Fine-tuning the GPT-3.5 RAG pipeline with GPT-4 training data. This time, let’s dive into fine-tuning the other end of the spectrum of our Retrieval Augmented Generation (RAG) pipeline: the embedding model.
By fine-tuning our embedding model, we improve our system’s ability to retrieve the most relevant documents, ensuring optimal performance of our RAG pipeline.
We use the OpenAI embedding model text-embedding-ada-002 for most of our RAG pipelines in our LlamaIndex blog series. However, OpenAI does not offer the functionality to fine-tune text-embedding-ada-002, so let’s explore fine-tuning an open-source embedding model in this article.
The current number one embedding model on the HuggingFace MTEB (Massive Text Embedding Benchmark) leaderboard is bge-large-en, developed by the Beijing Academy of Artificial Intelligence (BAAI). It is a pre-trained transformer model that can be used for various natural language processing tasks, such as text classification, question answering, and text retrieval. The model is trained on a massive dataset of text, and its performance is evaluated on the Massive Text Embedding Benchmark (MTEB).
For this article, we will use one of bge-large-en’s siblings, bge-small-en, a small-scale 384-dimensional model with competitive performance, perfect for running in Google Colab.
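To make the role of the embedding model concrete: each document and each query is mapped to a dense vector, and retrieval ranks documents by vector similarity to the query. The sketch below illustrates this with tiny 4-dimensional toy vectors standing in for the 384-dimensional embeddings bge-small-en would actually produce; the vectors and document names are made up for illustration.

```python
import math

def cosine_similarity(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 4-dimensional vectors stand in for real 384-dimensional embeddings.
query = [0.9, 0.1, 0.0, 0.2]
documents = {
    "doc_a": [0.8, 0.2, 0.1, 0.3],  # semantically close to the query
    "doc_b": [0.0, 0.9, 0.8, 0.1],  # unrelated content
}

# Retrieval ranks documents by similarity to the query embedding.
ranked = sorted(
    documents,
    key=lambda d: cosine_similarity(query, documents[d]),
    reverse=True,
)
print(ranked[0])  # doc_a
```

Fine-tuning the embedding model nudges these vectors so that queries land closer to their truly relevant documents, which is exactly what improves retrieval quality in the RAG pipeline.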
From our last article on fine-tuning gpt-3.5-turbo, we gained a solid understanding of the steps required to fine-tune an LLM. Fine-tuning bge-small-en shares some similarities with fine-tuning an LLM, but there are also differences.
- Both types of fine-tuning follow the same approach: generating datasets for training and evaluation, fine-tuning the model, and finally comparing the performance of the base model and the fine-tuned model.
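The final step of that shared approach, comparing base and fine-tuned models, is often measured with a simple hit-rate metric: for each evaluation question, check whether the expected source document appears in the top-k retrieved results. The sketch below uses hypothetical, hand-written retrieval results purely to illustrate the metric.

```python
def hit_rate(eval_results, k=2):
    # eval_results: list of (expected_doc_id, retrieved_doc_ids) pairs,
    # where retrieved_doc_ids are ordered best-first by similarity score.
    hits = sum(
        1 for expected, retrieved in eval_results if expected in retrieved[:k]
    )
    return hits / len(eval_results)

# Hypothetical retrieval results for the base vs. the fine-tuned model.
base_results = [
    ("doc_1", ["doc_3", "doc_1"]),  # hit at rank 2
    ("doc_2", ["doc_4", "doc_5"]),  # miss
]
finetuned_results = [
    ("doc_1", ["doc_1", "doc_3"]),  # hit at rank 1
    ("doc_2", ["doc_2", "doc_4"]),  # hit at rank 1
]

print(hit_rate(base_results))       # 0.5
print(hit_rate(finetuned_results))  # 1.0
```

A higher hit rate after fine-tuning indicates the embedding model has learned to place queries closer to their relevant documents in vector space.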