Meet Our New AI Tutor!
How We Built an Open-Source RAG-Based ChatGPT Web App
Hey everyone, today, I’m super excited to share a project that we at Towards AI have been meticulously crafting. We’re lifting the learning experience a notch higher, introducing our AI tutor that’s set to revolutionize how you navigate through the intricate alleys of AI and coding. Stick with me for the next few minutes, and I promise you, it will replace your current ChatGPT or Google for all your future AI-related questions!
At Towards AI, we’ve built two free AI courses and are currently building more super useful ones. To stay tuned, subscribe to them or to my newsletter below, where I will announce all our projects. We recognized the bottleneck in the learning journey: those moments when you’re nestled deep in code or engrossed in a lesson, and questions bubble up. We knew the traditional approach of sifting through voluminous documentation or awaiting responses in community forums was not cutting it anymore. Even Googling is often hard for a very specific question, and when you ask ChatGPT, you often hit its knowledge cutoff or get an answer based on outdated documentation.
Enter the AI Tutor, engineered to be your companion, answering your queries in real-time with precision that mirrors an AI expert and professor. But how did we make this leap? That’s where the marvel of Retrieval Augmented Generation, or RAG, comes into play.
So, what is retrieval augmented generation, or RAG? RAG isn’t your everyday AI model or a copy of ChatGPT; it’s a big step forward in alignment and power for both the chatbot itself and the user, where retrieving information and generating coherent, precise answers from it coalesce seamlessly.
Let me illustrate with a fresh analogy. Imagine a vast ocean of books, each a reservoir of knowledge. Now picture a seeker, a student thirsty for answers. The traditional way is akin to handing the seeker a boat and a paddle — they must row, navigate the stormy seas, and hope to land on an island where the treasure of answers lies. It’s daunting and time-consuming, and often, the seeker may land on the wrong island. This is you and me searching on the internet. Searching on ChatGPT with a knowledge cutoff in early 2022 is just as dangerous with its limitations and hallucination risks, especially for advanced and fast-changing topics like AI.
RAG transforms this journey. It’s akin to a magical compass that points directly to the island where the answers reside. The seeker doesn’t row aimlessly; they are guided, every stroke of the paddle bringing them closer to the precise answers they seek. The ocean of books is still vast, but now, it’s navigable, accessible, and friendly.
Our AI Tutor doesn’t just pull answers from the ether. It is grounded in a knowledge base that’s as vast as it is deep. It’s an aggregation of intelligence from a plethora of sources, finely curated by our team, ensuring that every response is not just accurate but steeped in the latest developments and insights in AI and coding. Right now, we have lots of content from our courses, thousands of technical articles on the Towards AI publication, many articles from Wikipedia, and the whole HuggingFace Transformers library.
We’ve built this bot following many tips I shared in my previous articles and will share in upcoming ones, all of which you can also learn about in our two courses. All the links are below, but in short, it is a complex chain of prompts where we ask ChatGPT many questions and have it analyze and process the user’s query until we return the answer along with its sources.
For the most technical of you, here’s a quick coverage of what we did. It is also all based on Buster, an open-source framework developed and maintained by a friend of mine, which you can implement for your own knowledge base as well if you’d like to!
We start by ingesting all our data into memory. To do this, we split all our content into chunks of text and pass each chunk to OpenAI’s text-embedding-ada-002 embedding model. This produces embeddings, which are basically vectors of numbers, and we save those vectors in a memory. Then, for a new query, such as a question from the user, we embed it with the same model and compare it with all the embeddings already in memory. Once the most similar embeddings are found, ChatGPT is asked to understand the user’s question and intent and to use only the retrieved sources of knowledge to answer it. This is how it reduces hallucination risks and allows you to have up-to-date information, since you can update your knowledge base as often as you want.
Plus, as you see, it cites all the sources it found for the question so you can dive in and learn more, which really helps when you are trying to learn and understand a new topic! We use Activeloop, with whom we’ve partnered on this project, to manage the whole memory and query process.
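To make the ingestion and retrieval steps described above a bit more concrete, here is a minimal sketch in Python. It is not Buster's or Activeloop's API and not our production code. The `toy_embed` function is a hypothetical bag-of-words stand-in for a real embedding model such as text-embedding-ada-002, the plain list stands in for the vector store, and every function name and sample document here is made up for illustration.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40):
    """Split text into chunks of at most max_words words."""
    words = text.split()
    return [" ".join(words[i:i + max_words]) for i in range(0, len(words), max_words)]


def toy_embed(text):
    """ASSUMPTION: stand-in for a real embedding model.
    Returns a sparse bag-of-words vector (token -> count)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(a[token] * b[token] for token in a if token in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_memory(documents, max_words=40):
    """Chunk every document and store each chunk with its embedding and source."""
    memory = []
    for source, text in documents.items():
        for chunk in chunk_text(text, max_words):
            memory.append({"source": source, "text": chunk, "embedding": toy_embed(chunk)})
    return memory


def retrieve(query, memory, top_k=2):
    """Embed the query the same way and return the most similar chunks."""
    query_embedding = toy_embed(query)
    scored = [(cosine_similarity(query_embedding, entry["embedding"]), entry) for entry in memory]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    # Drop chunks that share nothing with the query.
    return [entry for score, entry in scored[:top_k] if score > 0]


# Hypothetical sample knowledge base.
documents = {
    "rag_lesson": "Retrieval augmented generation retrieves relevant chunks and passes them to the language model.",
    "embeddings_lesson": "An embedding model maps text to a vector of numbers so similar texts are close together.",
    "cooking_blog": "Simmer the tomato sauce slowly and add fresh basil at the end.",
}
memory = build_memory(documents)
results = retrieve("What is an embedding vector?", memory, top_k=2)
for entry in results:
    print(entry["source"], "->", entry["text"])
```

In a real setup, you would swap `toy_embed` for a call to an actual embedding model and the list for a proper vector database, but the shape of the loop stays the same: chunk, embed, store, then embed the query and rank by similarity.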
So the overall process is quite straightforward: we validate the question, ensuring it is related to AI and that our chatbot should answer it; we query our database to find good, relevant sources; and then we use ChatGPT to digest those sources and give a good answer for the student.
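Here is a rough sketch of that validate, retrieve, and answer flow in Python. Treat it as an illustration under stated assumptions rather than our actual implementation: the keyword check in `is_relevant` is a hypothetical placeholder for the LLM-based validation, `fake_retrieve` and `fake_complete` are stubs standing in for the database query and the ChatGPT call, and the prompt wording is invented for the example.

```python
# ASSUMPTION: a tiny keyword list standing in for a real relevance check.
AI_KEYWORDS = {"ai", "model", "embedding", "rag", "llm", "transformer", "prompt"}


def is_relevant(question):
    """Placeholder validator: does the question mention an AI-related keyword?"""
    words = {word.strip("?.,!").lower() for word in question.split()}
    return bool(words & AI_KEYWORDS)


def build_prompt(question, sources):
    """Assemble a prompt that tells the model to rely only on the retrieved sources."""
    numbered = "\n".join(
        f"[{i}] ({s['source']}) {s['text']}" for i, s in enumerate(sources, start=1)
    )
    return (
        "Answer the question using only the sources below. "
        "If they do not contain the answer, say you do not know.\n\n"
        f"Sources:\n{numbered}\n\nQuestion: {question}"
    )


def answer_question(question, retrieve, complete):
    """Validate the question, fetch sources, then ask the model to answer from them."""
    if not is_relevant(question):
        return {"answer": "Sorry, I can only help with AI-related questions.", "sources": []}
    sources = retrieve(question)
    if not sources:
        return {"answer": "I could not find this in the knowledge base.", "sources": []}
    prompt = build_prompt(question, sources)
    return {"answer": complete(prompt), "sources": [s["source"] for s in sources]}


# Stubs so the sketch runs without any network call or API key.
def fake_retrieve(question):
    return [{"source": "rag_lesson", "text": "RAG retrieves relevant chunks before generating an answer."}]


def fake_complete(prompt):
    return "RAG retrieves relevant chunks first, then generates an answer from them [1]."


result = answer_question("How does RAG work?", fake_retrieve, fake_complete)
off_topic = answer_question("What should I cook tonight?", fake_retrieve, fake_complete)
print(result)
print(off_topic)
```

Passing the retriever and the model call in as functions keeps each step easy to swap out, which helps when you iterate on the validation prompt or change the vector database.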
There are still many things to consider, like how to decide whether to answer a question at all, whether it is relevant or covered by your documentation, how to handle new terms or acronyms that are not in ChatGPT’s knowledge base, etc. These are all things we’ve fixed through different prompting techniques that you can learn more about in the video series of the course we built.
Otherwise, if you have a knowledge base, just import Buster, embed your documentation, and start building your own knowledge bot! You will have to iterate on the various functionalities and fix lots of corner cases, but you will end up with a very powerful replacement for ChatGPT, one that keeps its powerful capabilities while leveraging more expert knowledge.
This is also our first version of the AI tutor, and I’d love to get your feedback on it. Please go use it and ask all your upcoming AI questions; it’s completely free. We have feedback features that let you give a thumbs up or down, and you can even email me with feedback so we can use your help to improve it further. We plan to make many amazing updates soon. We also want to keep developing this chatbot and embed it in all our upcoming courses, so your help would mean a lot to me and to us at Towards AI.
Before I cap this off, if the world of AI intrigues you as much as it does me, and you’re eager not just to keep pace but to be ahead, I’ve got something tailor-made for you: my newsletter. It’s your weekly dose of AI news and insights that cut through the noise, and knowledge that empowers. Be aware that I don’t just share the exciting news and leave it at that. I also share my projects and explain new research and new techniques in AI. So it’s for people who want to learn and improve, not only to stay up to date quickly! Subscribe, and let’s grow together!
On my end, I will see you in the next article or newsletter iteration! Happy learning!