How to Build an AI Chatbot for Q&A on any Website with MistralAI on CPU

By Developer Service

In today's digital landscape, the demand for instant information access has become increasingly critical.

As visitors navigate websites, they increasingly prefer quick answers over wading through large amounts of information.

The solution to this challenge can be a chatbot, designed to comprehend a website's content and provide precise answers to user questions.

This article explores the creation of such a chatbot, leveraging web scraping and conversational AI technologies, using MistralAI on CPU.

This article is a follow-up to the previous article: https://developer-service.blog/how-to-build-an-ai-chatbot-for-q-a-on-any-website/

Example of the ChatBot script with a question about https://developer-service.io/:

Human: what is this site about?
Loading answers...

ChatBot:  This site offers content services tailored for startups and emerging businesses in technology industries. They provide industry-specific insights, brand voice development, engaging and informative articles, SEO optimization, and even content strategy and planning assistance. Their mission is to equip businesses with the knowledge they need to thrive in a competitive landscape.

Example of interaction with the ChatBot


What is MistralAI?

Mistral AI is a pioneering company specializing in the development of advanced Artificial Intelligence, particularly in the field of large language models.

Based in Paris, France, the company is committed to pushing the boundaries of AI technology with a focus on ethics, safety, and user well-being.

Mistral AI's team comprises experts in machine learning, natural language processing, and other disciplines, working together to create innovative AI solutions that can understand and generate human-like text.

The company's mission is to build AI systems that can help solve complex problems, facilitate communication, and contribute positively to society.

In this article, we will run a Mistral model on the CPU, using a quantized GGUF build of Mistral-7B-Instruct-v0.2 published by TheBloke on Hugging Face.



Recap of the Technology Stack

As a refresher, these are the different components and technologies used in the creation of the ChatBot:

Python: Serving as the foundational programming language, Python is celebrated for its ease of use and extensive library ecosystem.

LangChain and MistralAI: For the conversational AI element, we employ LangChain for its conversational retrieval chains and MistralAI for language embeddings and models.

FAISS: A library that facilitates efficient similarity search within large datasets, empowering the chatbot to locate the most pertinent content. Here we will use it with LangChain.

BeautifulSoup and Requests: These instruments are vital for web scraping, enabling us to systematically navigate and gather content from web pages.

Our project encompasses the following files:

  • search_links.py: Manages the search of links within the site.
  • create_index.py: Tasked with indexing website content.
  • chat_bot.py: Houses the logic governing the chatbot's interactions.
  • main.py: The primary entry point for executing the chatbot application.
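
search_links.py is not reproduced in this article. As a rough picture of what link discovery within a site involves, here is a minimal sketch that uses Python's standard-library HTML parser instead of BeautifulSoup, so it runs without extra dependencies. The names LinkCollector and same_site_links are hypothetical and not taken from the project.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin, urlparse


class LinkCollector(HTMLParser):
    """Collects absolute links that stay on the same host as base_url."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links and drop the fragment part (#section)
        absolute = urljoin(self.base_url, href).split("#")[0]
        same_host = urlparse(absolute).netloc == urlparse(self.base_url).netloc
        if same_host and absolute not in self.links:
            self.links.append(absolute)


def same_site_links(html: str, base_url: str) -> List[str]:
    """Return the de-duplicated same-site links found in an HTML string."""
    collector = LinkCollector(base_url)
    collector.feed(html)
    return collector.links
```

In a real crawler the HTML string would come from an HTTP request (for example with Requests), and each returned link would be visited in turn.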

The next section walks through the changes needed to run the ChatBot with MistralAI on the CPU.

All of these changes are confined to the chat_bot.py file.


chat_bot.py: The Conversational Engine, now on the CPU

The chat_bot.py script is the core of the chatbot application, integrating AI and retrieval technologies to engage users with meaningful, contextually relevant interactions.

This script is pivotal for several reasons:

  • Conversational Retrieval Chain: It leverages a retrieval chain to understand user queries and fetch appropriate responses from the indexed content.
  • Integration with MistralAI: A Mistral model, now running locally on the CPU, reads the retrieved content and generates the natural-language answer.
  • Use of FAISS for Efficient Search: The script employs FAISS to rapidly search through content embeddings, ensuring quick and relevant responses.
  • Memory Buffer Management: A memory buffer records the context of the conversation, aiding the chatbot in maintaining a coherent and contextually aware dialogue.
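
To make the interplay between the retrieval chain and the memory buffer more concrete, here is a simplified, dependency-free sketch of the flow. It is not LangChain's implementation: BufferMemory, answer_with_history, and the three callables (condense, retrieve, generate) are hypothetical stand-ins for the memory, the retriever, and the language model.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class BufferMemory:
    """Keeps every (question, answer) pair of the conversation, in order."""

    turns: List[Tuple[str, str]] = field(default_factory=list)

    def save(self, question: str, answer: str) -> None:
        self.turns.append((question, answer))

    def as_text(self) -> str:
        return "\n".join(f"Human: {q}\nAssistant: {a}" for q, a in self.turns)


def answer_with_history(
    question: str,
    memory: BufferMemory,
    condense: Callable[[str, str], str],
    retrieve: Callable[[str], List[str]],
    generate: Callable[[str, List[str]], str],
) -> str:
    # On the first turn there is no history, so the question is used as-is.
    # Later turns are rewritten into a standalone question using the history.
    if memory.turns:
        standalone = condense(memory.as_text(), question)
    else:
        standalone = question
    documents = retrieve(standalone)          # fetch relevant content
    answer = generate(standalone, documents)  # let the model answer from it
    memory.save(question, answer)             # remember this turn
    return answer
```

The point of the condense step is that a follow-up such as "who runs it?" only makes sense together with the earlier turns, so it is rewritten before the index is searched.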

Below is the code of the chat_bot.py script, illustrating how these components are orchestrated (notice the change from calling the MistralAI API to running the model locally on the CPU):

import os

from decouple import config
from langchain.chains.conversational_retrieval.base import ConversationalRetrievalChain
from langchain.memory import ConversationBufferMemory
from langchain_community.llms.ctransformers import CTransformers
from langchain_community.vectorstores.faiss import FAISS
from langchain_core.callbacks import StreamingStdOutCallbackHandler
from langchain_mistralai import MistralAIEmbeddings, ChatMistralAI


# Get retriever from file index
def retrieve_from_index():
    db = FAISS.load_local("./index.faiss",
                          embeddings=MistralAIEmbeddings(mistral_api_key=config('MISTRAL_API_KEY')),
                          allow_dangerous_deserialization=True)
    retriever = db.as_retriever(
        search_type="mmr",
        search_kwargs={"k": 8},
        max_tokens_limit=4097,
    )
    return retriever


# Start chat
def start_chat():
    retriever = retrieve_from_index()
    memory = ConversationBufferMemory(memory_key="chat_history", input_key='question', output_key='answer',
                                      return_messages=True)
    # Uncomment to use MistralAI API
    # model = ChatMistralAI(model="open-mixtral-8x7b", mistral_api_key=config('MISTRAL_API_KEY'), temperature=0.7)
    # Using CTransformers to use MistralAI model from Hugging Face on the CPU
    model = CTransformers(model="TheBloke/Mistral-7B-Instruct-v0.2-GGUF",
                          model_file="mistral-7b-instruct-v0.2.Q4_K_M.gguf",
                          # threads is a ctransformers config option, so it is passed through config
                          config={'max_new_tokens': 4096, 'temperature': 0.7, 'context_length': 4096,
                                  'threads': os.cpu_count()})
    # Load LLM
    qa_chain = ConversationalRetrievalChain.from_llm(model, retriever=retriever,
                                                     return_source_documents=True,
                                                     memory=memory)
    return qa_chain


def chat_bot(chain, question):
    print("Loading answers...")
    result = chain.invoke(question)
    print(f"\nChatBot: {result['answer']}")
    # Optional: Print source documents
    # for source in result['source_documents']:
    #     print(source)

Here's a breakdown of the code:

  • Define the retrieve_from_index function:
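    • Load a pre-existing FAISS index from disk, passing in the MistralAI embeddings used to embed queries. These embeddings still call the MistralAI API, so MISTRAL_API_KEY must remain configured even though the chat model runs locally.
    • Create a retriever from the index with MMR search (search_type="mmr") that returns 8 documents (k=8).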
  • Define the start_chat function:
    • Call the retrieve_from_index function to get the retriever.
    • Initialize a conversation buffer memory to store chat history.
    • Load a quantized Mistral model from Hugging Face and run it locally through LangChain's CTransformers wrapper.
    • Load the language model (LLM) and create a conversational retrieval chain with the specified retriever, memory, and other parameters.
  • Define the chat_bot function:
    • Accept a chat chain and a user question as input.
    • Invoke the chat chain with the user question and print the generated answer.
    • Optionally, print the source documents used to generate the answer.
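
The retriever above is configured with search_type="mmr", short for maximal marginal relevance: instead of returning the k most similar chunks, it balances relevance to the question against redundancy among the chunks already chosen. The following is a dependency-free sketch of that idea on plain Python lists. It is an illustration only, not the code LangChain or FAISS runs; the function names and the default weight of 0.5 are assumptions made for this example.

```python
import math
from typing import List, Sequence


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mmr_select(
    query: Sequence[float],
    candidates: Sequence[Sequence[float]],
    k: int,
    lambda_mult: float = 0.5,
) -> List[int]:
    """Return the indices of up to k candidates chosen by MMR.

    lambda_mult=1.0 ranks by relevance only; lower values favor diversity.
    """
    selected: List[int] = []
    remaining = list(range(len(candidates)))
    while remaining and len(selected) < k:

        def score(i: int) -> float:
            relevance = cosine(query, candidates[i])
            # Similarity to the closest already-selected candidate
            redundancy = max(
                (cosine(candidates[i], candidates[j]) for j in selected),
                default=0.0,
            )
            return lambda_mult * relevance - (1 - lambda_mult) * redundancy

        best = max(remaining, key=score)
        selected.append(best)
        remaining.remove(best)
    return selected
```

For a website index this matters because pages often repeat the same text (menus, footers, taglines), and near-duplicate chunks would otherwise crowd out other useful content in the prompt.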

Running the Chat Bot

Before running the main script, we need to make sure that all the requirements are in place:

pip install requests langchain bs4 langchain-mistralai faiss-cpu python-decouple playwright ctransformers transformers

Compared with the previous setup, the new packages are ctransformers and transformers.

Now we can run the script:

python main.py

And chat with the chatbot:

Human: what is this site about?
Loading answers...

ChatBot:  This site offers content services tailored for startups and emerging businesses in technology industries. They provide industry-specific insights, brand voice development, engaging and informative articles, SEO optimization, and even content strategy and planning assistance. Their mission is to equip businesses with the knowledge they need to thrive in a competitive landscape.

Example of interaction with the ChatBot
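
main.py is not reproduced in this article. As a rough, hypothetical sketch of the read-and-answer loop behind an interaction like the one above (run_chat and stdin_questions are illustrative names, and the "exit"/"quit" commands are an assumption):

```python
from typing import Callable, Iterable, Iterator


def run_chat(
    answer: Callable[[str], str],
    questions: Iterable[str],
    write: Callable[[str], None] = print,
) -> None:
    """Answer each question in turn until 'exit' or 'quit' is entered."""
    for question in questions:
        question = question.strip()
        if question.lower() in {"exit", "quit"}:
            break
        if not question:
            continue  # ignore empty input
        write("Loading answers...")
        write(f"\nChatBot: {answer(question)}")


def stdin_questions(prompt: str = "Human: ") -> Iterator[str]:
    """Yield lines typed by the user until end of input."""
    while True:
        try:
            yield input(prompt)
        except EOFError:
            return
```

With the functions from chat_bot.py, this could be wired up as chain = start_chat() followed by run_chat(lambda q: chain.invoke(q)["answer"], stdin_questions()).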

Note: The first time you run the script, it downloads the model file from Hugging Face. The quantized model is several gigabytes in size, so this can take several minutes or longer, depending on your Internet connection. Also, keep in mind that running on the CPU will increase the response time of the ChatBot.
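
Because response time on the CPU varies a lot between machines, it can be useful to measure it. Here is a minimal sketch using only the standard library; timed is a hypothetical helper, not part of the project.

```python
import time
from typing import Any, Callable, Tuple


def timed(fn: Callable[..., Any], *args: Any, **kwargs: Any) -> Tuple[Any, float]:
    """Call fn and return its result together with the elapsed seconds."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start
```

For example, result, seconds = timed(chain.invoke, question) would give both the answer and how long the model took to produce it.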


Conclusion

Creating a chatbot that interprets and responds based on a website's content greatly improves user interactions online.

Utilizing technologies such as web scraping, natural language processing (NLP), and conversational AI, specifically through the capabilities of MistralAI on CPU, developers can craft chatbots that enhance user engagement and provide instant access to information.


Thank you for reading and I will see you on the Internet.



Tagged in:

Python, AI

Last Update: May 18, 2024

About the Author

Developer Service Netherlands
