Creating our own Basic Local Chatbot

May 04, 2024

This is a work in progress; there are still some issues with the program. It will at times carry on its own conversation, and sometimes adds empty assistant tags to the replies. The code is all based on the first two links listed in it.

After receiving my Azure AI-900 certificate I wanted to do more AI coding, but without the costs of working in the cloud or through a subscription service. I found various instructions for creating your own chatbot on your computer without signing up for other services, and after working with several I found one that worked best with my computer and for what I'm trying out. All the final 'working' version needed was some LangChain libraries and a GitHub project (llama.cpp); the three links at the top of the code in my GitHub are the sites where I got the code and information. If you are doing this commercially, you'll need to check the license agreements to see whether you can use them for your situation. For the LLM I used the Mistral model from Hugging Face.

When I work on a project, I like to start with a simple version just to make sure it works before adding more code and features, so I know I have a basic working version. My first goal was to be able to ask it a question and get back an answer. I used the simplest way to run a basic chat locally: calling the model directly to return an answer, with no need to set up a server. After some work I had the simple version going: the question hard-coded in the script, and the model replying with an answer.
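That first step can be sketched roughly like this, assuming the llama-cpp-python bindings and a locally downloaded Mistral GGUF file (the model path below is hypothetical; `build_messages` is my own helper name, not from the original code):

```python
# Minimal single-question chat against a local model.
# build_messages is plain Python; the llama-cpp-python call at the bottom
# is a sketch, and the model path is hypothetical.

def build_messages(system_prompt, question):
    """Build the chat-completion message list for one question."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": question},
    ]

if __name__ == "__main__":
    from llama_cpp import Llama  # pip install llama-cpp-python

    llm = Llama(model_path="models/mistral-7b-instruct.Q4_K_M.gguf")  # your GGUF file here
    result = llm.create_chat_completion(
        messages=build_messages("You are a helpful assistant.", "Pick a random country"),
        max_tokens=256,
    )
    print(result["choices"][0]["message"]["content"])
```

The question is hard-coded, exactly as in the first working version described above.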

For it to have an ongoing conversation, I had to append each exchange to the JSON payload. As questions are asked and responses come back, both need to go into that payload so the AI will 'remember'. If needed, you can add code to let the user clear the memory, which removes everything except the first system prompt. The system prompt is what tells the AI what it's supposed to be, like 'You're a proofreader for business files…'
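A minimal sketch of that running-memory idea: every question and reply is appended to one message list, and "clearing" keeps only the system prompt (the helper names are my own, not from the original code):

```python
# Running conversation memory: append each exchange to the payload list,
# and clear everything except the leading system prompt on request.

def append_turn(history, question, reply):
    """Record one question/answer exchange in the payload."""
    history.append({"role": "user", "content": question})
    history.append({"role": "assistant", "content": reply})
    return history

def clear_memory(history):
    """Drop everything except the first system prompt."""
    return [m for m in history if m["role"] == "system"][:1]

history = [{"role": "system", "content": "You're a proofreader for business files."}]
append_turn(history, "Pick a random country", "India.")
append_turn(history, "What is its capital?", "New Delhi.")
print(len(history))   # system prompt + two exchanges = 5 messages
history = clear_memory(history)
print(history)        # only the system prompt remains
```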

From there I started to add the chat feature. I used basic Python to make an ongoing prompt: you type into the command line, hit Enter, and the text is sent to the LLM to retrieve the answer. I then added the ongoing conversation to the payload, so you can keep the conversation going and it will remember your questions and its replies as you go, until you type 'quit' to end the conversation.
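The loop described above can be sketched like this. `ask_model` stands in for the real llama.cpp call, and the input/output functions are parameters so the loop can be exercised without a terminal; none of these names are from the original script:

```python
# Command-line chat loop: read a question, send the whole history to the
# model, append the reply, repeat until the user types 'quit'.

def chat_loop(ask_model, input_fn=input, output_fn=print):
    history = [{"role": "system", "content": "You are a helpful assistant."}]
    while True:
        question = input_fn("You: ").strip()
        if question.lower() == "quit":
            break
        history.append({"role": "user", "content": question})
        reply = ask_model(history)                     # real version calls the LLM here
        history.append({"role": "assistant", "content": reply})
        output_fn("AI: " + reply)
    return history
```

In the real script, `ask_model` would call `llm.create_chat_completion(messages=history)` and pull the text out of the result, as in the earlier snippet.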

A sample small AI conversation (I removed some of the extra information it added to help):

Me: 'Pick a random country'
AI: The country I have randomly selected is India.
Me: What is its capital?
AI: The capital city of India is New Delhi.
Me: What is its major export?
AI: India's major exports include Information Technology and Business Services, Chemicals and Pharmaceuticals, Textiles and Leather, Gems and Jewelry, Food Products, Engineering Goods, and Services.

At some point in the conversation it can start continuing the conversation on its own, adding 'human' turns to the replies. This could be caused by the number of tokens being sent, or by something else in the setup; it's something I'm still working on.
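Since the message list grows with every turn, one mitigation worth trying for the token-count theory (my own sketch, not from the original code) is trimming the oldest exchanges so the payload stays under a rough budget, while always keeping the system prompt:

```python
# Rough history trimmer: keep the system prompt, drop the oldest
# question/answer pairs until the estimated token count fits the budget.
# The 4-characters-per-token estimate is a common rule of thumb, not exact.

def estimate_tokens(message):
    return max(1, len(message["content"]) // 4)

def trim_history(history, budget=2000):
    system = history[:1]          # assumes the first message is the system prompt
    turns = history[1:]
    while turns and sum(estimate_tokens(m) for m in system + turns) > budget:
        turns = turns[2:]         # drop the oldest question/answer pair
    return system + turns

history = [{"role": "system", "content": "You are a helpful assistant."}]
for i in range(50):
    history.append({"role": "user", "content": "question " + str(i) * 40})
    history.append({"role": "assistant", "content": "answer " + str(i) * 40})

trimmed = trim_history(history, budget=200)
print(len(trimmed))   # far fewer messages than the original 101
```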

Some notes that might help:

  • I've tried to add more comments to the code, and simplified them where I can, to give those who might be learning more depth on what each line does.
  • Temperature: the lower the number, the more factual the answers will be; a higher number lets the model be more creative. If you're writing a fun story a high temperature will be good, but if you're working on a report for work you might want it at 0.
  • This isn't the only way to run a local chatbot. There are several other approaches, such as setting up a local server or using different libraries. A local server setup makes sense if, for example, you plan to move it to a real server in the future. A warning: it will slow your computer down the more you use it; a server-based version I built before this one caused speed issues across the whole computer.
  • You could also set up a chat using API calls to services such as OpenAI; you'll need to check with those sites on the costs and how to do it.

The biggest thing you might run into is that, after you get the chat set up, you may not get the replies you're expecting. That can come down to your prompts: how you tell the AI what you want. There's a whole field called prompt engineering; most models respond to the same basic high-level approach, but for finer tuning each LLM has its preferred way of being communicated with.
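As one example of model-specific prompting, Mistral's Instruct models expect user turns wrapped in `[INST] … [/INST]` tags. The chat-completion helper in llama-cpp-python applies this template for you, but it helps to know roughly what is sent under the hood. A small sketch of the raw format (in real use the `<s>`/`</s>` markers are special tokens added by the tokenizer, not plain text):

```python
# Build a raw Mistral-Instruct style prompt from alternating user/assistant
# turns, following the documented [INST] ... [/INST] template.

def mistral_prompt(turns):
    """turns: list of (user_text, assistant_text_or_None) pairs."""
    parts = ["<s>"]
    for user, assistant in turns:
        parts.append("[INST] " + user + " [/INST]")
        if assistant is not None:
            parts.append(" " + assistant + "</s>")
    return "".join(parts)

print(mistral_prompt([("Pick a random country", "India."),
                      ("What is its capital?", None)]))
# <s>[INST] Pick a random country [/INST] India.</s>[INST] What is its capital? [/INST]
```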

More info on Mistral prompting:

Some settings for GPU & Mac:

