We’re now at a point where we can run something like ChatGPT locally on our machines using LLMs, taking advantage of the GPU on systems such as Apple Silicon, NVIDIA graphics cards, and all the other brands of GPU out there. Guess what? You don’t have to go through the entire setup of these models yourself to make it work.
From my personal experience doing it the unoptimised way, I found that Llama 3, for example, which is about 140GB, took about 5 hours to respond to a single prompt. That’s to be expected if it’s running on the CPU. However, there is a better and faster way to do it, with a few tools I will introduce you to in this write-up.
First, you want to get Open WebUI from its GitHub page. Clone the project to your machine:
git clone https://github.com/open-webui/open-webui.git
This is an open-source project that gives you an interface and management platform that currently looks a lot like ChatGPT from OpenAI. You can do many things with it: register an account locally (this data is stored on your machine, so don’t worry, it isn’t going to anyone), sign in, save prompts, download more models, and even combine models and use them in your prompts.
![](https://questionbump.com/wp-content/uploads/2024/05/Screenshot-2024-05-13-at-16.48.33-1024x536.png)
To be able to use this, we need another tool called Ollama. Head to the official web page and download it for your operating system. Once you do this, Ollama should be running on your machine; the way to test this is to open this localhost address in your browser:
http://localhost:11434
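If you prefer the terminal, you can hit the same endpoint with curl (assuming curl is installed on your machine):

```shell
# Query the local Ollama server; it listens on port 11434 by default
curl http://localhost:11434
# Expected response: "Ollama is running"
```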
You should get a message saying “Ollama is running“. Now you’re ready to pull in the models. On the model list page, you will find the models that have been optimized for GPU usage in Ollama. To get one, use a command like:
ollama pull llama3
ollama pull llama2
ollama pull phi3
These three commands pull these three models; replace the model name with the one you want and pull it. Once that is done, you can run it as you can see in the image below:
ollama run llama3
ollama run llama2
ollama run phi3
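Once a model has been pulled, you can also talk to it over Ollama’s local REST API instead of the interactive prompt. A minimal sketch with curl, assuming you pulled llama3 as above:

```shell
# Send a single prompt to the llama3 model via Ollama's /api/generate endpoint.
# "stream": false makes Ollama return one complete JSON object instead of a stream.
curl http://localhost:11434/api/generate -d '{
  "model": "llama3",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
# The JSON response contains the model's answer in its "response" field
```

This is the same API that Open WebUI talks to behind the scenes.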
![](https://questionbump.com/wp-content/uploads/2024/05/Screenshot-2024-05-13-at-16.58.17-1024x515.png)
Open the Open WebUI project you cloned in your favourite IDE; in this case, I’m using Visual Studio Code. In the terminal/command prompt of VS Code, navigate to the backend folder of the project using the command
cd backend
Now, you need to create a new conda environment; this will keep your Python packages separated just for this purpose. The assumption here is that you have conda installed on your machine; if you don’t, head to the official page and select your OS type. Once you have conda installed, create a new environment using the following command
conda create --name openwebui python=3.11
Here, openwebui could be any name you like. Once the environment is created, activate it with the command
conda activate openwebui
You are now ready to install your Python packages. Run the following command:
pip install -r requirements.txt -U
The assumption here is that you have pip installed on your machine and have used it before (if you don’t have pip, download it from here). The requirements.txt file inside the backend folder lists all the packages needed; this command installs everything required for the project to function properly, and -U upgrades any package that is outdated.
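Putting the backend steps together, the full sequence looks like this (the environment name openwebui is arbitrary, as noted above):

```shell
cd backend                                 # the backend folder of the cloned project
conda create --name openwebui python=3.11  # isolated Python 3.11 environment
conda activate openwebui                   # switch into the new environment
pip install -r requirements.txt -U         # install/upgrade all backend packages
```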
You’re almost ready. You are done with the backend of the application; the frontend part of the app (Open WebUI) is written in JavaScript, so you need to install all the Node dependencies. Do so by running
npm install
npm run build
Before running these commands, make sure that you have Node installed; I recommend Node version 18 and above. You can find it here; select your version and download it. If the installation succeeded, typing node --version
on your machine should respond with the version you installed.
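The frontend steps, in order:

```shell
node --version   # verify Node is installed; should print v18.x or newer
npm install      # install the frontend dependencies listed in package.json
npm run build    # build the production frontend assets
```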
The frontend setup is done.
To start the application, go to the backend folder in the terminal/Command Prompt and enter this command:
bash start.sh
You should see something like the image below. Go to your browser and enter the URL http://0.0.0.0:8080, then register to the application and select a model to start using (you should see the list of models you pulled earlier).
![](https://questionbump.com/wp-content/uploads/2024/05/Screenshot-2024-05-13-at-17.20.52-1024x418.png)
Good luck! You now have an LLM running locally on your machine.