OpenWebUI

1- Installing as a Docker container

  • Install Docker the way you normally would, by adding its repositories or through whatever method you are used to
  • sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  • docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
  • To check that it is running, run sudo docker ps (a fuller check using the container logs is sketched right after this list)
  • Now, localhost:8080 should have your OpenWebUI running
  • Sign up / create an account (bound to the local instance); the first account you create automatically becomes an admin account!
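
If the container shows up but nothing answers on localhost:8080, the container's own logs are usually the quickest way to see what went wrong; these are plain Docker commands, nothing OpenWebUI-specific:

# show only the open-webui container
sudo docker ps --filter name=open-webui
# follow its logs
sudo docker logs -f open-webui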

At this stage, you should be good to go.

DeepSeek

What a pleasant surprise this is: something you can run locally on your computer, or use for a fraction of the cost that comes with OpenAI or Anthropic’s Claude!

DeepSeek-V3 is completely open-source and free. (https://github.com/deepseek-ai/DeepSeek-V3)

If you don’t have the hardware resources to run it yourself, it is also available through a web chat interface much like ChatGPT’s, and through an incredibly affordable API (a minimal example is sketched at the end of this section).

How affordable?

DeepSeek: $0.14 per million input tokens and $0.28 per million output tokens
Claude: $3.00 per million input tokens and $15.00 per million output tokens
ChatGPT: $2.50 per million input tokens and $10.00 per million output tokens

So the bottom line is that DeepSeek is roughly fifty times cheaper than Claude and around 35 times cheaper than ChatGPT (comparing output-token prices: $15.00 / $0.28 ≈ 54 and $10.00 / $0.28 ≈ 36); that is, you pay around two to three percent of the price. But what about quality?

In most scenarios it is comparable: in some cases DeepSeek wins, in others Claude or ChatGPT wins, but it is clearly up there!
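
If you go the API route, DeepSeek's API is OpenAI-compatible; a minimal chat request looks something like this (the endpoint and the deepseek-chat model name are taken from DeepSeek's own documentation, and the key is of course a placeholder):

curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
        "model": "deepseek-chat",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'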

Ollama

1- Installing

1.1 – Linux

On Debian Linux, installing Ollama is a one-liner; just enter the following in your terminal:

curl -fsSL https://ollama.com/install.sh | sh

Yup, that is it; move on to using Ollama.

1.2 – Windows and Mac

Just go to https://ollama.com/, download it, and run the installer! You are done.

Using it!

Using Ollama is simple: just open your terminal window or command prompt, activate your conda environment (or venv), and run the following command. For the sake of this example, I will run llama3.2:

conda activate projectName
ollama run llama3.2

llama3.3, with its 70 billion parameters, will require a minimum of 64GB of RAM, so don’t try that unless you have the RAM for it! For comparison, the default llama3.2 has around 3 billion parameters, roughly 4% of 3.3’s size.

It will now download about 2GB of data (the default llama3.2 model has around 3 billion parameters), and you are done. Now you can ask it anything.

For example: “Create an article for me explaining this and that.”

Once done, just enter “/bye” to exit the Ollama prompt and quit the session.

If you want to, for example, clear the context or do anything else, just use the command /? to get a list of commands.
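
A few of the entries you will typically see in that list (the exact set depends on your Ollama version):

/clear – clear the session context
/show – show information about the current model
/bye – exit the session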

Now you have used llama3.2, but on the Ollama models page (https://ollama.com/library) you will find many others that you can use!

These include models that help you with coding, and models more targeted towards chatbot-style Q&A; either way, you should take a close look at them, even if just for the fun of it.
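
For example, you can check what you already have locally with ollama list, and pull another model without immediately starting a chat (codellama here is just one of the coding-focused models listed on that page):

# list the models already downloaded on this machine
ollama list
# download a coding-focused model without starting a chat session
ollama pull codellama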

Is Ollama running?

Just visit http://localhost:11434/ in your browser, and you should see the message “Ollama is running”.
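
The same check works from the terminal, which is handy on a headless server:

curl http://localhost:11434/
# should print: Ollama is running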

Modifying a model’s settings

One example of a setting you may want to modify: you have a GPU, but you do not want Ollama to use it. To do this, you will need to create a Modelfile; the steps to create this file for llama3.2 (the small one) are as follows:

# Copy the llama 3.2 base model's config
ollama show llama3.2:latest --modelfile > ~/cpullama3.2.modelfile
# Edit the file ~/cpullama3.2.modelfile so that the FROM line reads
FROM llama3.2:latest
# Go to the parameters section and add the parameters you need
# In our case, PARAMETER num_gpu 0
PARAMETER num_gpu 0
# Create your custom model
ollama create cpullama3.2 --file ~/cpullama3.2.modelfile

The last command above resulted in the following output:

transferring model data 
using existing layer sha256:dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff
using existing layer sha256:fcc5a6bec9daf9b561a68827b67ab6088e1dba9d1fa2a50d7bbcc8384e0a265d
using existing layer sha256:a70ff7e570d97baaf4e62ac6e6ad9975e04caa6d900d3742d37698494479e0cd
using existing layer sha256:966de95ca8a62200913e3f8bfbf84c8494536f1b94b49166851e76644e966396
using existing layer sha256:fcc5a6bec9daf9b561a68827b67ab6088e1dba9d1fa2a50d7bbcc8384e0a265d
using existing layer sha256:a70ff7e570d97baaf4e62ac6e6ad9975e04caa6d900d3742d37698494479e0cd
creating new layer sha256:650ff8e84978b35dd2f3ea3653ed6bf020a95e7deb031ceae487cdd98dedc2e3
creating new layer sha256:f29c86d4cf6a4072deefa0ff196b7960da63b229686497b02aad4f5202d263ea
writing manifest
success

So above, you simply created a “model” by copying and editing the existing model’s config file, nothing more, nothing less!
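
You can now run this CPU-only copy like any other model, and confirm that the extra parameter made it into its config:

ollama run cpullama3.2
# confirm num_gpu shows up in the new model's config
ollama show cpullama3.2 --modelfile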

Ollama API

So far, your terminal has allowed you to chat with the model, much like what you do when you open Claude or ChatGPT. If you want to access things via the API instead, here is how.
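
As a quick taste, the same server on port 11434 exposes a REST API; a minimal, non-streaming request to its /api/generate endpoint looks like this:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'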