OpenWebUI

1- Installing as a Docker container

  • Install Docker the way you normally would, by adding its repositories or through whatever method you are used to
  • sudo apt install docker-ce docker-ce-cli containerd.io docker-buildx-plugin docker-compose-plugin
  • docker run -d --network=host -v open-webui:/app/backend/data -e OLLAMA_BASE_URL=http://127.0.0.1:11434 --name open-webui --restart always ghcr.io/open-webui/open-webui:main
  • To check that it is running, run sudo docker ps (a fuller check using the container logs is sketched right after this list)
  • Now, localhost:8080 should have your OpenWebUI running
  • Sign up / create an account (bound to the local instance); the first account you create automatically becomes an admin account!
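
If the container shows up but nothing answers on localhost:8080, the container's own logs are usually the quickest way to see what went wrong; these are plain Docker commands, nothing OpenWebUI-specific:

# show only the open-webui container
sudo docker ps --filter name=open-webui
# follow its logs
sudo docker logs -f open-webui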

At this stage, you should be good to go.

DeepSeek

What a pleasant surprise this is: something you can run locally on your computer, or use for a fraction of the cost that comes with OpenAI or Anthropic’s Claude!

DeepSeek-V3 is completely open-source and free. (https://github.com/deepseek-ai/DeepSeek-V3)

If you don’t have the hardware resources to run it yourself, it is also available through a web chat interface much like ChatGPT’s, and through an incredibly affordable API (a minimal example is sketched at the end of this section).

How affordable?

DeepSeek: $0.14 per million input tokens and $0.28 per million output tokens
Claude: $3.00 per million input tokens and $15.00 per million output tokens
ChatGPT: $2.50 per million input tokens and $10.00 per million output tokens

So the bottom line is that DeepSeek is roughly fifty times cheaper than Claude and around 35 times cheaper than ChatGPT (comparing output-token prices: $15.00 / $0.28 ≈ 54 and $10.00 / $0.28 ≈ 36); that is, you pay around two to three percent of the price. But what about quality?

In most scenarios it is comparable: in some cases DeepSeek wins, in others Claude or ChatGPT wins, but it is clearly up there!
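
If you go the API route, DeepSeek's API is OpenAI-compatible; a minimal chat request looks something like this (the endpoint and the deepseek-chat model name are taken from DeepSeek's own documentation, and the key is of course a placeholder):

curl https://api.deepseek.com/chat/completions \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
  -d '{
        "model": "deepseek-chat",
        "messages": [{"role": "user", "content": "Hello!"}]
      }'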

Ollama

1- Installing

1.1 – Linux

On Debian Linux, installing Ollama is a one-liner; just enter the following in your terminal:

curl -fsSL https://ollama.com/install.sh | sh

Yup, that is it; move on to using Ollama.

1.2 – Windows and Mac

Just go to https://ollama.com/, download it, and run the installer! You are done.

Using it!

Using Ollama is simple: just open your terminal window or command prompt, activate your conda environment (or venv), and run the following command. For the sake of this example, I will run llama3.2:

conda activate projectName
ollama run llama3.2

llama3.3, with its 70 billion parameters, will require a minimum of 64GB of RAM, so don’t try that unless you have the RAM for it! For comparison, the default llama3.2 has around 3 billion parameters, roughly 4% of 3.3’s size.

It will now download about 2GB of data (the default llama3.2 model has around 3 billion parameters), and you are done. Now you can ask it anything.

For example: “Create an article for me explaining this and that.”

Once done, just enter “/bye” to exit the Ollama prompt and quit the session.

If you want to, for example, clear the context or do anything else, just use the command /? to get a list of commands.
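
A few of the entries you will typically see in that list (the exact set depends on your Ollama version):

/clear – clear the session context
/show – show information about the current model
/bye – exit the session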

Now you have used llama3.2, but on the Ollama models page (https://ollama.com/library) you will find many others that you can use!

These include models that help you with coding, and models more targeted towards chatbot-style Q&A; either way, you should take a close look at them, even if just for the fun of it.
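
For example, you can check what you already have locally with ollama list, and pull another model without immediately starting a chat (codellama here is just one of the coding-focused models listed on that page):

# list the models already downloaded on this machine
ollama list
# download a coding-focused model without starting a chat session
ollama pull codellama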

Is Ollama running?

Just visit http://localhost:11434/ in your browser, and you should see the message “Ollama is running”.
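
The same check works from the terminal, which is handy on a headless server:

curl http://localhost:11434/
# should print: Ollama is running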

Modifying a model’s settings

One example of a setting you may want to modify: you have a GPU, but you do not want Ollama to use it. To do this, you will need to create a Modelfile; the steps to create this file for llama3.2 (the small one) are as follows:

# Copy the llama 3.2 base model's config
ollama show llama3.2:latest --modelfile > ~/cpullama3.2.modelfile
# Edit the file ~/cpullama3.2.modelfile so that the FROM line reads
FROM llama3.2:latest
# Go to the parameters section and add the parameters you need
# In our case, PARAMETER num_gpu 0
PARAMETER num_gpu 0
# Create your custom model
ollama create cpullama3.2 --file ~/cpullama3.2.modelfile

The last command above resulted in the following output:

transferring model data 
using existing layer sha256:dde5aa3fc5ffc17176b5e8bdc82f587b24b2678c6c66101bf7da77af9f7ccdff
using existing layer sha256:fcc5a6bec9daf9b561a68827b67ab6088e1dba9d1fa2a50d7bbcc8384e0a265d
using existing layer sha256:a70ff7e570d97baaf4e62ac6e6ad9975e04caa6d900d3742d37698494479e0cd
using existing layer sha256:966de95ca8a62200913e3f8bfbf84c8494536f1b94b49166851e76644e966396
using existing layer sha256:fcc5a6bec9daf9b561a68827b67ab6088e1dba9d1fa2a50d7bbcc8384e0a265d
using existing layer sha256:a70ff7e570d97baaf4e62ac6e6ad9975e04caa6d900d3742d37698494479e0cd
creating new layer sha256:650ff8e84978b35dd2f3ea3653ed6bf020a95e7deb031ceae487cdd98dedc2e3
creating new layer sha256:f29c86d4cf6a4072deefa0ff196b7960da63b229686497b02aad4f5202d263ea
writing manifest
success

So above, you simply created a “model” by copying and editing the existing model’s config file, nothing more, nothing less!
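
You can now run this CPU-only copy like any other model, and confirm that the extra parameter made it into its config:

ollama run cpullama3.2
# confirm num_gpu shows up in the new model's config
ollama show cpullama3.2 --modelfile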

Ollama API

So far, your terminal has allowed you to chat with the model, much like what you do when you open Claude or ChatGPT. If you want to access things via the API instead, here is how.
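
As a quick taste, the same server on port 11434 exposes a REST API; a minimal, non-streaming request to its /api/generate endpoint looks like this:

curl http://localhost:11434/api/generate -d '{
  "model": "llama3.2",
  "prompt": "Why is the sky blue?",
  "stream": false
}'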