
Getting Started with LLM Inference Using Ollama
This directory covers two approaches that differ in how the base Ollama server and the Large Language Model (LLM) are run:
- Using the official Ollama container image, which bundles the complete software stack and binaries needed to run Ollama.
- Setting up Ollama manually in your user directories, which requires downloading the binaries and adjusting paths yourself.
Furthermore, this directory includes two examples:
- Using a standard REST API request to prompt the LLM (a minimal request sketch follows this list)
- Engaging with the LLM via the ollama-python library
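For the REST variant, the request has roughly the following shape. This is only a minimal sketch assuming the server is already running on its default port 11434 and that a model such as llama3 is available; the example files in this directory are the authoritative version:
# prompt the local Ollama server via its REST API (llama3 is an example model name)
curl http://localhost:11434/api/generate \
  -d '{"model": "llama3", "prompt": "Why is the sky blue?", "stream": false}'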
More information about Ollama is available via the following links:
0. Prerequisites
To demonstrate how to use Ollama with the ollama-python library, you first need to create a Python virtual environment. Run the following commands ONCE:
# Specify the Ollama root directory
export OLLAMA_ROOT_DIR=${HOME}/ollama
mkdir -p ${OLLAMA_ROOT_DIR}
# set further relative path variables
source set_paths.sh
# create the venv
module load Python
python -m venv ${OLLAMA_VENV_DIR}
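# activate the venv and install the ollama-python client library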
source ${OLLAMA_VENV_DIR}/bin/activate
pip install ollama
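In later sessions (for example inside a batch job), the environment only needs to be re-activated. A minimal sketch, assuming set_paths.sh defines OLLAMA_VENV_DIR as above:
# re-activate the existing venv in a fresh shell
export OLLAMA_ROOT_DIR=${HOME}/ollama
source set_paths.sh
source ${OLLAMA_VENV_DIR}/bin/activate
# quick import check of the ollama-python client
python -c "import ollama; print(ollama.__name__)"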
1. Running Ollama
Note
The examples here run ollama serve and ollama run in the background to enable concise demonstrations from a single script or shell. However, the official examples also show that these commands can be run in separate shells on the same node instead.
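As a rough illustration of that pattern, the following sketch starts the server in the background and sends a single prompt; the model name llama3 is only an example here, not necessarily the one used by the scripts in this directory:
# start the server in the background and wait briefly for it to come up
ollama serve > ollama_serve.log 2>&1 &
sleep 5
# pull the model (if needed) and send a single prompt; llama3 is an example name
ollama run llama3 "Why is the sky blue?"
# stop the background server when done
kill %1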
1.1. Running Ollama with the official container (recommended)
An Ollama container image will soon be provided centrally on our HPC system. For now, let's assume we build one ourselves with the following commands:
# Specify the Ollama root directory
export OLLAMA_ROOT_DIR=${HOME}/ollama
# set further relative path variables
source set_paths.sh
# build Ollama apptainer container
apptainer build ${OLLAMA_CONTAINER_IMAGE} docker://ollama/ollama
Then you can start using the examples right away, either in your current shell or by submitting a batch job to run them on a backend node:
# run in current active shell
zsh submit_job_container.sh
# submit batch job
sbatch submit_job_container.sh
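Under the hood, the container approach boils down to starting the Ollama server through Apptainer. A minimal sketch, assuming the image path variable comes from set_paths.sh, GPUs are passed through with --nv, and the model directory is an assumed layout:
# optionally keep downloaded models outside the container (assumed directory layout)
export APPTAINERENV_OLLAMA_MODELS=${OLLAMA_ROOT_DIR}/models
# start the Ollama server from inside the container with GPU passthrough
apptainer exec --nv ${OLLAMA_CONTAINER_IMAGE} ollama serve > ollama_serve.log 2>&1 &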
1.2. Downloading and running Ollama manually
Before you can execute Ollama and run the examples, you need to download Ollama and make it available to the subsequent workflow steps.
Execute the following instructions ONCE to download Ollama:
# Specify the Ollama root directory
export OLLAMA_ROOT_DIR=${HOME}/ollama
# set further relative path variables
source set_paths.sh
# create required directory and download Ollama binaries
mkdir -p ${OLLAMA_INSTALL_DIR} && cd ${OLLAMA_INSTALL_DIR}
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
tar -xzf ollama-linux-amd64.tgz
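To use the binary interactively, make it available on your PATH. A minimal sketch, assuming the tarball extracts the binary into a bin/ subdirectory (the archive layout may differ between releases):
# make the extracted binary available in the current shell
export PATH=${OLLAMA_INSTALL_DIR}/bin:${PATH}
# quick sanity check
ollama --version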
Now you can run the examples, either in the current shell or by submitting a batch job that executes them on a backend node:
# run in current active shell
zsh submit_job_venv.sh
# submit batch job
sbatch submit_job_venv.sh
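For orientation, such a batch script typically just combines the pieces above. The sketch below is an assumption about its structure, not the actual contents of submit_job_venv.sh; the resource requests and the Python script name are placeholders:
#!/usr/bin/env zsh
#SBATCH --gres=gpu:1           # placeholder resource request
#SBATCH --time=00:30:00        # placeholder time limit

# set up paths, venv, and the manually downloaded binary
export OLLAMA_ROOT_DIR=${HOME}/ollama
source set_paths.sh
source ${OLLAMA_VENV_DIR}/bin/activate
export PATH=${OLLAMA_INSTALL_DIR}/bin:${PATH}

# start the server in the background and give it a moment to come up
ollama serve > ollama_serve.log 2>&1 &
sleep 5

# run the Python example against the local server (script name is hypothetical)
python ollama_python_example.py

# stop the background server
kill %1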