Commit 3cb067e9 authored by Jannis Klinkenberg

moved files and added Ollama

parent af2fa2e0
## What can you find here?
| Folder | Sub-Folder | Description |
|--------|------------|-------------|
| [generic-job-scripts](generic-job-scripts) | - | General-purpose job submission scripts (our current workload manager is Slurm) for various workloads, including CPU- and GPU-based computations. |
| [machine-and-deep-learning](machine-and-deep-learning) | - | Collection of examples in the context of Machine Learning (ML), Deep Learning (DL), and Large Language Models (LLMs). |
| | [ollama](machine-and-deep-learning/ollama) | Examples of how to run and use LLMs with Ollama. |
| | [pytorch](machine-and-deep-learning/pytorch) | Example scripts and best practices for running PyTorch workloads on an HPC cluster, including distributed training and GPU utilization. |
| | [scikit-learn](machine-and-deep-learning/scikit-learn) | HPC-friendly examples of using Scikit-Learn, including job submission scripts for machine learning model training. |
| | [tensorflow](machine-and-deep-learning/tensorflow) | TensorFlow job scripts and performance optimization techniques for running deep learning models on CPUs and GPUs in an HPC environment. |
# Ollama - Running temporary Large Language Models (LLMs)
This directory outlines two distinct approaches, which differ in how the base Ollama server and the LLM are run:
1. An approach utilizing the official Ollama container image, which encompasses the entire software stack and necessary binaries to operate Ollama.
2. An approach involving manual setup of Ollama within your user directories, requiring you to download binaries and modify paths accordingly.
Furthermore, this directory includes two examples:
- Using a standard REST API request to prompt the LLM
- Engaging with the LLM via the `ollama-python` library.
Please find more information about Ollama at the following links:
- https://github.com/ollama/ollama
- https://github.com/ollama/ollama-python
## 1. Running Ollama with the official Container (recommended)
... follows soon ...
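In the meantime, the following sketch illustrates the general idea, assuming Apptainer is available as the container runtime on the cluster (image name and commands are illustrative and not yet part of this repository):

```bash
# pull the official Ollama image from Docker Hub and convert it into an Apptainer image file
apptainer pull ollama.sif docker://ollama/ollama

# start the Ollama server inside the container with NVIDIA GPU support (--nv)
apptainer exec --nv ollama.sif ollama serve
```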
## 2. Downloading and Running Ollama manually
Before being able to execute Ollama and run the examples, you need to download Ollama and make it available to the upcoming workflow steps. Additionally, we use a Python virtual environment to demonstrate how Ollama can be used via the `ollama-python` library.
Execute the following instructions **ONCE** to download Ollama and create the virtual environment:
```bash
# specify the Ollama root directory where the binaries should be placed and where the venv should be created, such as:
export OLLAMA_ROOT_DIR=${HOME}/ollama
# initialize environment variables that refer to installation and virtual environment
source set_paths.sh
# Download the Ollama binaries and create the venv
zsh download_and_create_venv.sh
```
Now you can execute the examples, either in the current shell or by submitting a batch job that runs the examples on a backend node:
```bash
# run in current active shell
zsh submit_job_venv.sh
# submit batch job
sbatch submit_job_venv.sh
```
#!/usr/bin/zsh
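# download_and_create_venv.sh: downloads the Ollama binaries and creates the Python venv (run ONCE)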
# create required directories
mkdir -p ${OLLAMA_ROOT_DIR}
mkdir -p ${OLLAMA_INSTALL_DIR}
# download Ollama binaries
cd ${OLLAMA_INSTALL_DIR}
curl -L https://ollama.com/download/ollama-linux-amd64.tgz -o ollama-linux-amd64.tgz
tar -xzf ollama-linux-amd64.tgz
# create Python virtual environment
module load Python
python -m venv ${OLLAMA_VENV_DIR}
# activate the environment
source ${OLLAMA_VENV_DIR}/bin/activate
# install the ollama-python library
pip install ollama
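# ollama-example.py: minimal chat prompt against a running Ollama server via the ollama-python library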
from ollama import chat
from ollama import ChatResponse

response: ChatResponse = chat(model='llama3.2', messages=[
    {
        'role': 'user',
        'content': 'Why is the sky blue?',
    },
])
print(response['message']['content'])
# or access fields directly from the response object
# print(response.message.content)
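
# a possible streaming variant (sketch, not part of this example): iterate over
# response chunks as they arrive instead of waiting for the full answer
# for chunk in chat(model='llama3.2', messages=[{'role': 'user', 'content': 'Why is the sky blue?'}], stream=True):
#     print(chunk['message']['content'], end='', flush=True)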
#!/usr/bin/zsh
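# set_paths.sh: initializes environment variables that refer to the Ollama installation and the virtual environment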
# path where Ollama binaries will be placed after download and extraction
export OLLAMA_INSTALL_DIR=${OLLAMA_ROOT_DIR}/install
# path to Python virtual environment
export OLLAMA_VENV_DIR=${OLLAMA_ROOT_DIR}/venv_ollama
# extend PATH so that the ollama binary can be invoked directly from the shell
export PATH="${OLLAMA_INSTALL_DIR}/bin:${PATH}"
#!/usr/bin/zsh
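# submit_job_venv.sh: runs the Ollama examples, either directly in the current shell
# (zsh submit_job_venv.sh) or as a batch job on a backend node (sbatch submit_job_venv.sh)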
############################################################
### Slurm flags
############################################################
#SBATCH --time=00:15:00
#SBATCH --partition=c23g
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=1
#SBATCH --cpus-per-task=24
#SBATCH --gres=gpu:1
############################################################
### Load modules or software
############################################################
# specify your Ollama root directory
export OLLAMA_ROOT_DIR=${HOME}/ollama
# set dependent paths
source set_paths.sh
# load Python and activate venv
module load Python
source ${OLLAMA_VENV_DIR}/bin/activate
############################################################
### Parameters and Settings
############################################################
# print some information about current system
echo "Job nodes: ${SLURM_JOB_NODELIST}"
echo "Current machine: $(hostname)"
nvidia-smi
############################################################
### Execution (Ollama server and examples)
############################################################
# run server in background and redirect output
ollama serve &> log_ollama_serve.log &
# remember ID of process that has just been started in background
export proc_id_serve=$!
# wait until Ollama server is up
sleep 5
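# (a more robust alternative to the fixed sleep would be to poll the REST API until the server responds, e.g.:
#  until curl -s http://localhost:11434/api/tags > /dev/null; do sleep 1; done)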
# run desired model instance in background
ollama run llama3.2 &
# wait until model is up
sleep 5
# Example: prompt the LLM via the REST API (Note: streaming is typically only useful with a chat frontend)
echo "========== Example REST API =========="
curl http://localhost:11434/api/generate -d '{"model": "llama3.2", "prompt":"Why is the sky blue?", "stream": false}'
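# (with "stream": false, the answer arrives as a single JSON object whose "response" field holds the generated text)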
echo "\n"
# Example: prompt the LLM via the ollama-python library
echo "========== Example Python via ollama-python =========="
python3 ollama-example.py
# cleanup: stop model and kill serve and run processes
ollama stop llama3.2
kill -9 ${proc_id_serve}
# kill remaining ollama procs if not already done
ps aux | grep '[o]llama' | awk '{print $2}' | xargs -r kill -9