From 7e2cb9860399e95582fd4e1c598f02f56affaefe Mon Sep 17 00:00:00 2001
From: Jannis Klinkenberg <j.klinkenberg@itc.rwth-aachen.de>
Date: Wed, 19 Mar 2025 15:45:46 +0100
Subject: [PATCH] more content for job scripts

---
 generic-job-scripts/README.md       | 52 ++++++++++++++++++++++++++++++++++--
 generic-job-scripts/gpu_job_1gpu.sh |  3 ++-
 2 files changed, 52 insertions(+), 3 deletions(-)

diff --git a/generic-job-scripts/README.md b/generic-job-scripts/README.md
index 1763d6a..9a1ab62 100644
--- a/generic-job-scripts/README.md
+++ b/generic-job-scripts/README.md
@@ -1,8 +1,28 @@
 # Generic Slurm Job Script Examples
 
-This folder contains common job script examples and best practices. You can submit jobs to the Slurm batch system via `sbatch <script-name>.sh`.
+This folder contains common job script examples and best practices.
 
-## What can you find here?
+## Asynchronous jobs
+
+The following table lists examples of asynchronous jobs, which contain both:
+- The allocation requests for your job, e.g. in the form of `#SBATCH` flags in your batch script
+- The actual tasks or instructions that your job needs to perform
+
+You can submit such jobs to the Slurm batch system via `sbatch <parameters> <script-name>` (detailed documentation [here](https://slurm.schedmd.com/sbatch.html)). Typically, these jobs are queued and then started by the workload manager as soon as the requested resources are free and it is your turn to compute. (Remember: many people might want to use these hardware resources, so Slurm needs to find a fair compromise.)
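+
+For illustration, here is a minimal sketch of what such a batch script could look like (the job name, output file, and resource numbers are only placeholders that you should adapt):
+
+```zsh
+#!/usr/bin/env zsh
+#SBATCH --job-name=my_first_job     # name of the job shown in the queue
+#SBATCH --output=output_%j.log      # file for stdout/stderr (%j = job ID)
+#SBATCH --time=00:15:00             # maximum requested runtime (walltime)
+#SBATCH --ntasks=1                  # number of tasks (processes)
+#SBATCH --cpus-per-task=1           # CPU cores per task
+
+# the actual instructions your job needs to perform
+echo "Job ${SLURM_JOB_ID} is running on host ${HOSTNAME}"
+```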
 
 | File/Folder | Description |
 |--------|-------------|
@@ -16,3 +36,31 @@ This folder contains common job script examples and best practices. You can subm
 | [mpi_job_basic.sh](mpi_job_basic.sh) | A basic MPI job script, useful for testing and learning MPI-based job submission. |
 | [mpi_job_1node.sh](mpi_job_1node.sh) | Runs an MPI job on a single node, demonstrating intra-node parallel processing with multiple processes per node. |
 | [mpi_job_2nodes.sh](mpi_job_2nodes.sh) | Runs an MPI job spanning 2 full compute nodes, demonstrating inter-node parallelism and distributed computing across multiple machines. |
+
+## Interactive jobs
+
+Sometimes you are still in the testing/debugging phase or do not yet know exactly what your job script instructions should look like. In such cases, an *interactive job* might be what you want.
+
+An interactive job allows users to run commands in real time on an HPC cluster, making it useful for debugging, testing scripts, or exploring data interactively. Unlike asynchronous batch jobs, which are submitted to a queue and executed without user interaction, interactive jobs provide immediate feedback and enable on-the-fly adjustments. This is especially valuable when developing or fine-tuning workflows before submitting long-running batch jobs.
+
+In such a case, you only define your resource requirements and boundary conditions with `salloc` (detailed documentation [here](https://slurm.schedmd.com/salloc.html)). After the job has been scheduled by Slurm, the system will provide a regular shell for interactive work. Here are a few examples:
+
+### Example: Interactive job on CPU resources for OpenMP (full node)
+```zsh
+salloc --time=00:15:00 --nodes=1 --ntasks-per-node=1 --cpus-per-task=96 --partition=c23ms
+```
+
+### Example: Interactive job on CPU resources for MPI (2 full nodes)
+```zsh
+salloc --time=00:15:00 --nodes=2 --ntasks-per-node=96 --partition=c23ms
+```
+
+### Example: Interactive job on CPU resources for hybrid MPI+OpenMP (2 full nodes)
+```zsh
+salloc --time=00:15:00 --nodes=2 --ntasks-per-node=4 --cpus-per-task=24 --partition=c23ms
+```
+
+### Example: Interactive job on GPU resources (using 1 GPU)
+```zsh
+salloc --time=00:15:00 --nodes=1 --ntasks-per-node=1 --cpus-per-task=24 --gres=gpu:1 --partition=c23g
+```
\ No newline at end of file
diff --git a/generic-job-scripts/gpu_job_1gpu.sh b/generic-job-scripts/gpu_job_1gpu.sh
index 74a97a3..5c651f4 100644
--- a/generic-job-scripts/gpu_job_1gpu.sh
+++ b/generic-job-scripts/gpu_job_1gpu.sh
@@ -33,4 +33,5 @@ nvidia-smi
 
 # Example: Only a single GPU is used. However, due to billing
 # settings, 24 CPU cores can be requested and used
-# for free.
\ No newline at end of file
+# in conjunction with that GPU. That also enables
+# multi-threaded preprocessing on the CPU side.
\ No newline at end of file
--
GitLab