Introduction to Kebnekaise

Objectives

  • Learn how to load the necessary modules on Kebnekaise.

  • Learn how to compile C code on Kebnekaise.

  • Learn how to compile CUDA code on Kebnekaise.

  • Learn how to submit jobs to the batch queue.

  • Learn how to use the course project and reservations.

Modules and toolchains

You need to load the correct toolchain before compiling your code on Kebnekaise.

The available modules are listed using the ml avail command:

$ ml avail
------------------------- /hpc2n/eb/modules/all/Core --------------------------
Bison/3.0.5                        fosscuda/2020a
Bison/3.3.2                        fosscuda/2020b        (D)
Bison/3.5.3                        gaussian/16.C.01-AVX2
Bison/3.7.1                (D)     gcccuda/2019b
CUDA/8.0.61                        gcccuda/2020a
CUDA/10.1.243              (D)     gcccuda/2020b         (D)
...

The list shows the modules that you can load directly; it may therefore change depending on which modules you have already loaded.

To see all modules, including those that require other modules to be loaded first, use the ml spider command. Many application software packages fall into this category.

You can find more information regarding a particular module using the ml spider <module> command:

$ ml spider MATLAB

---------------------------------------------------------------------------
MATLAB: MATLAB/2019b.Update2
---------------------------------------------------------------------------
    Description:
    MATLAB is a high-level language and interactive environment that
    enables you to perform computationally intensive tasks faster than
    with traditional programming languages such as C, C++, and Fortran.


    This module can be loaded directly: module load MATLAB/2019b.Update2

    Help:
    Description
    ===========
    MATLAB is a high-level language and interactive environment
    that enables you to perform computationally intensive tasks faster than with
    traditional programming languages such as C, C++, and Fortran.


    More information
    ================
    - Homepage: http://www.mathworks.com/products/matlab

You can load the module using the ml <module> command:

$ ml MATLAB/2019b.Update2

You can list loaded modules using the ml command:

$ ml

Currently Loaded Modules:
 1) snicenvironment     (S)   7) libevent/2.1.11    13) PMIx/3.0.2
 2) systemdefault       (S)   8) numactl/2.0.12     14) impi/2018.4.274
 3) GCCcore/8.2.0             9) XZ/5.2.4           15) imkl/2019.1.144
 4) zlib/1.2.11              10) libxml2/2.9.8      16) intel/2019a
 5) binutils/2.31.1          11) libpciaccess/0.14  17) MATLAB/2019b.Update2
 6) iccifort/2019.1.144      12) hwloc/1.11.11

Where:
 S:  Module is Sticky, requires --force to unload or purge

You can unload all modules using the ml purge command:

$ ml purge
The following modules were not unloaded:
  (Use "module --force purge" to unload all):

  1) systemdefault   2) snicenvironment

Note that the ml purge command will warn that two modules were not unloaded. This is normal and you should NOT force unload them.

Challenge

  1. Load the FOSS CUDA toolchain for source code compilation:

    $ ml purge
    $ ml fosscuda/2020b buildenv
    

    The fosscuda/2020b module loads the GNU compiler suite, the CUDA SDK, and several other libraries. The buildenv module sets environment variables that are needed when compiling source code.

  2. Investigate which modules were loaded.

  3. Purge all modules.

  4. Find the latest FOSS toolchain (foss). Load it and the buildenv module. Investigate the loaded modules. Purge all modules. (One possible command sequence is sketched below.)
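
One possible command sequence for the challenge above is sketched here. It assumes that foss/2020b is still the latest FOSS toolchain; verify this with ml spider foss before loading it:

$ ml                      # list the currently loaded modules
$ ml purge                # unload everything except the sticky modules
$ ml spider foss          # find the available FOSS toolchain versions
$ ml foss/2020b buildenv  # load the toolchain and the build environment (version assumed)
$ ml                      # verify what was loaded
$ ml purge                # clean up again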

Compile C code

Once the correct toolchain (foss/2020b) has been loaded, we can compile C source files (*.c) with the GNU compiler:

$ gcc -o <binary name> <sources> -Wall

The -Wall flag causes the compiler to print additional warnings.

Challenge

Compile the following “Hello world” program:

#include <stdio.h>

int main() {
    printf("Hello world!\n");
    return 0;
}
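
Assuming the source file is saved as hello.c (both the file name and the binary name hello below are arbitrary), the compile command would look like this:

$ gcc -o hello hello.c -Wall    # file and binary names are just examples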

Compile CUDA code

Once the correct toolchain (fosscuda/2020b) has been loaded, we can compile CUDA source files (*.cu) with the nvcc compiler:

$ nvcc -o <binary name> <sources> -Xcompiler="-Wall"

The -Xcompiler option passes the -Wall flag to the host compiler (g++), which causes it to print additional warnings.

Challenge

Compile the following “Hello world” program:

#include <stdio.h>

__global__ void say_hello()
{
    printf("A device says, Hello world!\n");
}

int main()
{
    printf("The host says, Hello world!\n");
    say_hello<<<1,1>>>();
    cudaDeviceSynchronize();
    return 0;
}
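
Assuming the source file is saved as hello.cu (again, the file name and the binary name hello_cuda are arbitrary), the compile command would look like this:

$ nvcc -o hello_cuda hello.cu -Xcompiler="-Wall"    # file and binary names are just examples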

Course project and reservation

During the course, you can use the course reservations (snic2021-22-272-cpu-day[1|2|3] and snic2021-22-272-gpu-day[1|2|3]) to get faster access to the compute nodes. The reservations are valid between 9:00 and 13:00 on each of the three days (10-12 May 2021). Note that capitalization matters for reservation names!

Day          CPU only                     CPU + GPU
Monday       snic2021-22-272-cpu-day1     snic2021-22-272-gpu-day1
Tuesday      snic2021-22-272-cpu-day2     snic2021-22-272-gpu-day2
Wednesday    snic2021-22-272-cpu-day3     snic2021-22-272-gpu-day3

Note that jobs that are submitted using a reservation are not scheduled outside the reservation time window. You can, however, submit jobs without the reservation as long as you are a member of an active project. The course project SNIC2021-22-272 is valid until 2021-06-01.

Submitting jobs

The jobs are submitted using the srun command:

$ srun --account=<account> --ntasks=<task count> --time=<time> <command>

This places the command into the batch queue. The three arguments are the project number, the number of tasks, and the requested time allocation. For example, the following command prints the uptime of the allocated compute node:

$ srun --account=SNIC2021-22-272 --ntasks=1 --time=00:00:15 uptime
srun: job 12727702 queued and waiting for resources
srun: job 12727702 has been allocated resources
 11:53:43 up 5 days,  1:23,  0 users,  load average: 23,11, 23,20, 23,27

Note that we are using the course project, the number of tasks is set to one, and we are requesting 15 seconds.

When the reservation is valid, you can specify it using the --reservation=<reservation> argument:

$ srun --account=SNIC2021-22-272 --reservation=snic2021-22-272-cpu-day1 --ntasks=1 --time=00:00:15 uptime
 11:58:43 up 6 days,  1:23,  0 users,  load average: 23,11, 22,20, 21,27

where N in dayN is 1, 2, or 3, and cpu can be replaced with gpu if you are running a GPU job.

We could submit multiple tasks using the --ntasks=<task count> argument:

$ srun --account=SNIC2021-22-272 --reservation=snic2021-22-272-cpu-day1 --ntasks=4 --time=00:00:15 uname -n
b-cn0932.hpc2n.umu.se
b-cn0932.hpc2n.umu.se
b-cn0932.hpc2n.umu.se
b-cn0932.hpc2n.umu.se

Note that all tasks run on the same node. We could request multiple CPU cores for each task using the --cpus-per-task=<cpu count> argument:

$ srun --account=SNIC2021-22-272 --reservation=snic2021-22-272-cpu-day1 --ntasks=4 --cpus-per-task=14 --time=00:00:15 uname -n
b-cn0935.hpc2n.umu.se
b-cn0935.hpc2n.umu.se
b-cn0932.hpc2n.umu.se
b-cn0932.hpc2n.umu.se

If you want to measure performance, it is advisable to request exclusive access to the compute nodes (--exclusive):

$ srun --account=SNIC2021-22-272 --reservation=snic2021-22-272-cpu-day1 --ntasks=4 --cpus-per-task=14 --exclusive --time=00:00:15 uname -n
b-cn0935.hpc2n.umu.se
b-cn0935.hpc2n.umu.se
b-cn0932.hpc2n.umu.se
b-cn0932.hpc2n.umu.se

Finally, we could request a single Nvidia Tesla V100 GPU and 14 CPU cores using the --gres=gpu:v100:1,gpuexcl argument:

$ srun --account=SNIC2021-22-272 --reservation=snic2021-22-272-gpu-day1 --ntasks=1 --gres=gpu:v100:1,gpuexcl --time=00:00:15 nvidia-smi
Wed Apr 21 12:59:15 2021
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.67       Driver Version: 460.67       CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  Tesla V100-PCIE...  On   | 00000000:58:00.0 Off |                    0 |
| N/A   33C    P0    26W / 250W |      0MiB / 16160MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Challenge

Run both “Hello world” programs on the compute nodes.
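
A minimal sketch, assuming the binaries were named hello and hello_cuda as in the compilation examples above:

$ srun --account=SNIC2021-22-272 --ntasks=1 --time=00:00:15 ./hello                                 # CPU version
$ srun --account=SNIC2021-22-272 --ntasks=1 --gres=gpu:v100:1,gpuexcl --time=00:00:15 ./hello_cuda  # GPU version

Add the --reservation=<reservation> argument if a course reservation is currently active.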

Aliases

In order to save time, you can create an alias for a command:

$ alias <alias>="<command>"

For example:

$ alias run_full="srun --account=SNIC2021-22-272 --reservation=snic2021-22-272-cpu-day1 --ntasks=1 --cpus-per-task=28 --time=00:05:00"
$ run_full uname -n
b-cn0932.hpc2n.umu.se

Batch files

It is often more convenient to write the commands into a batch file. For example, we could write the following to a file called batch.sh:

#!/bin/bash
#SBATCH --account=SNIC2021-22-272
#SBATCH --reservation=snic2021-22-272-cpu-day1
#SBATCH --ntasks=1
#SBATCH --time=00:00:15

ml purge
ml foss/2020b

uname -n

Note that the same arguments that were earlier passed to the srun command are now given as #SBATCH directives, which the shell treats as comments. It is highly advisable to purge all loaded modules and re-load the required ones, as the job inherits the environment from which it was submitted. The batch file is submitted using the sbatch <batch file> command:

$ sbatch batch.sh
Submitted batch job 12728675

By default, the output is directed to the file slurm-<job_id>.out, where <job_id> is the job id returned by the sbatch command:

$ cat slurm-12728675.out
The following modules were not unloaded:
 (Use "module --force purge" to unload all):

 1) systemdefault   2) snicenvironment
b-cn0102.hpc2n.umu.se

Challenge

Write two batch files that run both “Hello world” programs on the compute nodes.
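
As a starting point, a minimal sketch of the GPU batch file is given below (assuming the compiled binary is called hello_cuda and is located in the submission directory); the CPU version is analogous, with foss/2020b loaded and no --gres line:

#!/bin/bash
#SBATCH --account=SNIC2021-22-272
#SBATCH --ntasks=1
#SBATCH --gres=gpu:v100:1,gpuexcl
#SBATCH --time=00:00:15

ml purge
ml fosscuda/2020b

./hello_cuda   # the binary compiled earlier (name assumed)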

Job queue

You can investigate the job queue with the squeue command:

$ squeue -u $USER

If you want an estimate for when the job will start running, you can give the squeue command the argument --start.
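
For example, to show the estimated start times of your own queued jobs:

$ squeue -u $USER --start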

You can cancel a job with the scancel command:

$ scancel <job_id>