SRUN
An interactive job is a job that returns a command line prompt (instead of running a script) when the job runs. Interactive jobs are useful when debugging or interacting with an application. The srun command is used to submit an interactive job to Slurm. When the job starts, a command line prompt will appear on one of the compute nodes assigned to the job. From here commands can be executed using the resources allocated on the local node.
[user@discovery ~]$ srun --account=rc --pty /bin/bash
[user@p04 ~]$ hostname
p04.hpcc.dartmouth.edu
[user@p04 ~]$
Jobs submitted with srun --pty /bin/bash will be assigned the cluster default values of 1 CPU and 1024MB of memory. The account must also be specified; the job will not run otherwise. If additional resources are required, they can be requested as options to the srun command. The following example job is assigned 2 nodes, each with 4 CPUs (4 tasks per node, 1 CPU per task) and 4GB of memory (1GB per CPU):
srun --nodes=2 --ntasks-per-node=4 --mem-per-cpu=1GB --cpus-per-task=1 --account=rc --pty /bin/bash
[user@q06 ~]$
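Inside the interactive shell, Slurm describes the allocation through its standard environment variables (SLURM_JOB_NUM_NODES, SLURM_NTASKS_PER_NODE, SLURM_CPUS_PER_TASK, SLURM_MEM_PER_CPU). A minimal sketch of adding up what the request above provides; the values are hard-coded here for illustration, since these variables only exist inside a running job:

```shell
# Values Slurm would export inside the job for the request above
# (hard-coded here as an assumption, for illustration outside a job):
SLURM_JOB_NUM_NODES=2
SLURM_NTASKS_PER_NODE=4
SLURM_CPUS_PER_TASK=1
SLURM_MEM_PER_CPU=1024   # in MB

# Total CPUs and memory across the whole allocation:
TOTAL_CPUS=$((SLURM_JOB_NUM_NODES * SLURM_NTASKS_PER_NODE * SLURM_CPUS_PER_TASK))
TOTAL_MEM_MB=$((TOTAL_CPUS * SLURM_MEM_PER_CPU))
echo "$TOTAL_CPUS CPUs, $TOTAL_MEM_MB MB"   # 8 CPUs, 8192 MB
```

Inside a real job, echoing the SLURM_* variables directly is a quick way to confirm that the scheduler granted what was requested.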
Occasionally you may need to run an interactive job on a GPU node to test your code on GPU-capable hardware. You can query which GPU resources are available with sinfo -O gres -p <name of queue>:
$ sinfo -O gres
GRES
gpu:k80:4(S:1)
gpu:v100:4(S:0-1)
From this output we can see that K80 and V100 GPUs are available, four of each per node type (the S: suffix indicates which CPU socket(s) the GPUs are attached to). Now we can submit an interactive job requesting those specific resources. For example, to request a single K80 GPU:
srun -p gpuq --gres=gpu:k80:1 --pty /bin/bash
Requesting two K80 GPUs instead, the output of nvidia-smi confirms that both have been assigned:
$ srun -p gpuq --gres=gpu:k80:2 --pty /bin/bash
[user@g03 ~]$ nvidia-smi
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03 Driver Version: 460.32.03 CUDA Version: 11.2 |
|-------------------------------+----------------------+----------------------+
| GPU Name Persistence-M| Bus-Id Disp.A | Volatile Uncorr. ECC |
| Fan Temp Perf Pwr:Usage/Cap| Memory-Usage | GPU-Util Compute M. |
| | | MIG M. |
|===============================+======================+======================|
| 0 Tesla K80 Off | 00000000:8A:00.0 Off | 0 |
| N/A 34C P8 26W / 149W | 0MiB / 11441MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
| 1 Tesla K80 Off | 00000000:8B:00.0 Off | 0 |
| N/A 30C P8 31W / 149W | 0MiB / 11441MiB | 0% E. Process |
| | | N/A |
+-------------------------------+----------------------+----------------------+
[user@g03 ~]$ echo $CUDA_VISIBLE_DEVICES
0,1
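CUDA_VISIBLE_DEVICES is how Slurm scopes the job to its assigned GPUs: CUDA-aware applications enumerate only the devices listed there. A small sketch of reading the variable from a job script; the value is hard-coded here to mirror the session above, since the variable is only set inside a GPU job:

```shell
# CUDA_VISIBLE_DEVICES holds a comma-separated list of GPU indices
# assigned to the job (hard-coded here to match the session above):
CUDA_VISIBLE_DEVICES="0,1"

# Count the assigned GPUs by splitting the list on commas:
NGPUS=$(echo "$CUDA_VISIBLE_DEVICES" | tr ',' '\n' | wc -l)
echo "$NGPUS"   # 2
```

When the interactive session is finished, exit the shell to end the job and release the GPUs back to the scheduler.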