Other types of jobs
Instructor note
Total: 75min (Teaching:45Min | Discussion:0min | Breaks:0min | Exercises:30Min)
Objectives
Interactive jobs for testing
Up to this point, we’ve focused on running jobs in batch mode. SLURM also provides the ability to start an interactive session.
Some tasks are best done interactively. Writing an entire job script for them might be overkill, yet they need more resources than a login node can handle. A good example is building a genome index for alignment with a tool like HISAT2. Fortunately, we can run these types of tasks as a one-off with srun or salloc.
Interactive jobs
Instead of running on a login node, you can ask the queue system to allocate compute resources for you. Once they are assigned, you can run commands interactively for as long as you requested. An example follows.
Warning
Interactive jobs require that you maintain a stable, uninterrupted connection to the cluster.
Therefore we use a terminal multiplexer like tmux or screen.
salloc --account=nn9997k --time=1:1:00 --nodes=1 --ntasks=2 --mem=17G
[MY_USER_NAME@login-4 ~]$ salloc --ntasks=1 --mem-per-cpu=4G --time=00:30:00 --account=ec34
salloc: Pending job allocation 39544
salloc: job 39544 queued and waiting for resources
salloc: job 39544 has been allocated resources
salloc: Granted job allocation 39544
salloc: Waiting for resource configuration
salloc: Nodes c1-28 are ready for job
bash-4.4$ hostname
c1-28
# To end the interactive session, use the command exit
bash-4.4$ exit
salloc: Relinquishing job allocation 39544
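A quick way to confirm that a shell is running inside an allocation, rather than on the login node, is to look at the job ID that Slurm exports. The sketch below assumes that SLURM_JOB_ID is set in the shell started by salloc, as it is on a standard Slurm setup. The function name in_job is made up for this example.

```shell
# Report whether the current shell is inside a Slurm allocation.
# in_job is a hypothetical helper name, not part of Slurm.
in_job() {
  if [ -n "${SLURM_JOB_ID:-}" ]; then
    echo "inside job ${SLURM_JOB_ID} on $(hostname)"
  else
    echo "not inside a job, running on $(hostname)"
  fi
}

in_job
```

Running it once before salloc and once after shows the change of host and the job ID in one line each.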
Find out available resources on a compute node
Use an interactive job to access a compute node and find out the number of cores and the amount of memory on it. How much of it is at your disposal?
Solution
nproc --all
free -h
echo $SLURM_MEM_PER_NODE
echo $SLURM_NTASKS
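The solution commands can be collected into one small report that contrasts what the node has with what the job was given. This is a sketch under two assumptions: that nproc without --all counts only the cores the current shell is allowed to use, and that Slurm sets SLURM_MEM_PER_CPU instead of SLURM_MEM_PER_NODE when the job was requested with --mem-per-cpu (as in the salloc example above). The function name report_resources is hypothetical.

```shell
# Summarise what the node has versus what the job was given.
# report_resources is a hypothetical helper name, not part of Slurm.
report_resources() {
  echo "cores installed on node: $(nproc --all)"
  echo "cores usable by this shell: $(nproc)"
  # Variables that Slurm did not set are reported as "not set".
  echo "SLURM_NTASKS: ${SLURM_NTASKS:-not set}"
  echo "SLURM_MEM_PER_NODE: ${SLURM_MEM_PER_NODE:-not set}"
  echo "SLURM_MEM_PER_CPU: ${SLURM_MEM_PER_CPU:-not set}"
}

report_resources
```

If one of the memory variables prints "not set", the job was most likely requested with the other memory option.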
Keeping interactive jobs alive
Interactive jobs stop when you disconnect from the login node, either by choice or because of internet connection problems. To keep a job alive you can use a terminal multiplexer like tmux.
tmux allows you to run processes as usual in your standard bash shell. You start tmux on the login node before you get an interactive Slurm session with srun or salloc, and then do all the work inside it. In case of a disconnect, you simply reconnect to the login node and attach to the tmux session again by typing:
$ tmux attach
Or, in case you have multiple sessions running:
$ tmux list-sessions
$ tmux attach -t SESSION_NUMBER
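Giving the session a name makes it easier to find again than a session number. The sketch below wraps the attach-or-create decision in a small function. It uses the standard tmux subcommands has-session, attach-session and new-session; the function name tmux_work and the default session name slurm-work are hypothetical choices for this example.

```shell
# Attach to a named tmux session, creating it first if it does not exist.
# tmux_work and the default name "slurm-work" are hypothetical.
tmux_work() {
  session="${1:-slurm-work}"
  if tmux has-session -t "$session" 2>/dev/null; then
    # The session already exists (e.g. after a dropped connection): reattach.
    tmux attach-session -t "$session"
  else
    # First use: start a new session under this name.
    tmux new-session -s "$session"
  fi
}
```

Calling tmux_work on the login node and then requesting the interactive job from inside the session means the same call brings you back to the job after a reconnect.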
As long as the tmux session is not closed or terminated (e.g. by a server restart), your session should continue. One problem with our systems is that the tmux session is bound to the particular login server you were connected to. So if you start a tmux session on login-1 on SAGA and next time you are randomly connected to login-2, you first have to connect to login-1 again with:
$ ssh login-1
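Remembering which login server holds the session can be left to a small file. The sketch below is one possible approach, not a feature of tmux or Slurm: it assumes your home directory is shared between all login servers, and the file name .tmux_login_node and both function names are hypothetical.

```shell
# File that records which login node hosts the tmux session.
# The path is a hypothetical choice; it assumes $HOME is shared by all login nodes.
NODE_FILE="${NODE_FILE:-$HOME/.tmux_login_node}"

# Call this right after starting tmux on a login node.
remember_login_node() {
  hostname > "$NODE_FILE"
}

# Call this after reconnecting to see where the tmux session lives.
find_tmux_node() {
  if [ ! -s "$NODE_FILE" ]; then
    echo "no login node recorded"
    return 1
  fi
  saved="$(cat "$NODE_FILE")"
  if [ "$saved" = "$(hostname)" ]; then
    echo "tmux session is on this node: run tmux attach"
  else
    echo "tmux session is on another node: run ssh $saved"
  fi
}
```

The functions only print a suggestion, so nothing is attached or connected without you typing the command yourself.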
To detach from a tmux session without closing it, press Ctrl-B (that is, the Ctrl key and “b” simultaneously, which is the standard tmux prefix) and then “d” (without the quotation marks). To close a session, just close the bash session with either Ctrl-D or by typing exit. You can get a list of all tmux commands with Ctrl-B followed by ? (question mark). See also this page for a short tutorial of tmux. Otherwise, working inside a tmux session is almost the same as working in a normal bash session.