Accessing software
Instructor note
Total: 45 min (Teaching: 30 min | Discussion: 0 min | Breaks: 0 min | Exercises: 15 min)
Objectives
Questions
How can we find out which scientific software is installed on the HPC cluster?
How can we access scientific software on the HPC cluster?
Objectives
Understand how the UNIX system looks for installed software
Understand how to load and use a software package
Keypoints
Search for software with module avail
Load software with module load
Unload all software with module purge (or a single module with module unload)
The module system handles software versioning and will prevent package conflicts for you automatically
On a high-performance computing system, it is seldom the case that the software we want to use is available when we log in. It is installed, but we will need to “load” it before it can run.
Before we start using individual software packages, however, we should understand the reasoning behind this approach. The three biggest factors are:
software incompatibilities
versioning
dependencies
Software incompatibility is a major headache for programmers. Sometimes the presence (or absence) of a software package will break others that depend on it. Two of the most famous examples are Python 2 and 3 and C compiler versions. Python 3 famously provides a python command that conflicts with the one provided by Python 2. Software compiled against a newer version of the compiler runtime libraries (such as the C++ standard library) and then run where that version is not present will fail with a nasty 'GLIBCXX_3.4.20' not found error, for instance.
Software versioning is another common issue. A team might depend on a certain package version for their research project - if the software version were to change (for instance, if a package was updated), it might affect their results. Having access to multiple software versions allows a set of researchers to prevent software versioning issues from affecting their results.
Dependencies are where a particular software package (or even a particular version) depends on having access to another software package (or even a particular version of another software package). For example, the VASP materials science software may depend on having a particular version of the FFTW (the Fastest Fourier Transform in the West) software library available for it to work.
Environment modules are the solution to these problems, and we will return to this after looking at globally installed packages.
Globally installed system packages
In this example we will use Python, which is installed globally on the login node in one particular version.
We can test what the python command is actually pointing to with another command called which. The which command looks for programs the same way that Bash does, so we can use it to tell us where a particular piece of software is stored.
[MY_USER_NAME@CLUSTER_NAME ~]$ python --version
Python 3.9.14
[MY_USER_NAME@CLUSTER_NAME ~]$ which python
/usr/bin/python
[MY_USER_NAME@CLUSTER_NAME ~]$ python3 --version
Python 3.9.14
[MY_USER_NAME@CLUSTER_NAME ~]$ which python3
/usr/bin/python3
What this tells us is that python and python3 give us the same Python version. What the output of which tells us is that typing the command python3 is equivalent to running the full command /usr/bin/python3.
But how did the shell know that python should be linked to /usr/bin/python?
To explain this, we first need to understand the nature of the PATH environment variable. PATH is a special environment variable that controls where a UNIX system looks for software. We can inspect its value with the following command (PATH is the variable, $ extracts its value, and echo prints the value):
[MY_USER_NAME@CLUSTER_NAME ~]$ echo $PATH
/node/bin:/usr/local/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/cluster/bin:/cluster/home/MY_USER_NAME/.local/bin:/cluster/home/MY_USER_NAME/bin
What we see here is a colon-separated (:) list of search paths that the shell loops through when looking for the python command. In this case it finds a match under /usr/bin, so it then exits the search and replaces python with /usr/bin/python.
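The search described above can be imitated with a few lines of shell. This is only a sketch of the idea, not how Bash implements it internally; the function name find_first_in_path is made up for this illustration.

```shell
# find_first_in_path NAME [SEARCH_PATH]
# Walk a colon-separated search path from left to right and print the first
# entry that is an executable file called NAME. The body runs in a subshell
# (note the parentheses), so changing IFS and globbing does not leak out.
find_first_in_path() (
    name=$1
    search_path=${2:-$PATH}
    IFS=:      # split the search path on colons
    set -f     # do not expand wildcard characters while splitting
    for dir in $search_path; do
        # An empty entry in the search path stands for the current directory.
        [ -n "$dir" ] || dir=.
        if [ -f "$dir/$name" ] && [ -x "$dir/$name" ]; then
            printf '%s\n' "$dir/$name"
            exit 0     # stop at the first match
        fi
    done
    exit 1             # nothing found
)

# Example: where would "ls" be picked up from?
find_first_in_path ls
```

On most systems the result should agree with what which ls reports, because both follow the same left-to-right order.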
Exercise
Exercise (10 min)
1. What happens if there are other matching commands located later in the search PATH, e.g. /cluster/bin/python?
2. What happens if you have an executable script in your current directory with the same name as a globally installed program?
Solution
1. If there are other matching commands later in the search path, they will be shadowed by the first command found. The shell stops searching as soon as it has found the command in a directory.
2. The current directory is only searched if it is listed in PATH. If it comes first in the search path, the script in it will be executed. If, on the other hand, the directory with the globally installed program comes first, that program will be executed. To execute a command named python in your current directory regardless of PATH, do:
$ ./python
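The shadowing behaviour can be tried safely in a throwaway directory. The sketch below uses made-up names (a script called hello in directories first and second) and changes PATH only inside a subshell, so your login environment is not affected.

```shell
# Build two temporary directories that each contain a script called "hello".
demo=$(mktemp -d)
mkdir "$demo/first" "$demo/second"
printf '#!/bin/sh\necho "hello from first"\n'  > "$demo/first/hello"
printf '#!/bin/sh\necho "hello from second"\n' > "$demo/second/hello"
chmod +x "$demo/first/hello" "$demo/second/hello"

# Run "hello" with two different search orders. Each $( ... ) is a subshell,
# so the modified PATH disappears again when the command has finished.
first_wins=$(PATH="$demo/first:$demo/second:$PATH"; hello)
second_wins=$(PATH="$demo/second:$demo/first:$PATH"; hello)

echo "$first_wins"     # prints: hello from first
echo "$second_wins"    # prints: hello from second

rm -rf "$demo"
```

Both scripts exist in both runs; only the order of the directories in PATH decides which one is executed.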
Environment modules
A module is a self-contained description of a software package - it contains the settings required to run a software package and, usually, encodes required dependencies on other software packages.
There are a number of different environment module implementations commonly used on HPC systems: the two most common are TCL modules and Lmod. Both of these use similar syntax and the concepts are the same so learning to use one will allow you to use whichever is installed on the system you are using. In both implementations the module command is used to interact with environment modules. An additional subcommand is usually added to the command to specify what you want to do. For a list of subcommands you can use module -h or module help. As for all commands, you can access the full help on the man pages with man module.
On login, you may start out with a default set of modules loaded, or you may start out with an empty environment; this depends on the setup of the system you are using.
Listing currently loaded modules
You can use the module list command to see which modules you currently have loaded in your environment. After logging into one of our systems, your environment should ideally be clean like this:
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module list
Currently Loaded Modules:
1) StdEnv (S)
Where:
S: Module is Sticky, requires --force to unload or purge
You can see that one module is loaded which has the special attribute of being sticky (S). That means that it is not usually unloaded, typically because it is important for the system to function correctly (so removing it with --force is advised against).
Finding and listing available modules
One way to look for available software is to search for keywords using module keyword <KEYWORD>. This will look through the module metadata and return anything that matches. For example, let’s list bioinformatics programs that can be loaded using modules with module keyword bio:
[MY_USER_NAME@CLUSTER_NAME ~]$ module keyword bio
---------------------------------------------------------------------------------------------------
The following modules match your search criteria: "bio"
---------------------------------------------------------------------------------------------------
ABySS: ABySS/2.3.7-foss-2023a
Assembly By Short Sequences - a de novo, parallel, paired-end sequence assembler
AUGUSTUS: AUGUSTUS/3.4.0-foss-2021a, AUGUSTUS/3.4.0-foss-2021b, ...
AUGUSTUS is a program that predicts genes in eukaryotic genomic sequences
BBMap: BBMap/38.96-GCC-10.3.0, BBMap/38.98-GCC-11.2.0, ...
BBMap short read aligner, and other bioinformatic tools.
bcbio-gff: bcbio-gff/0.6.7-foss-2021a, bcbio-gff/0.7.0-foss-2022a, ...
Read and write Generic Feature Format (GFF) with Biopython integration.
Bio-DB-HTS: Bio-DB-HTS/3.01-GCC-11.2.0, Bio-DB-HTS/3.01-GCC-11.3.0, ...
Read files using HTSlib including BAM/CRAM, Tabix and BCF database
files
[removed most of the output here for clarity]
Another option is to search directly on the module name using the module avail command. If you run this command without any search string it will produce a long list of all the installed software modules, like this:
[MY_USER_NAME@CLUSTER_NAME ~]$ module avail
--------------------------- /cluster/modulefiles/all ---------------------------
prodigal/2.6.3-GCCcore-10.3.0
prodigal/2.6.3-GCCcore-11.2.0
prodigal/2.6.3-GCCcore-11.3.0
prodigal/2.6.3-GCCcore-12.2.0
prodigal/2.6.3-GCCcore-12.3.0
PROJ/8.0.1-GCCcore-10.3.0
PROJ/8.1.0-GCCcore-11.2.0
PROJ/9.0.0-GCCcore-11.3.0
PROJ/9.1.1-GCCcore-12.2.0
PROJ/9.2.0-GCCcore-12.3.0
PROJ/9.3.1-GCCcore-13.2.0
prokka/1.14.5-gompi-2021a
prokka/1.14.5-gompi-2021b
prokka/1.14.5-gompi-2022a
prokka/1.14.5-gompi-2022b
PuLP/2.7.0-foss-2022b
PuLP/2.8.0-foss-2023a
PyBioLib/1.2.205-GCCcore-12.3.0
PyQt5/5.15.7-GCCcore-12.2.0
Pysam/0.16.0.1-GCC-10.3.0
Pysam/0.18.0-GCC-11.2.0
Pysam/0.19.1-GCC-11.3.0
Pysam/0.21.0-GCC-12.2.0
Pysam/0.22.0-GCC-12.3.0
Pysam/0.22.0-GCC-13.2.0
Python-bundle-PyPI/2023.06-GCCcore-12.3.0
Python-bundle-PyPI/2023.10-GCCcore-13.2.0
Python-bundle-PyPI/2024.06-GCCcore-13.3.0
Python/3.9.5-GCCcore-10.3.0
Python/3.9.6-GCCcore-11.2.0
Python/3.10.4-GCCcore-11.3.0
Python/3.10.8-GCCcore-12.2.0
Python/3.11.3-GCCcore-12.3.0
Python/3.11.5-GCCcore-13.2.0
Python/3.12.3-GCCcore-13.3.0
PyTorch/1.12.0-foss-2022a-CUDA-11.7.0
PyTorch/1.12.1-foss-2022a-CUDA-11.7.0
QIIME2/2022.11
Qualimap/2.2.1-foss-2021b-R-4.1.2
Qualimap/2.3-foss-2022b-R-4.2.2
QuantumESPRESSO/6.8-foss-2021a
QuantumESPRESSO/6.8-intel-2021a
QuantumESPRESSO/7.0-foss-2021b
QuantumESPRESSO/7.1-foss-2022a
QuantumESPRESSO/7.1-intel-2022a
QuantumESPRESSO/7.2-foss-2022b
QuantumESPRESSO/7.3-foss-2023a
[removed most of the output here for clarity]
-------------------------- /cluster/modulefiles/Core ---------------------------
BuildEnv EESSI/2023.06 StdEnv (S,L) VASPModules
BuildZen2 NESSI/2023.06 VaspExtra Zen2Env (S)
------------------------ /cluster/modulefiles/external -------------------------
appusage/1.0 hpcx/2.16 hpcx/2.18
hpcx/2.14 hpcx/2.17.1 hpcx/2.21
Where:
Aliases: Aliases exist: foo/1.2.3 (1.2) means that "module load foo/1.2" will load foo/1.2.3
L: Module is loaded
S: Module is Sticky, requires --force to unload or purge
If the avail list is too long consider trying:
"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching
any of the "keys".
You can refine the search by adding a search string to the command, like module avail <SOFTWARE>. In contrast to the module keyword search, this string is only matched against the module name, not against any metadata. For example, we can list all modules that match the string ‘python/’ (including the ‘/’):
[MY_USER_NAME@CLUSTER_NAME ~]$ module avail python/
--------------------------- /cluster/modulefiles/all ---------------------------
Biopython/1.79-foss-2021a netcdf4-python/1.5.7-foss-2021a
Biopython/1.79-foss-2021b netcdf4-python/1.5.7-foss-2021b
Biopython/1.79-foss-2022a netcdf4-python/1.6.1-foss-2022a
Biopython/1.81-foss-2022b netcdf4-python/1.6.3-foss-2022b
Biopython/1.83-foss-2023a netcdf4-python/1.6.4-foss-2023a
Biopython/1.84-foss-2023b netcdf4-python/1.6.5-foss-2023b
Boost.Python/1.76.0-GCC-10.3.0 Python/3.9.5-GCCcore-10.3.0
Boost.Python/1.79.0-GCC-11.3.0 Python/3.9.6-GCCcore-11.2.0
Boost.Python/1.82.0-GCC-12.3.0 Python/3.10.4-GCCcore-11.3.0
bx-python/0.8.11-foss-2021a Python/3.10.8-GCCcore-12.2.0
bx-python/0.8.13-foss-2021b Python/3.11.3-GCCcore-12.3.0
bx-python/0.9.0-foss-2022a Python/3.11.5-GCCcore-13.2.0
bx-python/0.10.0-foss-2023a Python/3.12.3-GCCcore-13.3.0
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
Loading and unloading software
Any of the software modules that we found in the previous section can be loaded into our environment using the module load command. Let’s say we are not happy with the system version of Python that we get when logging in to the cluster (see “Globally installed system packages” above). We can then instead load a module for the Python version that we want:
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module load Python/3.12.3-GCCcore-13.3.0
[MY_USER_NAME@CLUSTER_NAME ~ ]$ which python
/cluster/software/Python/3.12.3-GCCcore-13.3.0/bin/python
[MY_USER_NAME@CLUSTER_NAME ~ ]$ python --version
Python 3.12.3
So, what just happened? Let’s have a look at the PATH variable again:
[MY_USER_NAME@CLUSTER_NAME ~ ]$ echo $PATH
/cluster/software/Python/3.12.3-GCCcore-13.3.0/bin:
/cluster/software/OpenSSL/3/bin:
....
/cluster/bin:
/cluster/home/MY_USER_NAME/.local/bin:
/cluster/home/MY_USER_NAME/bin
You’ll notice that the output is much longer than it was before we loaded the Python module, and if you look closely you’ll see that the last entries of the output are identical to what we had before. This means that by loading the module, we changed the PATH by adding entries to the beginning of the list. The shell will therefore now look in /cluster/software/Python/3.12.3-GCCcore-13.3.0/bin and the other new locations before moving on to the “system” paths such as /usr/bin.
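The effect on PATH can be imitated by hand, which may help to demystify what the module command does. This is a simplified sketch: a real module typically sets several other environment variables as well, and the install prefix below is only used as a string (the directory does not need to exist for the demonstration).

```shell
# Install prefix taken from the example above; treated purely as text here.
prefix=/cluster/software/Python/3.12.3-GCCcore-13.3.0
original_path=$PATH

# "Loading": put the package's bin directory in front of everything else.
PATH="$prefix/bin:$PATH"
first_entry=${PATH%%:*}     # everything before the first colon
echo "$first_entry"         # prints: /cluster/software/Python/3.12.3-GCCcore-13.3.0/bin

# "Unloading": restore the search path we started with.
PATH=$original_path
```

Because the new directory is placed at the front, any python found there shadows /usr/bin/python for as long as the entry remains in PATH.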
Let’s examine what’s in the bin directory of the Python module:
[MY_USER_NAME@CLUSTER_NAME ~ ]$ ls -lh /cluster/software/Python/3.12.3-GCCcore-13.3.0/bin
....
lrwxrwxr-x 1 vegarde sysapp 10 Sep 18 2024 python -> python3.12
lrwxrwxr-x 1 vegarde sysapp 10 Sep 18 2024 python3 -> python3.12
-rwxrwxr-x 1 vegarde sysapp 29K Sep 18 2024 python3.12
-rwxrwxr-x 1 vegarde sysapp 3.1K Sep 18 2024 python3.12-config
....
Taking this to its conclusion, module load will add software to your $PATH. It “loads” software. A special note on this: depending on which version of the module program is installed at your site, module load will also load required software dependencies.
To demonstrate, let’s use module list, which shows all loaded software modules.
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module list
Currently Loaded Modules:
1) StdEnv (S) 8) Tcl/8.6.14-GCCcore-13.3.0 (H)
2) GCCcore/13.3.0 9) SQLite/3.45.3-GCCcore-13.3.0 (H)
3) zlib/1.3.1-GCCcore-13.3.0 (H) 10) XZ/5.4.5-GCCcore-13.3.0 (H)
4) binutils/2.42-GCCcore-13.3.0 (H) 11) libffi/3.4.5-GCCcore-13.3.0 (H)
5) bzip2/1.0.8-GCCcore-13.3.0 (H) 12) OpenSSL/3 (H)
6) ncurses/6.5-GCCcore-13.3.0 (H) 13) Python/3.12.3-GCCcore-13.3.0
7) libreadline/8.2-GCCcore-13.3.0 (H)
Where:
H: Hidden Module
S: Module is Sticky, requires --force to unload or purge
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module purge
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module list
Currently Loaded Modules:
1) StdEnv (S)
Where:
S: Module is Sticky, requires --force to unload or purge
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module load BLAST+/2.14.1-gompi-2023a
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module list
Currently Loaded Modules:
1) StdEnv (S)
2) GCCcore/12.3.0
3) zlib/1.2.13-GCCcore-12.3.0 (H)
4) binutils/2.40-GCCcore-12.3.0 (H)
5) GCC/12.3.0
6) numactl/2.0.16-GCCcore-12.3.0 (H)
7) XZ/5.4.2-GCCcore-12.3.0 (H)
8) libxml2/2.11.4-GCCcore-12.3.0 (H)
9) libpciaccess/0.17-GCCcore-12.3.0 (H)
10) hwloc/2.9.1-GCCcore-12.3.0 (H)
11) hpcx/2.16
12) OpenMPI/4.1.5-GCC-12.3.0
13) gompi/2023a
14) bzip2/1.0.8-GCCcore-12.3.0 (H)
15) PCRE/8.45-GCCcore-12.3.0 (H)
16) gzip/1.12-GCCcore-12.3.0 (H)
17) lz4/1.9.4-GCCcore-12.3.0 (H)
18) zstd/1.5.5-GCCcore-12.3.0 (H)
19) ICU/73.2-GCCcore-12.3.0 (H)
20) Boost/1.82.0-GCC-12.3.0
21) GMP/6.2.1-GCCcore-12.3.0 (H)
22) libpng/1.6.39-GCCcore-12.3.0 (H)
23) NASM/2.16.01-GCCcore-12.3.0 (H)
24) libjpeg-turbo/2.1.5.1-GCCcore-12.3.0 (H)
25) LMDB/0.9.31-GCCcore-12.3.0 (H)
26) BLAST+/2.14.1-gompi-2023a
Where:
H: Hidden Module
S: Module is Sticky, requires --force to unload or purge
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module unload BLAST+/2.14.1-gompi-2023a
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module list
Currently Loaded Modules:
1) StdEnv (S)
2) GCCcore/12.3.0
3) zlib/1.2.13-GCCcore-12.3.0 (H)
4) binutils/2.40-GCCcore-12.3.0 (H)
5) GCC/12.3.0
6) numactl/2.0.16-GCCcore-12.3.0 (H)
7) XZ/5.4.2-GCCcore-12.3.0 (H)
8) libxml2/2.11.4-GCCcore-12.3.0 (H)
9) libpciaccess/0.17-GCCcore-12.3.0 (H)
10) hwloc/2.9.1-GCCcore-12.3.0 (H)
11) hpcx/2.16
12) OpenMPI/4.1.5-GCC-12.3.0
13) gompi/2023a
14) bzip2/1.0.8-GCCcore-12.3.0 (H)
15) PCRE/8.45-GCCcore-12.3.0 (H)
16) gzip/1.12-GCCcore-12.3.0 (H)
17) lz4/1.9.4-GCCcore-12.3.0 (H)
18) zstd/1.5.5-GCCcore-12.3.0 (H)
19) ICU/73.2-GCCcore-12.3.0 (H)
20) Boost/1.82.0-GCC-12.3.0
21) GMP/6.2.1-GCCcore-12.3.0 (H)
22) libpng/1.6.39-GCCcore-12.3.0 (H)
23) NASM/2.16.01-GCCcore-12.3.0 (H)
24) libjpeg-turbo/2.1.5.1-GCCcore-12.3.0 (H)
25) LMDB/0.9.31-GCCcore-12.3.0 (H)
So using `module unload` unloads a module, but not its dependencies.
If we wanted to unload everything at once, we could run `module purge`.
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module purge
The following modules were not unloaded:
(Use "module --force purge" to unload all):
1) StdEnv
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module list
Currently Loaded Modules:
1) StdEnv (S)
Where:
S: Module is Sticky, requires --force to unload or purge
Note that module purge is informative. It lets us know that all but a default set of packages have been unloaded (and how to actually unload these if we truly so desired).
Software versioning & toolchains
So far, we’ve learned how to load and unload software packages. This is very useful. However, we have not yet addressed the issue of software versioning. At some point or other, you will run into issues where only one particular version of some software will be suitable. Perhaps a key bugfix only happened in a certain version, or version X broke compatibility with a file format you use. In either of these example cases, it helps to be very specific about what software is loaded.
Let’s examine the output of module avail <SOFTWARE> more closely:
[MY_USER_NAME@CLUSTER_NAME ~ ]$ module avail Python/
--------------------------- /cluster/modulefiles/all ---------------------------
Biopython/1.79-foss-2021a netcdf4-python/1.5.7-foss-2021a
Biopython/1.79-foss-2021b netcdf4-python/1.5.7-foss-2021b
Biopython/1.79-foss-2022a netcdf4-python/1.6.1-foss-2022a
Biopython/1.81-foss-2022b netcdf4-python/1.6.3-foss-2022b
Biopython/1.83-foss-2023a netcdf4-python/1.6.4-foss-2023a
Biopython/1.84-foss-2023b netcdf4-python/1.6.5-foss-2023b
Boost.Python/1.76.0-GCC-10.3.0 Python/3.9.5-GCCcore-10.3.0
Boost.Python/1.79.0-GCC-11.3.0 Python/3.9.6-GCCcore-11.2.0
Boost.Python/1.82.0-GCC-12.3.0 Python/3.10.4-GCCcore-11.3.0
bx-python/0.8.11-foss-2021a Python/3.10.8-GCCcore-12.2.0
bx-python/0.8.13-foss-2021b Python/3.11.3-GCCcore-12.3.0
bx-python/0.9.0-foss-2022a Python/3.11.5-GCCcore-13.2.0
bx-python/0.10.0-foss-2023a Python/3.12.3-GCCcore-13.3.0
Use "module spider" to find all possible modules.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
You can see that module avail Python/ lists seven versions of ‘Python’ (alongside other packages whose names contain the search string), with the version number being the first part after the /. The GCCcore-* suffix describes the toolchain with which ‘Python’ was compiled and its version. So the different ‘Python’ versions are compiled with toolchains from GCCcore 10.3.0 to 13.3.0.
Toolchains are standardized bundles used for installing modules. They usually consist of a compiler, math libraries and an MPI implementation. The most common toolchains are GCCcore, intel and foss. It is important to know that modules created with different toolchains are often incompatible. If you try to load two modules that are based on different toolchains, you will get an error message from the module load command. This means that you should always try to find modules with matching toolchains whenever you need to load more than one application.
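When comparing many module names it can help to split them into their parts. The sketch below assumes the NAME/VERSION-TOOLCHAIN naming pattern seen in the listings above; the function name split_module is made up for this illustration, and names that do not follow the pattern (for example ones without a "/") are not handled.

```shell
# split_module NAME/VERSION-TOOLCHAIN
# Print the name, version and toolchain of a module as three words.
# Modules without a toolchain suffix get "none" as the third word.
split_module() {
    local full name rest version toolchain
    full="$1"
    name=${full%%/*}      # text before the first "/"
    rest=${full#*/}       # text after the first "/"
    case $rest in
        *-*)
            version=${rest%%-*}     # up to the first "-"
            toolchain=${rest#*-}    # everything after the first "-"
            ;;
        *)
            version=$rest
            toolchain=none
            ;;
    esac
    printf '%s %s %s\n' "$name" "$version" "$toolchain"
}

split_module Python/3.12.3-GCCcore-13.3.0    # prints: Python 3.12.3 GCCcore-13.3.0
split_module R/4.1.2-foss-2021b              # prints: R 4.1.2 foss-2021b
split_module QIIME2/2022.11                  # prints: QIIME2 2022.11 none
```

Comparing the toolchain strings alone is not a full compatibility check: a bundle such as foss is itself built on a particular GCCcore version, so two modules with different-looking suffixes may still belong together. The module load command remains the authority on what can be combined.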
Using software modules in scripts
Here we create a job script that loads a particular version of Python, and prints the version number to the Slurm output file.
[MY_USER_NAME@CLUSTER_NAME ~ ]$ nano python-module.sh
[MY_USER_NAME@CLUSTER_NAME ~ ]$ cat python-module.sh
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=00:01:00
#SBATCH --account=<PROJECT_NAME>
#SBATCH --mem=1G
#SBATCH --job-name=Python_module_test
module purge
module load Python/3.12.3-GCCcore-13.3.0
python --version
[MY_USER_NAME@CLUSTER_NAME ~ ]$ sbatch python-module.sh
For full reproducibility it is always good practice to start your job script by purging any existing modules which you might have loaded when you submit the job script. You can then explicitly load all the dependencies for the current job, which makes it much more robust for future execution.
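The purge-then-load pattern could also be wrapped in a small helper that stops early when something goes wrong and records what ended up loaded. This is a sketch under assumptions: the function name load_clean is made up, and it assumes that the module command at your site returns a non-zero exit status when a load fails and writes its listing to standard error, which is common but worth checking locally.

```shell
# load_clean MODULE [MODULE ...]
# Reset the environment, load the requested modules in order, and print the
# resulting module list so that it ends up in the job's output file.
load_clean() {
    local m
    # Refuse to continue where the module command is not available.
    if ! command -v module >/dev/null 2>&1; then
        echo "module command not found" >&2
        return 1
    fi
    module purge || return 1
    for m in "$@"; do
        module load "$m" || return 1
    done
    # Module tools often write to standard error; merge it into standard output.
    module list 2>&1
}
```

In a job script this could be used as load_clean Python/3.12.3-GCCcore-13.3.0 || exit 1, placed before the first command that depends on the loaded software.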
Exercise
Exercise (15 min)
This exercise can be performed directly on the login node. Before you start, run the command module purge to make sure your environment is clean. Verify that StdEnv is the only loaded module when running module list.
1. How many programs (not counting versions) are there related to the keyword ‘chemistry’?
2. Find a module for R version 4.1.2 using module avail (R is a popular software environment for statistical computing). Load this module and verify that you get a working R command in your terminal, e.g. using which R or R --version.
3. How many other software packages were loaded alongside the requested R module?
Bonus: Find a suitable version of Ruby to load alongside the R module that you already have. Hint: Here we do not care about which version of Ruby we are loading, but it needs to be compatible with the modules we have already loaded (the GCCcore versions need to be the same).
Solution
1. This depends on the cluster; check with
$ module keyword chemistry
which at the time of writing found 6 packages on Saga: ADF, MRChem, NWChem, OpenBabel, OpenMolcas, and ORCA.
2. We can search for modules using module avail, and we can restrict the search by being more specific about the version (here, major version 4):
$ module avail R/4
--------------------------- /cluster/modulefiles/all ---------------------------
MUMmer/4.0.0rc1-GCCcore-10.3.0 R/4.3.2-gfbf-2023a
R/4.1.2-foss-2021b R/4.4.1-gfbf-2023b
R/4.2.1-foss-2022a R/4.4.2-gfbf-2024a
R/4.2.2-foss-2022b RepeatMasker/4.1.4-foss-2022a
If the avail list is too long consider trying:
"module --default avail" or "ml -d av" to just list the default modules.
"module overview" or "ml ov" to display the number of modules for each name.
Use "module spider" to find all possible modules and extensions.
Use "module keyword key1 key2 ..." to search for all possible modules matching any of the "keys".
We see there is only one module matching version 4.1.2, so we load this one:
$ module load R/4.1.2-foss-2021b
Finally, we verify that we have the correct version available on the command line:
$ which R
/cluster/software/R/4.1.2-foss-2021b/bin/R
$ R --version
R version 4.1.2 (2021-11-01) -- "Bird Hippie"
Copyright (C) 2021 The R Foundation for Statistical Computing
Platform: x86_64-pc-linux-gnu (64-bit)
R is free software and comes with ABSOLUTELY NO WARRANTY.
You are welcome to redistribute it under the terms of the
GNU General Public License versions 2 or 3.
For more information about these matters see
https://www.gnu.org/licenses/.
3. Check the number of loaded modules with
$ module list
...
[removed long output]
...
79) PROJ/8.1.0-GCCcore-11.2.0
80) libgeotiff/1.7.0-GCCcore-11.2.0 (H)
81) pybind11/2.7.1-GCCcore-11.2.0 (H)
82) SciPy-bundle/2021.10-foss-2021b
83) libtirpc/1.3.2-GCCcore-11.2.0 (H)
84) HDF/4.2.15-GCCcore-11.2.0 (H)
85) GDAL/3.3.2-foss-2021b
86) MPFR/4.1.0-GCCcore-11.2.0 (H)
87) libgit2/1.1.1-GCCcore-11.2.0 (H)
88) R/4.1.2-foss-2021b
which in this case outputs 88 different modules. So in addition to the original StdEnv and the module we actively loaded (R/4.1.2-foss-2021b), we got many other software packages loaded at the same time.
Bonus: When we look at the output from the module list command above, we see that most of the loaded modules contain the GCCcore-11.2.0 suffix. This means that they were all compiled using the same “core” compiler, and thus should be fully compatible. If we want to load another (seemingly independent) module at the same time, we need to make sure that it is compatible with this core compiler. Searching for Ruby gives:
$ module avail ruby
--------------------------- /cluster/modulefiles/all ---------------------------
Ruby/3.0.1-GCCcore-10.3.0 Ruby/3.2.2-GCCcore-12.2.0
Ruby/3.0.1-GCCcore-11.2.0 Ruby/3.3.0-GCCcore-12.3.0
Ruby/3.0.5-GCCcore-11.3.0
where we see that only one has a GCCcore version compatible with our current R, so this one can be loaded without any problems:
$ module load Ruby/3.0.1-GCCcore-11.2.0
$ ruby --version
ruby 3.0.1p64 (2021-04-05 revision 0fb782ee38) [x86_64-linux]
Warning
If you try to load any of the other versions of Ruby, you will get an error message telling you that the site does not allow “automatic swapping of module with the same name”. You can still manually do such swapping of modules, as explained in the same error message, but it is not recommended, as it can lead to weird runtime errors that are hard to debug.
EESSI (European Environment for Scientific Software Installations)
EESSI is an innovative service to make optimized scientific software installations available on any machine anywhere in the world in near real-time - without the need to build or install the software. EESSI operates in a manner comparable to modern streaming platforms for music or video content. Just as these services allow users to instantly access media without downloading entire files, EESSI enables users to seamlessly access precompiled software environments that are optimized for a variety of system architectures.
The single command to get access to EESSI (on NRIS operated systems) is
module load EESSI/2023.06
Using software modules in scripts with EESSI
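Here we create a job script that loads a particular version of Python using EESSI, and prints the version number to the Slurm output file. The script is very similar to the previous example which used local modules. Only one new line is added (module load EESSI/2023.06); note that the Python module loaded afterwards is also a different version from the one in the previous example.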
[MY_USER_NAME@CLUSTER_NAME ~ ]$ nano python-module.sh
[MY_USER_NAME@CLUSTER_NAME ~ ]$ cat python-module.sh
#!/bin/bash
#SBATCH --nodes=1
#SBATCH --time=00:01:00
#SBATCH --account=<PROJECT_NAME>
#SBATCH --mem=1G
#SBATCH --job-name=Python_module_test
module purge
module load EESSI/2023.06
module load Python/3.11.3-GCCcore-12.3.0
python --version
[MY_USER_NAME@CLUSTER_NAME ~ ]$ sbatch python-module.sh
Exercise
Exercise (5 min)
This exercise can be performed directly on the login node. Before you start, run the command module purge to make sure your environment is clean. Verify that StdEnv is the only loaded module when running module list.
1. How many EESSI versions are currently available?
2. How many OpenFOAM versions currently exist in EESSI?
3. What can you do if the software you need is not provided by EESSI yet?
Solution
1. We can check the available EESSI modules with
module avail EESSI
-------------------------- /cluster/modulefiles/Core ---------------------------
EESSI/2023.06
2. We load the EESSI module, and then search for OpenFOAM using module avail:
module load EESSI/2023.06
module avail OpenFOAM
OpenFOAM/v2312-foss-2023a OpenFOAM/v2406-foss-2023a OpenFOAM/10-foss-2023a OpenFOAM/11-foss-2023a (D)
3.A One option is to build on top of EESSI with EasyBuild by loading the EESSI-extend module after loading the EESSI module:
module load EESSI/2023.06
module load EESSI-extend
Documentation on how the build process works is at https://www.eessi.io/docs/using_eessi/building_on_eessi
3.B Another option is to follow the contribution policy, which provides guidelines for adding software to EESSI: https://www.eessi.io/docs/adding_software/overview