Scripting

Objectives

  • Questions

    • Can we save one or more commands?

    • How can we share complex commands and workflows?

  • Keypoints

    • Scripts make computer work more automated and reproducible

    • Scripts can be used just as regular programs

Instructor note

  • Demo/teaching: 10 min

  • Exercise: 10 min

Note

  • First we will demonstrate a couple of commands and later we will use these in an exercise.

  • If you want to type-along with the instructors, you can download and extract the example like this:

    cd
    wget https://gitlab.sigma2.no/training/tutorials/unix-for-hpc/-/raw/master/content/episodes/finding-things/finding-things.tar.gz --no-check-certificate 
    tar xzvf finding-things.tar.gz
    

Creating a bash script

We now have quite a nice set of tools and commands. We can also build new ones by stringing together simple commands to form more complex ones. But do we have to do this every time we want to use them? And what if a colleague wants to use our command? How can we save and share our creation?

The simplest thing to do is to write the commands down in a file, so that we don’t forget them. We can then save this file and share it with others.

In the finding-things folder you can find a file called list.sh. Let’s have a look at what it contains:

$ cat list.sh

The file has the following contents:

#!/bin/bash

# List all pdf files
echo "All the pdf files in the current folder"
pwd
ls *.pdf

The lines starting with # are comments and will not be executed. The script prints a message with echo, then prints the current directory with pwd, and finally lists all .pdf files in the current folder.

We can now run it by using our .sh file as an input for bash like this:

$ bash list.sh
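As a minimal sketch of how such a file could be created directly from the command line, the following writes a small script with a here-document and then runs it with bash. The file name hello.sh and its contents are made up for illustration and are not part of the finding-things example.

```shell
# Work in a throwaway directory so nothing in finding-things is touched
workdir=$(mktemp -d)
cd "$workdir"

# Write a small script to a file; the quoted 'EOF' keeps the text literal
cat > hello.sh << 'EOF'
#!/bin/bash

# Print a greeting and the current date
echo "Hello from a script"
date
EOF

# Hand the file to bash as input, in the same way as list.sh
bash hello.sh
```

Any text editor works just as well for creating the file; the here-document is only a convenient way to keep the whole example in one place.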

Executables

Our bash script works and can be run, but it is a bit inconvenient to always write bash script.sh. Especially when you share it with colleagues, they might not know which interpreter (bash, python, …) to use. Instead, we can make the script executable so that it behaves like a regular program.

But let’s first see what happens if we just run it directly:

$ list.sh
bash: list.sh: command not found...

Strangely enough, Bash can’t find our script. As it turns out, Bash only looks in certain directories for commands to run. To run anything else, we need to tell Bash exactly where to look by giving the path to the file. We can do this in one of two ways: either with the absolute path /cluster/USER/finding-things/list.sh, or with the relative path ./list.sh.

$ ./list.sh
bash: ./list.sh: Permission denied
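The "certain directories" that Bash searches are listed in the PATH environment variable. A minimal sketch for inspecting it, assuming a standard shell setup (the exact directories will differ from system to system):

```shell
# PATH holds the directories searched for commands, separated by ':'
echo "$PATH"

# Print one directory per line to make the list easier to read
echo "$PATH" | tr ':' '\n'

# Show how the shell resolves the name ls; normally this is a file in one
# of the PATH directories (if ls is aliased, the alias is shown instead)
command -v ls
```

The finding-things folder is not in that list, which is why list.sh is only found when a path such as ./list.sh is given.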

There’s one last thing we need to do. Before a file can be run, it needs “permission” to run. Let’s look at our file’s permissions with ls -l:

$ ls -l
total 22
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 aeuubdsu.png
drwxr-sr-x 4 maikenp nn9970k    10 Apr  8 15:46 aivxievn
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 cvfufgab.zip
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 dsxfqfos.gz
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 ebruioly.pdf
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 ejrbaofl.out
drwxr-sr-x 3 maikenp nn9970k     4 Apr  8 15:46 experiment1
-rw-r--r-- 1 maikenp nn9970k 13879 Apr 29  2022 genelist.tsv
drwxr-sr-x 2 maikenp nn9970k     9 Apr  8 15:46 godcjrbv
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 iqwdcgbr.gz
-rw-r--r-- 1 maikenp nn9970k    94 May  2  2022 list.sh
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 ntgppavq.tar
drwxr-sr-x 2 maikenp nn9970k     3 Apr  8 15:46 pexhhtec
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 rsaqwinx.JPEG
drwxr-sr-x 2 maikenp nn9970k     5 Apr  8 15:46 sdenohww
-rw-r--r-- 1 maikenp nn9970k    51 Apr 25  2022 xibajunm.pdf

That’s a lot of output: a full listing of everything in the directory. Let’s see what each field of a given row represents, working from left to right. The details are explained in the numbered list below.

[Figure: listing info]

Now let’s briefly look at an example where the user group is personal, here maikenp_g.

[Figure: listing info - personal usergroup]

  1. Permissions: On the far left, there is a string of the characters d, r, w, x, and -. The first character is d if the item is a directory and - if it is not. The r, w, and x characters indicate permission to read, write, and execute the file. Three sets of rwx permissions follow the first character. A missing permission is indicated by a -.

    • The first set of rwx are the permissions that the owner has (in the listing above, the owner is maikenp).

    • The second set of rwx are the permissions that members of the file’s group have (in the listing above, the group is nn9970k).

    • The third set of rwx are the permissions that anyone else with access to this computer has. Although files are typically created with read permission for everyone, the permissions on your home directory usually prevent others from reaching the file in the first place.

  2. References: This counts the number of references (hard links) to the item (file, folder, symbolic link or “shortcut”).

  3. Owner: This is the username of the user who owns the file. Their permissions are indicated in the first permissions field.

  4. Group: This is the user group that owns the file. Members of this group have the permissions indicated in the second permissions field.

  5. Size of item: This is the number of bytes in a file, or the number of filesystem blocks occupied by the contents of a folder. (We can add the -h option to ls -l to get a human-readable file size in kilobytes, megabytes, etc.)

  6. Time last modified: This is the last time the file was modified.

  7. Filename: This is the filename.
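To see the fields from the list above one at a time, the stat command can print them individually. This is a minimal sketch that assumes GNU coreutils (as found on most Linux systems); the -c format option is not available in the BSD/macOS version of stat. The temporary file is created only for illustration.

```shell
# Create a throwaway file to inspect
tmpfile=$(mktemp)

# The long listing, with -h for human-readable sizes
ls -lh "$tmpfile"

# GNU stat can print the same fields one by one:
# %A permissions, %h hard links, %U owner, %G group,
# %s size in bytes, %y time last modified, %n file name
stat -c 'permissions=%A links=%h owner=%U group=%G size=%s modified=%y name=%n' "$tmpfile"

# Clean up
rm "$tmpfile"
```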

So how do we change permissions? As mentioned earlier, we need permission to execute our script. Permissions are changed with chmod. To add execute permission for all users, we can use:

$ chmod +x list.sh
$ ls -l list.sh
-rwxrwxr-x 1 jd jd 94 april 29 17:52 list.sh

Now that we have executable permissions for that file, we can run it.

$ ./list.sh
All the pdf files in the current folder
/home/jd/hpc/NRIS/training/2022-May-linux-intro/finding-things
ebruioly.pdf  xibajunm.pdf

Our script worked!
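chmod also accepts more targeted forms than +x. The following is a minimal sketch using a hypothetical script demo.sh in a temporary directory, so that list.sh is left untouched. The letters u, g, and o stand for the owner (user), the group, and others.

```shell
# Work in a throwaway directory
workdir=$(mktemp -d)
cd "$workdir"

# A fresh script without execute permission
printf '#!/bin/bash\necho "It runs"\n' > demo.sh
chmod u=rw,go=r demo.sh      # the same -rw-r--r-- as list.sh had in the listing
ls -l demo.sh

# Add execute permission for the owner only
chmod u+x demo.sh
ls -l demo.sh                # the permissions now read -rwxr--r--

# Plain +x adds it for owner, group and others (subject to the umask)
chmod +x demo.sh
./demo.sh
```

Using u+x is a reasonable choice when only you need to run the script, while +x is convenient when colleagues in your group should be able to run it as well.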

Note

Maybe you now also wonder: how does the computer know that this file is a shell script?

The answer is that it doesn’t really know, and it is not based on the file suffix (.sh in this case). Instead, executing a file with bash is the fallback behaviour when the file is not a binary executable and does not name an interpreter. This means that your script will work as long as it is a bash script.

But we can also write scripts in other programming languages, for example perl or python. How do we then tell the computer how to execute the script?

This is where the #!INTERPRETER character sequence (called a shebang) comes into play. If a text file starts with it, this interpreter is used instead of the default one. In our example we used #!/bin/bash.

This makes it explicit what our script should be executed with. This reduces confusion, so it is common to add a shebang to all scripts.
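As a minimal sketch of a shebang that selects a different interpreter, the following creates a hypothetical file hello.py whose first line asks for python3. It assumes that python3 is installed and can be found on the PATH; /usr/bin/env is used to look it up there instead of hard-coding its location.

```shell
# Work in a throwaway directory
workdir=$(mktemp -d)
cd "$workdir"

# A script whose first line names python3 as its interpreter
cat > hello.py << 'EOF'
#!/usr/bin/env python3
print("Hello from Python")
EOF

chmod +x hello.py

# Only run it if python3 is actually installed
if command -v python3 > /dev/null; then
    ./hello.py
else
    echo "python3 not found, skipping"
fi
```

The file is started in exactly the same way as a bash script (./hello.py); only the shebang line decides which program reads it.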

Exercise

Exercise (10 min): Create a new script and run it

Create a new script called current_folder.sh that prints a message with echo and then the current folder with pwd.

echo 'You are in folder:'
pwd

Remember to add the shebang (#!/bin/bash) as the first line and to make the file executable.

Keypoints

  • Scripts make computer work more automated and reproducible

  • Scripts can be used just as regular programs