GNU Parallel is a shell tool for running many sequential tasks at the same time on one or more nodes, so that all the cores available on those nodes are put to use. A task can be a single command or a small script that typically takes its inputs from a list, such as a list of files, names, or URLs. For details, please refer to the GNU Parallel documentation page.
Use GNU Parallel Interactively on Shamu
Follow the instructions below to get a compute node with the required resources. Only one user is allowed on a compute node at a time, so there is no need to specify the number of cores needed; you can use all the available cores on the node granted by the Slurm scheduler:
[abc123@shamu ~]$ srun --pty bash
If you would like to access a node with at least 80 cores, use the command below. Avoid using -n 80, as it may end up putting you on two 40-core nodes.
[abc123@shamu ~]$ srun --cpus-per-task=80 --pty bash
Load the module for GNU Parallel:
[abc123@shamu ~]$ module load parallel
Suppose you would like to compress all the .txt files in your current directory with gzip. The following command compresses the files simultaneously. By default GNU Parallel runs one job per available core; if there are more files than cores, it starts with as many files as there are cores and begins another one each time a core becomes available. The curly brackets {} mark where each input (here, a file name) is substituted into the command to be run.
[abc123@shamu ~]$ ls *.txt | parallel gzip {}
An alternative way to run the same parallel job is to put all the file names in a text file:
[abc123@shamu ~]$ ls *.txt > mylist.txt
And then use the following syntax (the --joblog option records each finished job, which makes it possible to resume the execution if the process is interrupted):
[abc123@shamu ~]$ parallel --joblog mylog.txt gzip {1} :::: mylist.txt
To resume the execution:
[abc123@shamu ~]$ parallel --resume-failed --joblog mylog.txt gzip {1} :::: mylist.txt
Another typical way to use GNU Parallel is to put the commands to be executed in a text file, one command per line, and have GNU Parallel execute them with the following syntax:
[abc123@shamu ~]$ parallel < my_commands.txt
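The command file can be written by hand or generated. The following is a minimal sketch of one way to generate it, not a required step: the helper name make_command_file is hypothetical, and it assumes the file names contain no double quotes.

```shell
# Hypothetical helper: write one gzip command per .txt file in the
# current directory to the command file named by the first argument.
make_command_file() {
    out=$1
    : > "$out"                              # create or empty the command file
    for f in *.txt; do
        # Skip the unexpanded pattern (no .txt files) and the command file itself.
        if [ ! -e "$f" ] || [ "$f" = "$out" ]; then
            continue
        fi
        printf 'gzip "%s"\n' "$f" >> "$out"
    done
}

make_command_file my_commands.txt
# Then run the commands with: parallel < my_commands.txt
```

The command file is skipped explicitly because it also ends in .txt and would otherwise be compressed along with the data files.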
Use GNU Parallel with Batch Jobs on Shamu
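The sequential tasks can be distributed on one or multiple nodes across Shamu using a Slurm script. Assume that you have a sequential (non-MPI) application called my-app, and you would like to run multiple instances of it, each instance taking a different argument value from a file called mypara.txt. Note that the parallel command in the script below starts all instances on the node where the batch script runs; spreading them over several nodes requires additional GNU Parallel options (such as --sshloginfile) that are not shown here. Here is a sample Slurm script: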
#!/bin/bash
#
#SBATCH --job-name=my_job
#SBATCH --output=my_output_file.txt # Delete this line if you want the output file in slurm-jobID.out format. It will be different every time you submit the job.
#SBATCH --partition=defq # defq is the default partition, the counterpart of all.q in SGE scripts
#SBATCH --time=01:05:00 # Time limit hrs:min:sec. It is an estimate of how long the job will take to complete. 72:00:00 is the maximum
#SBATCH --nodes=1 # Change it to the number of nodes you would like to run the job on.
#SBATCH --cpus-per-task=20 # Request a node with 20 cores
#SBATCH --mail-type=ALL
#SBATCH --mail-user=my-email@utsa.edu # Your email address for receiving notices about your job status
. /etc/profile.d/modules.sh
module load parallel
parallel --joblog mylog.txt my-app {1} :::: mypara.txt
If the job is interrupted, change the last line of the script to:
parallel --resume-failed --joblog mylog.txt my-app {1} :::: mypara.txt
and resubmit the job.
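Before resubmitting, it can help to see which instances failed. The sketch below is one possible way to do that; the helper name failed_jobs is hypothetical. It relies on the job log being a tab-separated file with a header line, in which column 7 is the exit value, column 8 the signal, and column 9 the command.

```shell
# Hypothetical helper: list the commands in a GNU Parallel job log
# that exited with a non-zero status or were killed by a signal.
failed_jobs() {
    awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $9 }' "$1"
}

# Show the failed commands if the job log from the script exists.
if [ -f mylog.txt ]; then
    failed_jobs mylog.txt
fi
```

If the function prints nothing, every job recorded in the log finished with exit value 0.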
Example for calculating multiple factorials at the same time using Parallel:
Here is a simple C++ program that calculates the factorial of a number given as a command-line argument:
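#include <iostream>
#include <string>
using namespace std;

int main(int argc, char *argv[])
{
    if (argc < 2) {                 // the number is a required argument
        cerr << "Usage: " << argv[0] << " <number>" << endl;
        return 1;
    }
    int n = stoi(argv[1]);          // convert the argument to an integer
    long f = 1;                     // 64-bit on Linux, so 20! is the largest factorial that fits
    for (int i = 1; i <= n; i++) {
        f = f * i;
    }
    cout << "Factorial of " << n << " is: " << f << endl;
    return 0;
}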
Use the following commands to compile and execute the program:
[abc123@shamu ~]$ module load gcc
[abc123@shamu ~]$ g++ factorial.cpp -o factorial
[abc123@shamu ~]$ ./factorial 5
Factorial of 5 is: 120
We can create a text file mypara.txt and put in it all the numbers (one number per line) whose factorials we want to calculate:
[abc123@shamu ~]$ cat mypara.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
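Rather than typing the numbers by hand, the file can be generated. A minimal sketch, assuming the seq utility is available on the node:

```shell
# Write the numbers 1 through 20 to mypara.txt, one per line.
seq 1 20 > mypara.txt

# Count the lines to confirm how many inputs were written.
wc -l < mypara.txt
```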
And use the following job script to submit the job.
#!/bin/bash
#
#SBATCH --job-name=my_job
#SBATCH --output=my_output_file.txt # Delete this line if you want the output file in slurm-jobID.out format. It will be different every time you submit the job.
#SBATCH --partition=defq # defq is the default partition, the counterpart of all.q in SGE scripts
#SBATCH --time=00:10:00 # Time limit hrs:min:sec. It is an estimate of how long the job will take to complete. 72:00:00 is the maximum
#SBATCH --nodes=1 # Change it to the number of nodes you would like to run the job on.
#SBATCH --cpus-per-task=20 # Request a node with 20 cores
#SBATCH --mail-type=ALL
#SBATCH --mail-user=my-email@utsa.edu # Your email address for receiving notices about your job status
. /etc/profile.d/modules.sh
module load parallel
module load gcc
parallel --joblog mylog.txt ./factorial {1} :::: mypara.txt
The Slurm scheduler will place your job on a node with at least 20 cores, and GNU Parallel will automatically run up to 20 instances of the factorial program at a time (fewer if there are fewer than 20 numbers in the file), each instance taking one number from the mypara.txt file. If the file contains more than 20 lines, GNU Parallel starts the first 20 instances and launches another one each time a compute core is freed up, until all the numbers in the file have been processed.
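GNU Parallel chooses its default number of simultaneous instances from the cores it detects on the node, which may be more than the 20 requested. The sketch below shows one way to pin the count explicitly with the --jobs option. It assumes that Slurm sets the SLURM_CPUS_PER_TASK variable inside the job (which it does when --cpus-per-task is given); the variable name njobs and the fallback to the local core count are illustrative choices, not part of the original script.

```shell
# Use the core count granted by Slurm; outside a Slurm job, fall back
# to the number of cores online on the local machine.
njobs=${SLURM_CPUS_PER_TASK:-$(getconf _NPROCESSORS_ONLN)}
echo "GNU Parallel will run up to $njobs instances at a time"

# Run only when GNU Parallel and the inputs of this example are present.
if command -v parallel >/dev/null 2>&1 && [ -x ./factorial ] && [ -f mypara.txt ]; then
    parallel --jobs "$njobs" --joblog mylog.txt ./factorial {1} :::: mypara.txt
fi
```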
After the job is completed, you can check the content of the output file specified in the job script:
[abc123@shamu ~]$ cat my_output_file.txt
Factorial of 1 is: 1
Factorial of 2 is: 2
Factorial of 3 is: 6
Factorial of 4 is: 24
Factorial of 5 is: 120
Factorial of 6 is: 720
Factorial of 7 is: 5040
Factorial of 8 is: 40320
Factorial of 9 is: 362880
Factorial of 10 is: 3628800
Factorial of 11 is: 39916800
Factorial of 12 is: 479001600
Factorial of 13 is: 6227020800
Factorial of 14 is: 87178291200
Factorial of 15 is: 1307674368000
Factorial of 16 is: 20922789888000
Factorial of 17 is: 355687428096000
Factorial of 18 is: 6402373705728000
Factorial of 19 is: 121645100408832000
Factorial of 20 is: 2432902008176640000
Example for finding the prime factors of given numbers at the same time using Parallel:
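#include <stdio.h>
#include <math.h>

int main(int argc, char** argv)
{
    int n;
    /* The number to factor is a required argument; it must be at least 2 and fit in an int. */
    if (argc < 2 || sscanf(argv[1], "%d", &n) != 1 || n < 2)
    {
        fprintf(stderr, "Usage: %s <number greater than 1>\n", argv[0]);
        return 1;
    }
    printf("The prime factors of %d are:", n);
    /* Print and divide out every factor of 2. */
    while (n%2 == 0)
    {
        printf("%d ", 2);
        n = n/2;
    }
    /* n is now odd, so only odd divisors need to be tried. */
    for (int i = 3; i <= (int)sqrt(n); i = i+2)
    {
        while (n%i == 0)
        {
            printf("%d ", i);
            n = n/i;
        }
    }
    /* Whatever remains is a prime factor larger than 2. */
    if (n > 2)
        printf("%d", n);
    printf("\n");
    return 0;
}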
Use the following commands to compile and execute the program:
[abc123@shamu ~]$ module load gcc
[abc123@shamu ~]$ gcc factor.c -lm -o factor
[abc123@shamu ~]$ ./factor 45678
The prime factors of 45678 are:2 3 23 331
Create a text file mypara.txt that contains all the numbers we want to factor:
34534
2342
36
3546
566
75
645
645
7
47
658787
456546546
45
654
65
4645
7457
56
8
66
36
36
74
56
76
8578678
43546657658
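Change the program name from 'factorial' to 'factor' in the job script used for the factorial calculation, and submit the job. Note that the last input, 43546657658, does not fit in a 32-bit int, so the program reports the factors of a different number (596984698 in the run shown here). Check the result file once the job is completed:
[abc123@shamu ~]$ cat my_output_file.txt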
The prime factors of 34534 are:2 31 557
The prime factors of 2342 are:2 1171
The prime factors of 36 are:2 2 3 3
The prime factors of 3546 are:2 3 3 197
The prime factors of 566 are:2 283
The prime factors of 75 are:3 5 5
The prime factors of 645 are:3 5 43
The prime factors of 645 are:3 5 43
The prime factors of 7 are:7
The prime factors of 47 are:47
The prime factors of 658787 are:19 34673
The prime factors of 456546546 are:2 3 3 25363697
The prime factors of 45 are:3 3 5
The prime factors of 654 are:2 3 109
The prime factors of 65 are:5 13
The prime factors of 4645 are:5 929
The prime factors of 7457 are:7457
The prime factors of 56 are:2 2 2 7
The prime factors of 8 are:2 2 2
The prime factors of 66 are:2 3 11
The prime factors of 36 are:2 2 3 3
The prime factors of 36 are:2 2 3 3
The prime factors of 74 are:2 37
The prime factors of 56 are:2 2 2 7
The prime factors of 76 are:2 2 19
The prime factors of 8578678 are:2 23 251 743
The prime factors of 596984698 are:2 181 1649129
If you would like to put the output of each instance (the text the application prints on the screen) in a separate file, use the following Parallel option. Write the program as ./factor, because a bare 'factor' may run the system utility of the same name instead:
parallel --results outdir ./factor {1} :::: mypara.txt
The output files can be found in the outdir/1/ directory, with one subdirectory per input value:
[abc123@shamu ~]$ cd outdir/1
[abc123@shamu 1]$ ls
2342 34534 3546 36 43546657658 45 456546546 4645 47 56 566 645 65 654 658787 66 7 74 7457 75 76 8 8578678
[abc123@shamu 1]$ cd 2342
[abc123@shamu 2342]$ ls
seq stderr stdout
[abc123@shamu 2342]$ cat stdout
The prime factors of 2342 are:2 1171
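To look at all the saved outputs at once instead of one directory at a time, a small loop over the result directories is enough. This is a sketch only: the helper name collect_results is hypothetical, and it assumes the outdir/1/<input value>/stdout layout shown in the listing above.

```shell
# Hypothetical helper: print the saved stdout of every instance,
# preceded by the input value it belongs to.
collect_results() {
    for d in "$1"/1/*/; do
        if [ -f "${d}stdout" ]; then
            printf '== %s ==\n' "$(basename "$d")"
            cat "${d}stdout"
        fi
    done
}

# Print the collected outputs if the results directory exists.
if [ -d outdir ]; then
    collect_results outdir
fi
```

The directories are visited in the shell's sorted order of their names, not in the order of the lines in mypara.txt.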
For both interactive jobs and batch jobs, GNU Parallel groups the standard output (the content printed on the screen) of each running instance, so that the output of one instance is never mixed with the output of another.
-- Zhiwei - 22 Jul 2020