GNU Parallel is a shell tool for running many sequential tasks concurrently on one or more nodes, so that a large batch of independent tasks can use all the available cores. A task can be a single command or a small script, and it typically takes its inputs from a list, such as a list of files, names, or URLs. For details, please refer to the GNU Parallel documentation page.
## Use GNU Parallel Interactively on Shamu

Follow the instructions below to obtain a compute node with the required resources. Only one user is allowed on a compute node at a time, so there is no need to specify the number of cores; you can use all the available cores on the node granted by the Slurm scheduler:
## Use GNU Parallel with Batch Jobs on Shamu

The sequential tasks can be distributed on one or multiple nodes across Shamu using a Slurm script. Assume that you have a sequential (non-MPI) application called my-app, and that you would like to run multiple instances of the application, each taking a different argument value from a file called mypara.txt. Here is a sample Slurm script:
###### Example for calculating multiple factorials at the same time using Parallel:

Here is a simple C++ program that calculates the factorial of a number passed as a command-line argument:
###### Example for finding the prime factors of given numbers at the same time using Parallel:

**For both interactive jobs and batch jobs, GNU Parallel coordinates the standard output (the content printed on the screen) of each running instance so that the outputs of those instances never interfere with each other.**
-- Zhiwei - 22 Jul 2020

[abc123@shamu ~]$ srun --pty bash

If you would like to access a node with at least 80 cores, use the command below (avoid using -n 80, as it may end up placing you on two 40-core nodes):

[abc123@shamu ~]$ srun --cpus-per-task=80 --pty bash

Load the module for GNU Parallel:

[abc123@shamu ~]$ module load parallel

Suppose you would like to compress all the .txt files in your current directory. You can use the following command to compress the files simultaneously. If the number of files is larger than the number of available cores on the node, GNU Parallel first processes as many files as there are cores and starts another one each time a core becomes available. The curly brackets `{}` are replaced by the input argument for each run of the command:
[abc123@shamu ~]$ ls *.txt | parallel gzip {}

An alternative way to do the same parallel job is to put all the file names in a text file:

[abc123@shamu ~]$ ls *.txt > mylist.txt

And then use the following syntax (the --joblog option makes it possible to resume the execution if the process is interrupted):

[abc123@shamu ~]$ parallel --joblog mylog.txt gzip {1} :::: mylist.txt

To resume the execution:

[abc123@shamu ~]$ parallel --resume-failed --joblog mylog.txt gzip {1} :::: mylist.txt

Another typical way to use GNU Parallel is to put the commands to be executed (one command per line) in a text file, and use the following syntax to have GNU Parallel execute them:

[abc123@shamu ~]$ parallel < my_commands.txt
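The command file used above can be generated rather than typed by hand. The following is a minimal sketch, assuming file names that contain no single quotes. The helper name `make_command_file` and the file name `my_commands.list` are hypothetical; a name that does not end in .txt keeps the command file out of its own glob.

```shell
#!/bin/sh
# Hypothetical helper (not part of GNU Parallel): write one gzip command
# per .txt file in the current directory to the file named by $1.
# Assumes file names contain no single quotes.
make_command_file() {
    out="$1"
    : > "$out"                        # create or truncate the command file
    for f in *.txt; do
        [ -e "$f" ] || continue       # no .txt files: the glob stays literal
        [ "$f" = "$out" ] && continue # never list the command file itself
        printf "gzip '%s'\n" "$f" >> "$out"
    done
}

# Usage on a compute node with the parallel module loaded:
#   make_command_file my_commands.list
#   parallel < my_commands.list
```

Each line of the generated file is an independent command, so GNU Parallel can schedule the lines in any order.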

#!/bin/bash
#
#SBATCH --job-name=my_job
#SBATCH --output=my_output_file.txt # Delete this line if you want the output file in slurm-jobID.out format. It will be different every time you submit the job.
#SBATCH --partition=defq # defq is the default queue, like all.q in SGE scripts
#SBATCH --time=01:05:00 # Time limit hrs:min:sec. An estimate of how long the job will take to complete. 72:00:00 is the maximum
#SBATCH --nodes=1 # Change it to the number of nodes you would like to run the job on
#SBATCH --cpus-per-task=20 # Request a node with 20 cores
#SBATCH --mail-type=ALL
#SBATCH --mail-user=my-email@utsa.edu # Your email address for receiving notices about your job status

. /etc/profile.d/modules.sh
module load parallel

parallel --joblog mylog.txt my-app {1} :::: mypara.txt

If the job got interrupted, change the last line of the script to:

parallel --resume-failed --joblog mylog.txt my-app {1} :::: mypara.txt

and resubmit the job.
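Before resubmitting, it can help to see which tasks did not finish cleanly. GNU Parallel's job log is a tab-separated file with a header line and the columns Seq, Host, Starttime, JobRuntime, Send, Receive, Exitval, Signal, and Command. The sketch below assumes that layout and uses a hypothetical helper name, `failed_jobs`, to print the commands whose exit value or signal is non-zero.

```shell
#!/bin/sh
# Hypothetical helper: list the commands recorded as failed in a joblog.
# Assumes the tab-separated joblog layout described above
# (column 7 = Exitval, column 8 = Signal, column 9 = Command).
failed_jobs() {
    awk -F '\t' 'NR > 1 && ($7 != 0 || $8 != 0) { print $9 }' "$1"
}

# Usage:
#   failed_jobs mylog.txt
```

Tasks that never started before the interruption do not appear in the log at all, so an empty result does not by itself mean that every task has run.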

#include <iostream>
#include <string>

using namespace std;

int main(int argc, char *argv[])
{
    // Expect exactly one argument: the number whose factorial is computed.
    if (argc != 2) {
        cerr << "Usage: " << argv[0] << " <n>" << endl;
        return 1;
    }
    // long is 64 bits on Linux, so the result overflows for n > 20.
    long f = 1;
    int n = stoi(argv[1]);
    for (int i = 1; i <= n; i++) {
        f = f * i;
    }
    cout << "Factorial of " << n << " is: " << f << endl;
    return 0;
}

Use the following commands to compile and execute the program:

[abc123@shamu ~]$ module load gcc
[abc123@shamu ~]$ g++ factorial.cpp -o factorial
[abc123@shamu ~]$ ./factorial 5
Factorial of 5 is: 120

We can create a text file mypara.txt and put in all the numbers (one number per line) whose factorials we want to calculate:

[abc123@shamu ~]$ cat mypara.txt
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20

And use the following job script to submit the job:

#!/bin/bash
#
#SBATCH --job-name=my_job
#SBATCH --output=my_output_file.txt # Delete this line if you want the output file in slurm-jobID.out format. It will be different every time you submit the job.
#SBATCH --partition=defq # defq is the default queue, like all.q in SGE scripts
#SBATCH --time=00:10:00 # Time limit hrs:min:sec. An estimate of how long the job will take to complete. 72:00:00 is the maximum
#SBATCH --nodes=1 # Change it to the number of nodes you would like to run the job on
#SBATCH --cpus-per-task=20 # Request a node with 20 cores
#SBATCH --mail-type=ALL
#SBATCH --mail-user=my-email@utsa.edu # Your email address for receiving notices about your job status

. /etc/profile.d/modules.sh
module load parallel
module load gcc

parallel --joblog mylog.txt ./factorial {1} :::: mypara.txt

The Slurm scheduler will place your job on a node with at least 20 cores, and GNU Parallel will automatically run 20 instances of the factorial program (or fewer if there are fewer than 20 numbers in the file), each taking a number from the mypara.txt file. If the total number of lines in the file is larger than 20, GNU Parallel will execute the first 20 instances and start another one each time a compute core is freed up, until all the numbers in the file have been processed. After the job is completed, you can check the content of the output file specified in the job script:

[abc123@shamu ~]$ cat my_output_file.txt

Factorial of 1 is: 1
Factorial of 2 is: 2
Factorial of 3 is: 6
Factorial of 4 is: 24
Factorial of 5 is: 120
Factorial of 6 is: 720
Factorial of 7 is: 5040
Factorial of 8 is: 40320
Factorial of 9 is: 362880
Factorial of 10 is: 3628800
Factorial of 11 is: 39916800
Factorial of 12 is: 479001600
Factorial of 13 is: 6227020800
Factorial of 14 is: 87178291200
Factorial of 15 is: 1307674368000
Factorial of 16 is: 20922789888000
Factorial of 17 is: 355687428096000
Factorial of 18 is: 6402373705728000
Factorial of 19 is: 121645100408832000
Factorial of 20 is: 2432902008176640000
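The number of simultaneous instances can also be set explicitly with GNU Parallel's `-j` option instead of being left to the default. A minimal sketch, assuming the job was submitted with --cpus-per-task so that Slurm sets the SLURM_CPUS_PER_TASK environment variable; the helper name `job_slots` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: number of job slots to give GNU Parallel.
# Uses the core count Slurm granted, or 1 when run outside a Slurm job.
job_slots() {
    echo "${SLURM_CPUS_PER_TASK:-1}"
}

# Usage inside the job script:
#   parallel -j "$(job_slots)" --joblog mylog.txt ./factorial {1} :::: mypara.txt
```

Tying the slot count to the Slurm request keeps the script and the resource request in agreement when --cpus-per-task is changed later.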

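#include <stdio.h>
#include <math.h>
#include <stdlib.h>

int main(int argc, char** argv)
{
    int n;
    /* Expect exactly one positive integer argument. n is an int, so values
       outside the int range (such as 43546657658) are not read correctly. */
    if (argc != 2 || sscanf(argv[1], "%d", &n) != 1 || n < 1) {
        fprintf(stderr, "Usage: %s <positive integer>\n", argv[0]);
        return 1;
    }
    printf(" The prime factors of %d are:", n);
    /* Divide out every factor of 2. These factors are not printed,
       which is why the sample output lists odd factors only. */
    while (n%2 == 0)
    {
        n = n/2;
    }
    /* n is odd now, so only odd candidates need to be tested. */
    for (int i = 3; i <= (int)sqrt(n); i = i+2)
    {
        while (n%i == 0)
        {
            printf("%d ", i);
            n = n/i;
        }
    }
    /* Whatever remains above 2 is itself a prime factor. */
    if (n > 2)
        printf ("%d", n);
    printf("\n");
    return 0;
}

Use the following commands to compile and execute the program:

[abc123@shamu ~]$ module load gcc
[abc123@shamu ~]$ gcc factor.c -lm -o factor
[abc123@shamu ~]$ ./factor 45678
45678
2 3 23 331

Create a text file mypara.txt that contains all the numbers we want to factor: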

34534
2342
36
3546
566
75
645
645
7
47
658787
456546546
45
654
65
4645
7457
56
8
66
36
36
74
56
76
8578678
43546657658

Change the program name from 'factorial' to 'factor' in the same job script as the one used for the factorial calculation, and submit the job. Check the result file once the job is completed:

[abc123@shamu ~]$ cat my_output_file.txt
The prime factors of 34534 are:31 557
The prime factors of 2342 are:1171
The prime factors of 36 are:3 3
The prime factors of 3546 are:3 3 197
The prime factors of 566 are:283
The prime factors of 75 are:3 5 5
The prime factors of 645 are:3 5 43
The prime factors of 645 are:3 5 43
The prime factors of 7 are:7
The prime factors of 47 are:47
The prime factors of 658787 are:19 34673
The prime factors of 456546546 are:3 3 25363697
The prime factors of 45 are:3 3 5
The prime factors of 654 are:3 109
The prime factors of 65 are:5 13
The prime factors of 4645 are:5 929
The prime factors of 7457 are:7457
The prime factors of 56 are:7
The prime factors of 8 are:
The prime factors of 66 are:3 11
The prime factors of 36 are:3 3
The prime factors of 36 are:3 3
The prime factors of 74 are:37
The prime factors of 56 are:7
The prime factors of 76 are:19
The prime factors of 8578678 are:23 251 743
The prime factors of 596984698 are:181 1649129

If you would like to put the output (the text the application prints on the screen) of each instance in a separate file, use the following Parallel option:

parallel --results outdir factor {1} :::: mypara.txt

The output files can be found in the outdir/1/ directory:

[abc123@shamu ~]$ cd outdir/1
[abc123@shamu ~]$ ls
2342 34534 3546 36 43546657658 45 456546546 4645 47 56 566 645 65 654 658787 66 7 74 7457 75 76 8 8578678
[abc123@shamu ~]$ cd 2342
[abc123@shamu ~]$ ls
seq stderr stdout
[iqr224@compute032 2342]$ cat stdout
2342: 2 1171
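To gather the per-instance files back into a single listing, the directory layout shown above (outdir/1/argument/stdout) can be walked with a short loop. A minimal sketch under that assumption, and assuming each instance prints a single line; the helper name `collect_results` is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print "<argument><TAB><stdout contents>" for every
# instance directory under <results dir>/1/.
collect_results() {
    for d in "$1"/1/*/; do
        [ -f "${d}stdout" ] || continue   # skip directories without output
        printf '%s\t%s\n' "$(basename "$d")" "$(cat "${d}stdout")"
    done
}

# Usage:
#   collect_results outdir
```

The directories are visited in the shell's glob order, which is alphabetical rather than the order of the lines in mypara.txt.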


Topic revision: r7 - 30 Jul 2020, AdminUser

Copyright © by the contributing authors. All material on this collaboration platform is the property of the contributing authors.
