Monitoring your jobs

Using the squeue Command

Check the status of all jobs on Arc with the squeue command. The output below is only an example; job and node names on the cluster will differ:

$ squeue
 JOBID PARTITION     NAME     USER ST        TIME NODES NODELIST(REASON)
 10435  gpu1v100   a5-5-8   kdg242 PD        0:00     1 (Resources)
  9596    bigmem sys/dash   xce775  R 10-18:06:10     1 compute009
 10094  compute1 sys/dash   fym313  R  2-11:03:59     1 compute028
 10149  compute1     bash   iqr224  R  5-04:31:02     1 compute039
 10385  compute2 c300-a23   kdg242  R  1-02:05:09     1 compute107
 10386  compute3 c300-a25   kdg242  R  1-02:05:09     1 compute106
 10387  gpu2v100 c300-a26   kdg242  R  1-02:05:08     1 compute108
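
On a busy cluster the full listing is long, so it often helps to restrict it to your own jobs or to summarise it by state. The following is a minimal sketch, assuming the default squeue column layout shown above (state code in the fifth column) and job names without spaces. The name count_by_state is a hypothetical helper, not part of Slurm.

```shell
# Hypothetical helper: count jobs per state code (PD, R, ...) from
# squeue's default output read on stdin. Assumes ST is the 5th column
# and that job names contain no spaces.
count_by_state() {
    awk 'NR > 1 && NF >= 5 { n[$5]++ }
         END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# Typical use on a login node (needs Slurm, so left commented out):
#   squeue -u "$USER"                    # list only your own jobs
#   squeue -u "$USER" | count_by_state   # pending vs running counts
```

Applied to the example listing above, the helper would report one pending (PD) job and six running (R) jobs.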
 

Using the sinfo Command

Check the status of the partitions with the sinfo command. The output below is only an example; node names and node counts on the cluster will differ:

$ sinfo
PARTITION  AVAIL TIMELIMIT NODES STATE NODELIST
compute1*     up  infinite     3   mix compute[028-029,039]
compute1*     up  infinite    51  idle compute[001-004,006-008,012-024,030-038,040-057,088-091]
bigmem        up  infinite     1   mix compute009
gpu1v100      up  infinite     2   mix gpu[01-02]
gpu2v100      up  infinite     1   mix compute025
computedev    up  infinite     5  idle compute[010-011]
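
Before submitting, it can be useful to see which partitions currently have free nodes. The following is a minimal sketch, assuming the default sinfo layout shown above (node count in the fourth column, state in the fifth) and matching only the exact state "idle". The name idle_nodes_by_partition is a hypothetical helper, not part of Slurm.

```shell
# Hypothetical helper: total idle nodes per partition from sinfo's
# default output read on stdin. The trailing "*" that marks the default
# partition is stripped. Assumes NODES is column 4 and STATE is column 5.
idle_nodes_by_partition() {
    awk 'NR > 1 && $5 == "idle" { sub(/\*$/, "", $1); idle[$1] += $4 }
         END { for (p in idle) print p, idle[p] }' | LC_ALL=C sort
}

# Typical use on a login node (needs Slurm, so left commented out):
#   sinfo -p compute1                 # show a single partition
#   sinfo | idle_nodes_by_partition   # idle node totals per partition
```

States with suffixes (for example a node that is idle but flagged in some way) are deliberately not counted in this sketch.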

Using the sacct Command

Check the status of an individual job by passing its JobID to sacct with the -j option:

$ sacct -j 10445

       JobID    JobName  Partition    Account  AllocCPUS      State ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
10445              bash   compute1     admins          1    RUNNING      0:0
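
For scripting, sacct can print selected fields in a machine-readable form. The following is a minimal sketch, assuming input produced with the options -n (no header), -P (fields separated by "|"), -X (allocation line only) and --format=JobID,State. The name job_is_active is a hypothetical helper, not part of Slurm.

```shell
# Hypothetical helper: succeed (exit status 0) if the first "JobID|State"
# line read on stdin reports PENDING or RUNNING, otherwise fail.
job_is_active() {
    awk -F'|' 'NR == 1 { active = ($2 == "PENDING" || $2 == "RUNNING") }
               END { exit !active }'
}

# Typical use on a login node (needs Slurm, so left commented out):
#   sacct -j 10445 --format=JobID,JobName,State,Elapsed,ExitCode
#   sacct -n -P -X -j 10445 --format=JobID,State | job_is_active \
#       && echo "job 10445 is still pending or running"
```

Because the helper only sets an exit status, it can be combined with a sleep loop to wait for a job to finish.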

-- AdminUser - 14 Jul 2021
Topic revision: r2 - 28 Oct 2024, AdminUser