Monitoring your jobs
Using the squeue Command
Check the status of all jobs on Shamu using the squeue command (this is only an example; node names may differ):
$ squeue
JOBID  PARTITION  NAME      USER    ST  TIME         NODES  NODELIST(REASON)
10435  gpu1v100   a5-5-8    kdg242  PD  0:00         1      (Resources)
9596   bigmem     sys/dash  xce775  R   10-18:06:10  1      compute009
10094  compute1   sys/dash  fym313  R   2-11:03:59   1      compute028
10149  compute1   bash      iqr224  R   5-04:31:02   1      compute039
10385  compute2   c300-a23  kdg242  R   1-02:05:09   1      compute107
10386  compute3   c300-a25  kdg242  R   1-02:05:09   1      compute106
10387  gpu2v100   c300-a26  kdg242  R   1-02:05:08   1      compute108
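On a busy cluster the full listing gets long. The sketch below is an illustration only, not part of the Shamu setup: it limits squeue to your own jobs with the -u option and tallies them by state. The helper name summarize_states is made up for this example, and it assumes the default column layout shown above (ST is the fifth field, and job names contain no spaces).

```shell
# Summarize squeue-style output (read from stdin) as "STATE COUNT" lines.
# Assumption: default squeue columns, so the ST column is field 5.
# Job names containing spaces would shift the fields and break this.
summarize_states() {
    awk 'NR > 1 { n[$5]++ } END { for (s in n) print s, n[s] }' | LC_ALL=C sort
}

# On the cluster: list only your own jobs, then summarize them.
# The guard keeps the snippet harmless on machines without Slurm.
if command -v squeue >/dev/null 2>&1; then
    squeue -u "$USER" | summarize_states
fi
```

In the ST column, PD means the job is pending and R means it is running. For a pending job, the last column gives the reason it is waiting, such as (Resources) in the listing above.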
Using the sinfo Command
Check the status of the partitions using the sinfo command (this is only an example; node names and counts may differ):
$ sinfo
PARTITION   AVAIL  TIMELIMIT  NODES  STATE  NODELIST
compute1*   up     infinite   3      mix    compute[028-029,039]
compute1*   up     infinite   51     idle   compute[001-004,006-008,012-024,030-038,040-057,088-091]
bigmem      up     infinite   1      mix    compute009
gpu1v100    up     infinite   2      mix    gpu[01-02]
gpu2v100    up     infinite   1      mix    compute025
computedev  up     infinite   5      idle   compute[010-011]
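Before submitting a job it can help to see which partitions have free nodes. The sketch below is a rough illustration: it adds up the NODES column for rows whose STATE is exactly idle. The helper name idle_nodes is hypothetical, and the field positions assume the default sinfo columns shown above.

```shell
# Print "PARTITION IDLE_NODES" for each partition in sinfo-style output on stdin.
# Assumption: default sinfo columns, so NODES is field 4 and STATE is field 5.
# Only rows whose state is exactly "idle" are counted; flagged variants such
# as "idle*" are ignored.
idle_nodes() {
    awk 'NR > 1 && $5 == "idle" {
             sub(/\*$/, "", $1)      # drop the "*" that marks the default partition
             idle[$1] += $4
         }
         END { for (p in idle) print p, idle[p] }' | LC_ALL=C sort
}

# On the cluster: summarize idle nodes across all partitions.
# The guard keeps the snippet harmless on machines without Slurm.
if command -v sinfo >/dev/null 2>&1; then
    sinfo | idle_nodes
fi
```

To look at a single partition instead, sinfo accepts a partition name with -p, for example sinfo -p bigmem.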
Using the sacct Command
Check the status of an individual job by passing its job ID to the sacct command with the -j option:
$ sacct -j 10445
JobID        JobName    Partition  Account    AllocCPUS  State      ExitCode
------------ ---------- ---------- ---------- ---------- ---------- --------
10445        bash       compute1   admins     1          RUNNING    0:0
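When a script needs only the job state, the default table is awkward to parse. The sketch below is one possible approach, not a site-specific recipe: it asks sacct for pipe-delimited output without a header (-n -P), restricts it to the job allocation line (-X), and selects three fields with --format. The helper name job_state is made up for this example, and the job ID is the one from the listing above.

```shell
# Extract the State field from pipe-delimited sacct output on stdin.
# Assumption: the fields are JobID|State|ExitCode, as requested below.
job_state() {
    awk -F'|' 'NR == 1 { print $2 }'
}

# On the cluster: print only the state of job 10445.
# The guard keeps the snippet harmless on machines without Slurm.
if command -v sacct >/dev/null 2>&1; then
    sacct -j 10445 -X -n -P --format=JobID,State,ExitCode | job_state
fi
```

sacct reads from the accounting records, so it can also report on jobs that have already finished and no longer appear in squeue.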
--
AdminUser - 16 Jun 2017