Arc User-Guide

  1. Arc is the primary High Performance Computing (HPC) system at The University of Texas at San Antonio (UTSA). It can be used for running data-intensive, memory-intensive, and compute-intensive jobs from a wide range of disciplines, delivers up to 387 TFLOPS of peak performance, and is equipped with:
    • 156 compute/GPU nodes (6,032 cores across all nodes) and 2 login nodes; the majority of the nodes use Intel Cascade Lake CPUs
    • 30 GPU nodes, each with two 20-core CPUs (40 cores total), 384 GB of RAM, and one NVIDIA V100 GPU accelerator

    • 5 GPU nodes, each with two 20-core CPUs (40 cores total), 384 GB of RAM, and two NVIDIA V100 GPU accelerators

    • Two large-memory nodes, each with four 20-core CPUs (80 cores total) and 1.5 TB of RAM

    • 100 Gb/s InfiniBand connectivity

    • Two Lustre filesystems: /home and /work, with 110 TB and 1.1 PB of capacity, respectively

    • A cumulative total of 250 TB of local scratch space (approximately 1.5 TB of /scratch on most compute and GPU nodes)

    • Multiple partitions (or queues) with different characteristics and constraints:
      • bigmem: 2 nodes
      • compute1: 65 nodes
      • compute2: 25 nodes
      • computedev: 5 nodes
      • gpu1v100: 28 nodes
      • gpu2v100: 5 nodes
      • gpudev: 2 nodes
      • two privately owned partitions consisting of 24 nodes
    • Arc is accessible over SSH using two-factor authentication with DUO. The hostname is arc.utsa.edu and the SSH port is the standard port 22. To use DUO, you must first register online at passphrase.utsa.edu.
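    A typical login session, based on the hostname and port above, looks like the following sketch (abc123 is a placeholder UTSA ID; substitute your own):

```shell
# Log in to Arc over SSH; port 22 is the default and is shown only for clarity.
# After entering your passphrase, DUO prompts for two-factor approval.
ssh -p 22 abc123@arc.utsa.edu
```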

  2. Arc Fair-Use Policies
    • Running Jobs
      • Compute nodes are not shared among multiple users: when a node is allocated to a user's job, that user is the only one allowed to access it. This policy is in place for both security and performance reasons, since jobs from different users sharing a node can suffer from resource contention. Although jobs from different users are no longer scheduled on the same node, users are encouraged to take advantage of tools such as GNU Parallel to co-schedule multiple independent tasks on the compute nodes allocated to them. Please see Section 10 of this user-guide for further details on running multiple tasks concurrently on one or more nodes from a single Slurm job.
      • Each user is limited to 10 active jobs at any given time, running on a maximum of 20 compute nodes in total. As each compute node is dual-socket with a 20-core processor on each socket, a single user can therefore use up to 800 cores at any given time.
      • Each job is limited to a run-time of no more than 72 hours. Users are encouraged to implement checkpoint-restart capabilities in their home-grown applications; the Research Computing Support Group will be happy to provide guidance on doing so. Some third-party software, like the FLASH astrophysics code, already has built-in checkpoint-restart capabilities that can be enabled by setting the required environment variables. Users are encouraged to review the documentation of their software to confirm whether checkpoint-restart functionality is available. Section 16 of this user-guide has further information on using checkpointing and restart.
      • Exceptions: if you require access to nodes for longer than 72 hours, or need more nodes than are allowed by default, please submit a service request ticket with an exemption request at https://support.utsa.edu/myportal. Include a brief description of the activity, the number of cores and nodes required, and the duration for which you are requesting the exemption. We also ask that you explore checkpointing options for your code before submitting the request.
    • Data Storage (Disk Usage)
      • Work Directory (/work/abc123) – as detailed in our Wiki, this directory is where you should place any input/output files as well as logs for your running jobs. This directory is NOT backed up and is not intended for long-term storage.
      • Work Directory Data Retention – any file in the Work directory that has not been accessed in the last 30 days is a candidate for deletion.
      • Home Directory (/home/abc123) – this directory is backed up but should only be used for installing and compiling code. Storage of datasets is permitted here, but a hard quota of 25 GB is enforced.
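    A minimal Slurm batch script that stays within the fair-use limits above might look like the following sketch (the partition name and executable are illustrative; adjust them to your workload):

```shell
#!/bin/bash
#SBATCH --job-name=example          # job name shown by squeue
#SBATCH --partition=compute1        # one of the partitions listed in Section 1
#SBATCH --nodes=2                   # each user may use at most 20 nodes in total
#SBATCH --ntasks-per-node=40        # each node has two 20-core CPUs (40 cores)
#SBATCH --time=72:00:00             # hard limit: 72 hours per job
#SBATCH --output=job_%j.out         # keep logs in /work, not /home

cd "$SLURM_SUBMIT_DIR"
./my_application                    # hypothetical executable
```

    Submit the script with `sbatch script.sh` and monitor it with `squeue -u $USER`.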

  3. Requesting an Account on Arc

  4. Prerequisite: Arc runs a Linux operating system, so basic knowledge of Linux is required to work efficiently on Arc from the command line.
    If you need help learning Linux, the following link provides a quick overview of Linux and basic Linux commands: Express Linux Tutorial

  5. Logging into Arc, Submitting Jobs, and Monitoring Jobs on Arc

  6. File transfer
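    Standard Linux transfer tools such as scp and rsync work with Arc over SSH; a hedged sketch, assuming the /work path layout described in Section 2 (abc123 and the file names are placeholders):

```shell
# Copy a local input file to your Work directory on Arc.
scp input.dat abc123@arc.utsa.edu:/work/abc123/

# Synchronize a results directory from Arc back to the local machine.
rsync -avz abc123@arc.utsa.edu:/work/abc123/results/ ./results/
```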

  7. Modules for Managing User Environment on Arc

  8. Running C, C++, Fortran, Python, and R applications in Serial Mode
    • Both batch and interactive modes of running serial applications are covered
    • Code and scripts used in the examples shown in the document are available from this GitHub repository
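    As a sketch of the batch mode covered here, a serial Python job could be submitted with a script along these lines (the module and script names are assumptions; verify available modules with `module avail`):

```shell
#!/bin/bash
#SBATCH --job-name=serial_py
#SBATCH --partition=compute1        # illustrative partition choice
#SBATCH --nodes=1
#SBATCH --ntasks=1                  # a serial job needs a single task
#SBATCH --time=00:10:00

module load python3                 # module name is an assumption
python3 my_script.py                # hypothetical serial Python program
```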

  9. Running Parallel Programs
    • Code and scripts used in the examples shown in the document are available from this GitHub repository
    • OpenMP, MPI, and CUDA examples are covered in this document
    • C, C++ and Fortran are the base languages used
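    For the MPI case, a batch script might follow this pattern (module, compiler wrapper availability, and source file name are assumptions for illustration):

```shell
#!/bin/bash
#SBATCH --job-name=mpi_hello
#SBATCH --partition=compute1        # illustrative partition choice
#SBATCH --nodes=2
#SBATCH --ntasks-per-node=40        # use all 40 cores on each node
#SBATCH --time=00:05:00

module load mpi                     # module name is an assumption
mpicc hello_mpi.c -o hello_mpi      # hypothetical MPI source file in C
mpirun -np "$SLURM_NTASKS" ./hello_mpi
```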

  10. Running Multiple Copies of Executables Concurrently from the Same Job
    • Running multiple executables concurrently from the same job is covered
    • Using GNU Parallel for running parameter-sweep applications is covered
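    The parameter-sweep pattern can be sketched as follows: GNU Parallel launches one copy of a program per parameter value, keeping a fixed number of tasks running at once within a single Slurm allocation (the module name and ./simulate program are hypothetical):

```shell
#!/bin/bash
#SBATCH --job-name=sweep
#SBATCH --partition=compute1        # illustrative partition choice
#SBATCH --nodes=1
#SBATCH --ntasks=40
#SBATCH --time=01:00:00

module load gnuparallel             # module name is an assumption
# Run ./simulate once for each of 100 parameter values,
# with at most 40 concurrent tasks (one per core).
seq 1 100 | parallel -j 40 ./simulate --param {}
```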

  11. Additional Python and R Usage Information

  12. Using Some of the Popular Software Packages that are Installed System-Wide

  13. Using Containers (Singularity and Docker) on Arc

  14. Visualization Using Paraview on Arc

  15. Setting Java Environment for Applications with Java Dependencies

  16. Application Checkpointing and Restart on Arc

  17. Checking Currently Installed Software on Arc
    • To check the list of the currently available software packages on Arc, please use the "module spider" or "module avail" command from a compute node
    • Details on using the module commands for managing the shell environment on Arc are available here
    • The list of software packages that are available on Arc as of August 23, 2021 can be found here
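    Typical module commands, run from a compute node, look like this (the python3 module name is illustrative; use whatever `module avail` reports on Arc):

```shell
module avail            # list packages visible in the current module path
module spider python    # search all installed versions of a package
module load python3     # load a package into the current shell environment
module list             # show the modules currently loaded
```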

  18. Technical Support
    • For technical support, you can submit a support request for Arc at the following link: https://support.utsa.edu/myportal. Instructions for submitting support requests can be found here.
    • The Research Computing Support Group is available from 8:00 AM to 5:00 PM on all business days to assist with service requests.
    • Our time-to-response on new tickets is 4 business hours, and the time-to-resolution varies depending upon the complexity of the issue.
      • Please open a new ticket for every new topic
      • Once a ticket is closed, you are welcome to reopen it if the exact issue addressed in the ticket remains unresolved
    • For after-hours emergency support, please contact Tech Cafe at 210-458-5555.
Topic attachments
  • running_parallel_programs_on_Arc.pdf - Running parallel programs on Arc
  • migrate-shamu2arc - Bash wrapper script for rsync to migrate user home and/or work data from Shamu to Arc
  • running_executables_and_gnu_parallel.pdf - Executables and GNU Parallel
  • Deep Learning Model on CIFAR10 dataset using PyTorch on GPU nodes.pdf - PyTorch on GPUs
  • Express_Linux_Tutorial-SizeOptimized.pdf - Quick Linux tutorial
  • running_c_cpp_fortran_python_r.pdf - Running C, C++, Fortran, Python, and R applications in serial mode
  • Running_Jobs_On_Arc.pdf - Running jobs on Arc
  • RUNNING MATLAB "Hello, World" Example on Remote Linux Systems (1).pdf - Sample MATLAB job
  • Installation and Working of Deep Learning Libraries (TensorFlow) on Remote Linux Systems (Stampede2 and Arc).pdf - TensorFlow
Topic revision: r39 - 11 Oct 2021, AdminUser