GPU Jobs

ARCH offers several GPU-equipped partitions for compute-intensive and AI/ML workloads. This page lists each partition, the CPU-per-GPU billing ratios, access requirements, and submission examples.

Available GPU partitions 

Partition	GPUs / node	CPU cores billed per GPU	Typical use-case
`l40s`	8 × NVIDIA L40S (48 GB)	14	Large-memory image / data analytics
`a100`	8 × NVIDIA A100 (40 GB)	10	Mixed HPC + DL
`nvl`	4 × NVIDIA H100 (96 GB)	30	Highest-end training / inference
`h100`	4 × NVIDIA H100 (80 GB)	30	Same GPU model as nvl, with 80 GB instead of 96 GB of memory; kept separate for scheduling

The "CPU cores billed per GPU" column is the DefCpuPerGPU value reported by scontrol show partition; Slurm charges this many CPU cores for each GPU per elapsed hour.
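
As a rough illustration of how the ratio translates into a charge, the sketch below multiplies the number of GPUs, the per-GPU core count from the table above, and the elapsed hours. The function names `cores_per_gpu` and `billed_core_hours` are hypothetical helpers, not ARCH or Slurm tools, and treating the charge as a simple product of these three numbers is an assumption; scontrol show partition remains the authoritative source for the ratios.

```shell
# Hypothetical helper: billed cores per GPU, copied from the table above.
cores_per_gpu() {
    case "$1" in
        l40s)     echo 14 ;;
        a100)     echo 10 ;;
        nvl|h100) echo 30 ;;
        *)        echo "unknown partition: $1" >&2; return 1 ;;
    esac
}

# Hypothetical helper: billed core-hours = GPUs x cores per GPU x hours.
# Arguments: partition, number of GPUs, elapsed hours (whole numbers).
billed_core_hours() {
    ratio=$(cores_per_gpu "$1") || return 1
    echo $(( $2 * ratio * $3 ))
}

billed_core_hours a100 2 24   # 2 GPUs x 10 cores x 24 h, prints 480
```

Under these assumptions, a 24-hour job on two A100 GPUs would be billed 480 core-hours.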

GPU usage limits 

QoS limits are enforced cluster-wide. Most projects can have up to 18 GPUs in use simultaneously. The limit applies both per account and per user; whichever is reached first takes effect.

Submitting a GPU batch job 

Example – 2 × A100 GPUs for 24 h:

#!/bin/bash
#SBATCH --partition=a100
#SBATCH --qos=qos_gpu
#SBATCH --account=jsmith123_gpu
#SBATCH --gres=gpu:2
#SBATCH --cpus-per-task=20     # 10 cores / GPU × 2 (a100 ratio from the table)
#SBATCH --time=24:00:00

module load cuda/12.3
srun python train.py --epochs 90

Monitoring GPUs 

List GPU nodes with their GPUs (GRES), state, and memory:

sinfo -p l40s,a100,nvl,h100 -N -o "%N %G %T %m"

Per-job utilisation:

jobstats <jobid>

Troubleshooting 

QOSMaxGRESPerAccount → you have hit the GPU cap; wait or cancel other runs.
AssocGrpGRES → wrong account/QoS pair; check the --account and --qos values.
Resources → not enough free GPUs yet; request fewer GPUs or a shorter wall-time so the job can back-fill.
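
The reasons above appear in the REASON column of squeue for pending jobs. The sketch below shows one way to turn a reason into the matching advice; `explain_reason` is a hypothetical helper written for this page, not an ARCH or Slurm command, and it only knows the three reasons listed here.

```shell
# Hypothetical helper: map a Slurm pending reason to the advice listed above.
explain_reason() {
    case "$1" in
        QOSMaxGRESPerAccount) echo "GPU cap reached: wait or cancel other runs" ;;
        AssocGrpGRES)         echo "check the account/QoS pair" ;;
        Resources)            echo "request fewer GPUs or a shorter wall-time" ;;
        *)                    echo "no advice recorded for: $1" ;;
    esac
}

# Example with a literal reason. On the cluster, the job ID and reason
# of your own jobs can be listed with:
#   squeue -u "$USER" -h -o "%i %r"
explain_reason Resources
```

Any reason outside the three listed falls through to the default branch, which is a cue to open a ticket rather than guess.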

Need help? Open a ticket or e-mail help@arch.jhu.edu.
