Equipment/Selecting Resources/Slurm Parameters

Six or seven Slurm parameters must be specified to select a computational resource and run a job; the constraint parameter is needed only for the condo and pcon06 partitions. Additional Slurm parameters are optional.

Parameter | Description
partition | A group of similar compute nodes (except condo and pcon06)
time | The wall-clock run time limit for the job. Leave some extra margin, since the job will be killed when it reaches the limit. The number at the end of each partition name (72, 06, 01, 288) is its maximum run time in hours (see the table below).
nodes | The number of nodes to allocate. 1 unless your program uses MPI.
tasks-per-node | The number of processes per node to allocate. 1 unless your program uses MPI.
cpus-per-task | The number of hardware threads to allocate per process.
constraint | Sub-partition selection for the condo and pcon06 partitions.
qos | The quality of service corresponding to the partition (see the table below).
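
For example, a minimal job script for a small serial test job might set these parameters as in the sketch below. The qos name follows the partition table in the next section, and the executable name is a placeholder.

  #!/bin/bash
  #SBATCH --partition=cloud72      # single-CPU and test jobs belong on cloud72
  #SBATCH --qos=cloud              # qos corresponding to the partition
  #SBATCH --time=01:00:00          # wall-clock limit; leave some margin
  #SBATCH --nodes=1                # 1 unless the program uses MPI
  #SBATCH --ntasks-per-node=1      # Slurm spelling of tasks-per-node; 1 unless using MPI
  #SBATCH --cpus-per-task=1        # hardware threads per process
  ./my_program                     # placeholder executable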

The available partitions are:

Partition | qos | Max hours | Max nodes | Max tasks | Max cpu/task | Num nodes | Node type | GPUs | Description
cloud72 | cloud | 72 | 1 | 1 | 2 | 3 | Xeon 6130/32c/192GB | 0 | shared queue for test jobs and low-effort tasks such as compiling and editing
comp72 | comp | 72 | n/a | 32* | 32* | 45 | Dual Xeon 6130/32c/192GB | 0 | shared queue for long-term full-node computation
comp06 | comp | 6 | n/a | 32* | 32* | 45 | Dual Xeon 6130/32c/192GB | 0 | same as comp72 except 6-hour limit and higher priority
comp01 | comp | 1 | n/a | 32* | 32* | 47 | Dual Xeon 6130/32c/192GB | 0 | same as comp06 except 1-hour limit and higher priority
tres72 | tres | 288 | n/a | 32* | 32* | 111 | Dual AMD 6136/32c/64GB | 0 | shared queue for very long-term full-node computation
tres288 | tres | 288 | n/a | 32* | 32* | 111 | Dual AMD 6136/32c/64GB | 0 | synonym for tres72 with expanded 288-hour limit
gpu72 | gpu | 72 | n/a | 32* | 32* | 19 | Dual Xeon 6130/32c/192GB | 1 V100 | shared queue for long-term full-node single-GPU computation
gpu06 | gpu | 6 | n/a | 32* | 32* | 19 | Dual Xeon 6130/32c/192GB | 1 V100 | same as gpu72 except 6-hour limit and higher priority
himem72 | himem | 72 | n/a | 24* | 24* | 6 | Dual Xeon 6128/24c/768GB | 0 | shared queue for long-term full-node high-memory computation
himem06 | himem | 6 | n/a | 24* | 24* | 6 | Dual Xeon 6128/24c/768GB | 0 | same as himem72 except 6-hour limit and higher priority
acomp06 | comp | 6 | n/a | 64* | 64* | 1 | Dual AMD 7543/64c/1024GB | 0 | shared queue for medium-term full-node 64-core computation
agpu72 | gpu | 72 | n/a | 64* | 64* | 16 | Dual AMD 7543/64c/1024GB | 1 A100 | shared queue for medium-term full-node 64-core single-GPU computation
agpu06 | gpu | 6 | n/a | 64* | 64* | 18 | Dual AMD 7543/64c/1024GB | 1 A100 | same as agpu72 except 6-hour limit and higher priority
qgpu72 | gpu | 72 | n/a | 64* | 64* | 4 | Dual AMD 7543/64c/1024GB | 4 A100 | shared queue for medium-term full-node 64-core quad-GPU computation
qgpu06 | gpu | 6 | n/a | 64* | 64* | 4 | Dual AMD 7543/64c/1024GB | 4 A100 | same as qgpu72 except 6-hour limit and higher priority
csce72 | csce | 72 | n/a | 64* | 64* | 14 | Dual AMD 7543/64c/1024GB | 0 | for CSCE users
condo | condo | n/a | n/a | n/a | n/a | 190 | n/a | n/a | n/a
pcon06 | comp | 6 | 1 | n/a | n/a | 190 | n/a | n/a | n/a
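
The limits above can be cross-checked on the cluster with standard Slurm commands; the sketch below is generic Slurm usage (the partition name is just an example), not a site-specific recipe.

  # partition name, time limit, node count, CPUs per node, memory per node
  sinfo -p comp72 -o "%P %l %D %c %m"
  # list the qos values associated with your account
  sacctmgr show associations user=$USER format=account,user,partition,qos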
Comments/Rules
  1. Each set of -01, -06, and -72 partitions is overlaid on the same nodes.
  2. 32*: the product of tasks-per-node and cpus-per-task should be 32 to allocate an entire node (see the example script after this list).
  3. 64*: the product of tasks-per-node and cpus-per-task should be 64 to allocate an entire node.
  4. Single-CPU jobs should go to the cloud72 partition.
  5. Because of the expense of GPUs and large memory, the gpu partitions are reserved for jobs that use the GPU, and the himem partitions are reserved for jobs that use more than 180 GB of memory.
  6. qos limits the total number of jobs per user and per group for each class of partitions
  7. comp, gpu, and himem nodes are reserved for full-node jobs, with a few specific exceptions for jobs larger than one core but smaller than a full node. Contact hpc-support@listserv.uark.edu.
  8. To use condo or pcon06 and choose a constraint, contact hpc-support@listserv.uark.edu.
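
As an illustration of rules 2 and 7, the sketch below requests two full 32-core comp72 nodes for an MPI program. The executable name is a placeholder and the requested time is an arbitrary value under the 72-hour limit.

  #!/bin/bash
  #SBATCH --partition=comp72       # 32-core Xeon nodes, 72-hour limit
  #SBATCH --qos=comp               # qos corresponding to the comp partitions
  #SBATCH --time=48:00:00          # leave margin below the 72-hour maximum
  #SBATCH --nodes=2                # MPI job spanning two full nodes
  #SBATCH --ntasks-per-node=32     # 32 tasks x 1 cpu each = 32 cores (rule 2)
  #SBATCH --cpus-per-task=1
  # hybrid MPI/OpenMP alternative: --ntasks-per-node=8 --cpus-per-task=4 (8 x 4 = 32)
  srun ./my_mpi_program            # placeholder MPI executable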