=== Equipment/Selecting Resources/Slurm Parameters ===
Six Slurm parameters (seven for the ''condo'' and ''pcon06'' partitions, which also require ''constraint'') must be specified to pick a computational resource and run a job. Additional Slurm parameters are optional.
{| class="wikitable"
! Parameter !! Description
|-
| partition || A group of similar compute nodes (except ''condo'' and ''pcon06'')
|-
| time || The wall-clock run-time limit for the job. Leave some headroom, as the job is killed when it reaches the limit. Partitions ending in 72, 06, and 01 have limits of 72, 6, and 1 hours respectively.
|-
| nodes || The number of nodes to allocate: 1 unless your program uses MPI
|-
| tasks-per-node || The number of processes to allocate per node: 1 unless your program uses MPI
|-
| cpus-per-task || The number of hardware threads to allocate per process
|-
| constraint || Sub-partition selection for the ''condo'' and ''pcon06'' partitions
|-
| qos || Quality-of-service class corresponding to the partition: generally the partition name without the time suffix (''comp'', ''gpu'', ''tres'', ''cloud'', ''himem'', ''condo'')
|}
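These parameters map directly onto ''#SBATCH'' directives in a job script. A minimal sketch, assuming a hypothetical serial executable ''./my_program'' and the ''comp06'' partition (the times and values are illustrative, not prescriptive; ''constraint'' is omitted because it applies only to ''condo'' and ''pcon06''):

```shell
#!/bin/bash
#SBATCH --partition=comp06       # 6-hour shared full-node partition
#SBATCH --qos=comp               # qos = partition name without the time suffix
#SBATCH --time=05:30:00          # leave headroom under the 6-hour limit
#SBATCH --nodes=1                # 1 unless the program uses MPI
#SBATCH --ntasks-per-node=1      # 1 unless the program uses MPI
#SBATCH --cpus-per-task=32       # hardware threads per process; 1 x 32 = 32 fills the node

# ./my_program is a placeholder for your executable
./my_program
```

Submit with ''sbatch scriptname''; note that the Slurm command-line flag for tasks-per-node is spelled ''--ntasks-per-node''.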
The partitions are:

{| class="wikitable"
! Partition !! qos !! Max hours !! Max nodes !! Max tasks !! Max cpus/task !! Num nodes !! Node type !! GPUs !! Description
|-
| cloud72 || cloud || 72 || 1 || 1 || 2 || 3 || Xeon 6130/32c/192GB || 0 || shared queue for test jobs and low-effort tasks such as compiling and editing; usually available for interactive use
|-
| comp72 || comp || 72 || n/a || 32* || 32* || 45 || Dual Xeon 6130/32c/192GB || 0 || shared queue for long-term full-node computation
|-
| comp06 || comp || 6 || n/a || 32* || 32* || 45 || Dual Xeon 6130/32c/192GB || 0 || same as comp72 except 6-hour limit and higher priority
|-
| comp01 || comp || 1 || n/a || 32* || 32* || 47 || Dual Xeon 6130/32c/192GB || 0 || same as comp06 except 1-hour limit and higher priority
|-
| tres72 || tres || 288 || n/a || 32* || 32* || 111 || Dual AMD 6136/32c/64GB || 0 || shared queue for very long-term full-node computation; uncrowded but slower nodes
|-
| tres288 || tres || 288 || n/a || 32* || 32* || 111 || Dual AMD 6136/32c/64GB || 0 || synonym for tres72 with expanded 288-hour limit
|-
| gpu72 || gpu || 72 || n/a || 32* || 32* || 19 || Dual Xeon 6130/32c/192GB || 1 V100 || shared queue for long-term full-node single-GPU computation
|-
| gpu06 || gpu || 6 || n/a || 32* || 32* || 19 || Dual Xeon 6130/32c/192GB || 1 V100 || same as gpu72 except 6-hour limit and higher priority
|-
| himem72 || himem || 72 || n/a || 24* || 24* || 6 || Dual Xeon 6128/24c/768GB || 0 || shared queue for long-term full-node high-memory computation; same as comp72 except 24 cores and 768 GB
|-
| himem06 || himem || 6 || n/a || 24* || 24* || 6 || Dual Xeon 6128/24c/768GB || 0 || same as himem72 except 6-hour limit and higher priority
|-
| acomp06 || comp || 6 || n/a || 64* || 64* || 1 || Dual AMD 7543/64c/1024GB || 0 || shared queue for medium-term full-node 64-core computation
|-
| agpu72 || gpu || 72 || n/a || 64* || 64* || 16 || Dual AMD 7543/64c/1024GB || 1 A100 || shared queue for medium-term full-node 64-core single-GPU computation
|-
| agpu06 || gpu || 6 || n/a || 64* || 64* || 18 || Dual AMD 7543/64c/1024GB || 1 A100 || same as agpu72 except 6-hour limit and higher priority
|-
| qgpu72 || gpu || 72 || n/a || 64* || 64* || 4 || Dual AMD 7543/64c/1024GB || 4 A100 || shared queue for medium-term full-node 64-core quad-GPU computation
|-
| qgpu06 || gpu || 6 || n/a || 64* || 64* || 4 || Dual AMD 7543/64c/1024GB || 4 A100 || same as qgpu72 except 6-hour limit and higher priority
|-
| csce72 || csce || 72 || n/a || 64* || 64* || 14 || Dual AMD 7543/64c/1024GB || 0 || for CSCE users: long-term full-node 64-core computation
|-
| condo || condo || n/a || n/a || n/a || n/a || 190 || n/a || n/a || contact HPCC for correct usage
|-
| pcon06 || comp || 6 || 1 || n/a || n/a || 190 || n/a || n/a || contact HPCC for correct usage
|}
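The ''cloud72'' row above is the usual choice for interactive work. A hedged sketch of an interactive request that stays within cloud72's per-job caps (1 node, 1 task, 2 CPUs; the 1-hour time value is illustrative):

```shell
# Request an interactive shell on cloud72 within its 1-node/1-task/2-cpu limits
srun --partition=cloud72 --qos=cloud --time=01:00:00 \
     --nodes=1 --ntasks-per-node=1 --cpus-per-task=2 \
     --pty /bin/bash
```

When the allocation is granted, the shell runs on a cloud72 node; exit the shell to release it.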
==Comments/Rules==
1. Each set of ---01, ---06, and ---72 partitions is overlaid on the same nodes
2. 32*: the product of tasks-per-node and cpus-per-task should be 32 to allocate an entire node
3. 64*: the product of tasks-per-node and cpus-per-task should be 64 to allocate an entire node
4. Single-CPU jobs should go to the ''cloud72'' partition
5. Because of the expense of GPUs and large memory, the ''gpu'' partitions are reserved for jobs that use the GPU, and the ''himem'' partitions for jobs that use more than 180 GB of memory
6. qos limits the total number of jobs per user and per group for each class of partitions
7. ''comp'', ''gpu'', and ''himem'' nodes are reserved for full-node jobs, with a few specific exceptions for jobs larger than one core but smaller than a full node; contact hpc-support@listserv.uark.edu
8. For ''condo'' or ''pcon06'' and the ''constraint'' parameter, contact hpc-support@listserv.uark.edu
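Rules 2 and 3 are simple arithmetic that can be checked before submitting. A minimal sketch for a 32-core node, using illustrative values for a hybrid MPI/OpenMP layout:

```shell
# Verify that a proposed layout fills a 32-core node (rule 2)
tasks_per_node=8        # MPI ranks per node (illustrative value)
cpus_per_task=4         # threads per rank (illustrative value)
cores=$((tasks_per_node * cpus_per_task))
if [ "$cores" -eq 32 ]; then
    echo "full node: $cores cores"    # prints "full node: 32 cores"
else
    echo "partial node: $cores cores -- adjust before submitting"
fi
```

The same check applies to the 64-core partitions with a target of 64 instead of 32.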