Equipment/Selecting Resources/Slurm Parameters

Six or seven Slurm parameters must be specified to select a computational resource and run a job; the constraint parameter is needed only for the condo and pcon06 partitions. Additional Slurm parameters are optional.

Parameter | Description
partition | A group of similar compute nodes (except condo and pcon06)
time | The wall-clock run time limit for the job. Leave some extra margin, since the job will be killed when it reaches the limit. The number at the end of each partition name (72, 06, 01, 288) is its maximum run time in hours (see the table below).
nodes | The number of nodes to allocate. 1 unless your program uses MPI.
tasks-per-node | The number of processes per node to allocate. 1 unless your program uses MPI.
cpus-per-task | The number of hardware threads to allocate per process.
constraint | Sub-partition selection for the condo and pcon06 partitions.
qos | The quality of service corresponding to the partition (see the table below).
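
For example, a minimal job script for a small serial test job might set these parameters as in the sketch below. The qos name follows the partition table in the next section, and the executable name is a placeholder.

  #!/bin/bash
  #SBATCH --partition=cloud72      # single-CPU and test jobs belong on cloud72
  #SBATCH --qos=cloud              # qos corresponding to the partition
  #SBATCH --time=01:00:00          # wall-clock limit; leave some margin
  #SBATCH --nodes=1                # 1 unless the program uses MPI
  #SBATCH --ntasks-per-node=1      # Slurm spelling of tasks-per-node; 1 unless using MPI
  #SBATCH --cpus-per-task=1        # hardware threads per process
  ./my_program                     # placeholder executable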

The available partitions are:

Partition | qos | Max hours | Max nodes | Max tasks | Max cpu/task | Num nodes | Node type | GPUs | Description
cloud72 | cloud | 72 | 1 | 1 | 2 | 3 | Xeon 6130/32c/192GB | 0 | shared queue for test jobs and low-effort tasks such as compiling and editing
comp72 | comp | 72 | n/a | 32* | 32* | 45 | Dual Xeon 6130/32c/192GB | 0 | shared queue for long-term full-node computation
comp06 | comp | 6 | n/a | 32* | 32* | 45 | Dual Xeon 6130/32c/192GB | 0 | same as comp72 except 6-hour limit and higher priority
comp01 | comp | 1 | n/a | 32* | 32* | 47 | Dual Xeon 6130/32c/192GB | 0 | same as comp06 except 1-hour limit and higher priority
tres72 | tres | 288 | n/a | 32* | 32* | 111 | Dual AMD 6136/32c/64GB | 0 | shared queue for very long-term full-node computation
tres288 | tres | 288 | n/a | 32* | 32* | 111 | Dual AMD 6136/32c/64GB | 0 | synonym for tres72 with expanded 288-hour limit
gpu72 | gpu | 72 | n/a | 32* | 32* | 19 | Dual Xeon 6130/32c/192GB | 1 V100 | shared queue for long-term full-node single-GPU computation
gpu06 | gpu | 6 | n/a | 32* | 32* | 19 | Dual Xeon 6130/32c/192GB | 1 V100 | same as gpu72 except 6-hour limit and higher priority
himem72 | himem | 72 | n/a | 24* | 24* | 6 | Dual Xeon 6128/24c/768GB | 0 | shared queue for long-term full-node high-memory computation
himem06 | himem | 6 | n/a | 24* | 24* | 6 | Dual Xeon 6128/24c/768GB | 0 | same as himem72 except 6-hour limit and higher priority
acomp06 | comp | 6 | n/a | 64* | 64* | 1 | Dual AMD 7543/64c/1024GB | 0 | shared queue for medium-term full-node 64-core computation
agpu72 | gpu | 72 | n/a | 64* | 64* | 16 | Dual AMD 7543/64c/1024GB | 1 A100 | shared queue for medium-term full-node 64-core single-GPU computation
agpu06 | gpu | 6 | n/a | 64* | 64* | 18 | Dual AMD 7543/64c/1024GB | 1 A100 | same as agpu72 except 6-hour limit and higher priority
qgpu72 | gpu | 72 | n/a | 64* | 64* | 4 | Dual AMD 7543/64c/1024GB | 4 A100 | shared queue for medium-term full-node 64-core quad-GPU computation
qgpu06 | gpu | 6 | n/a | 64* | 64* | 4 | Dual AMD 7543/64c/1024GB | 4 A100 | same as qgpu72 except 6-hour limit and higher priority
csce72 | csce | 72 | n/a | 64* | 64* | 14 | Dual AMD 7543/64c/1024GB | 0 | for CSCE users
condo | condo | n/a | n/a | n/a | n/a | 190 | n/a | n/a | n/a
pcon06 | comp | 6 | 1 | n/a | n/a | 190 | n/a | n/a | n/a
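
The limits above can be cross-checked on the cluster with standard Slurm commands; the sketch below is generic Slurm usage (the partition name is just an example), not a site-specific recipe.

  # partition name, time limit, node count, CPUs per node, memory per node
  sinfo -p comp72 -o "%P %l %D %c %m"
  # list the qos values associated with your account
  sacctmgr show associations user=$USER format=account,user,partition,qos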
Comments/Rules
  1. Each set of -01, -06, and -72 partitions is overlaid on the same nodes.
  2. 32*: the product of tasks-per-node and cpus-per-task should be 32 to allocate an entire node (see the example script after this list).
  3. 64*: the product of tasks-per-node and cpus-per-task should be 64 to allocate an entire node.
  4. Single-CPU jobs should go to the cloud72 partition.
  5. Because of the expense of GPUs and large memory, the gpu partitions are reserved for jobs that use the GPU, and the himem partitions are reserved for jobs that use more than 180 GB of memory.
  6. qos limits the total number of jobs per user and per group for each class of partitions
  7. comp, gpu, and himem nodes are reserved for full-node jobs, with a few specific exceptions for jobs larger than one core but smaller than a full node. Contact hpc-support@listserv.uark.edu.
  8. To use condo or pcon06 and choose a constraint, contact hpc-support@listserv.uark.edu.
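
As an illustration of rules 2 and 7, the sketch below requests two full 32-core comp72 nodes for an MPI program. The executable name is a placeholder and the requested time is an arbitrary value under the 72-hour limit.

  #!/bin/bash
  #SBATCH --partition=comp72       # 32-core Xeon nodes, 72-hour limit
  #SBATCH --qos=comp               # qos corresponding to the comp partitions
  #SBATCH --time=48:00:00          # leave margin below the 72-hour maximum
  #SBATCH --nodes=2                # MPI job spanning two full nodes
  #SBATCH --ntasks-per-node=32     # 32 tasks x 1 cpu each = 32 cores (rule 2)
  #SBATCH --cpus-per-task=1
  # hybrid MPI/OpenMP alternative: --ntasks-per-node=8 --cpus-per-task=4 (8 x 4 = 32)
  srun ./my_mpi_program            # placeholder MPI executable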