
Slurm Queues Pinnacle/Karpinski

See Selecting Resources for help on choosing the best node/queue for your work.
Updates:

tres288 queue added with 288 hour/12 day maximum
tres72 time limit changed to 288 hours, same as tres288, retained for existing scripts
csce-k2-72 queue added for new csce Pinnacle-2 nodes

Pinnacle queues or slurm “partitions” are:

pinnacle partition | description | time limit | cores per node | number of nodes | other
comp01 | 192 GB nodes | 1 hr | 32 | 48 | full node usage required
comp06 | 192 GB nodes | 6 hr | 32 | 44 | full node usage required
comp72 | 192 GB nodes | 72 hr | 32 | 40 | full node usage required
gpu06 | gpu nodes | 6 hr | 32 | 19 | gpu usage required/full node usage required
gpu72 | gpu nodes | 72 hr | 32 | 19 | gpu usage required/full node usage required
himem06 | 768 GB nodes | 6 hr | 24 | 6 | >192 GB memory usage required/full node usage required
himem72 | 768 GB nodes | 72 hr | 24 | 6 | >192 GB memory usage required/full node usage required
cloud72 | virtual machines/containers/single processor jobs | 72 hr | 32 | 3 | for non-intensive computing up to 4 cores
tres72 | 64 GB nodes | 72 hr | 32 | 23 | Trestles nodes with Pinnacle operating system
tres288 | 64 GB nodes | 288 hr | 32 | 23 | Trestles nodes with Pinnacle operating system
karpinski partition | description | time limit | cores per node | number of nodes
csce72 | 32 GB nodes | 72 hr | 8 | 18
csce-k2-72 | 256 GB nodes | 72 hr | 64 | 6
cscloud72 | virtual machines/containers/single processor jobs | 72 hr | 8 | 18
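
For reference, a minimal batch script requesting one of these partitions might look like the sketch below. It assumes a full comp72 node; the job name and program are placeholders for illustration, not site-provided examples.

#!/bin/bash
#SBATCH --job-name=example        # placeholder job name
#SBATCH --partition=comp72        # 192 GB standard nodes, 72 hr limit
#SBATCH --nodes=1                 # comp queues require full node usage
#SBATCH --ntasks-per-node=32      # all 32 cores of the node
#SBATCH --time=72:00:00           # must not exceed the partition time limit

# placeholder workload; replace with your own module loads and program
./my_program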

Condo queues are:

pinnacle partition | description | time limit | number of nodes | other
condo | condo nodes | none | 25 | authorization and appropriate properties required
pcon06 | public use of condo nodes | 6 hr | 25 | appropriate properties required

Condo nodes require specification of a sufficient set of slurm properties (constraints). Note:

condo/pcon06 jobs that land on the wrong nodes because the required properties were not specified will be canceled without notice
non-gpu jobs running on gpu nodes may be canceled without notice

The available property choices are:

gpu or not: 0gpu/1v100/2v100/1a100/4a100
processor: i6130/a7351/i6128
memory, equivalent to the processor choices in the same order: 192gb/256gb/768gb
cores, equivalent to the processor choices in the same order: 32c/32c/24c
local drive: nvme/no specification
research group: fwang, equivalent to 0gpu/i6130|i6230/768gb/32c|40c/nvme
research group: tkaman, equivalent to 2v100/i6130/192gb/32c
research group: aja, equivalent to 0gpu/i6128/192gb|768gb/24c

examples:
#SBATCH --constraint=2v100
#SBATCH --constraint=fwang
#SBATCH --constraint=768gb&0gpu
#SBATCH --constraint=256gb
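
Building on these constraint examples, a condo batch script might combine the partition and a property as in the sketch below; the fwang group, core count, and walltime are illustrative assumptions only.

#!/bin/bash
#SBATCH --partition=condo            # condo partition, authorization required
#SBATCH --constraint=fwang           # fwang group nodes (see property list above)
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32         # assumed core count for this node type
#SBATCH --time=24:00:00              # illustrative; the condo partition has no fixed limit

# placeholder workload; replace with your own program
./my_program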

A script is available to show idle nodes, as in the example below (in this case 2 nodes are idle in the 1-hour comp queue, none in the 6-hour or 72-hour comp queues, but nodes are available in the gpu, himem, csce, and csce cloud queues). Sufficient idle nodes in your queue of interest do not guarantee that your job will start immediately, but that is usually the case.

$ idle_pinnacle_nodes.sh
n01=2 n06=0 n72=0
g06=1 g72=1 h06=2 h72=2 c72=16 l72=16
condo aja=2 wang=0 mich=2 kama=0
$
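
If the script is not available on your login node, the standard Slurm sinfo command gives comparable per-partition information; for example, a minimal query for idle nodes in a few partitions:

sinfo -p comp72,gpu72,himem72 -t idle -o "%P %D %N"

Here %P is the partition, %D the number of matching nodes, and %N the node list.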

Public Condo Queue - pcon06

The condo nodes, which are reserved for priority access by the condo node owners, are also available for public use via the pcon06 queue. There is a 6 hour walltime limit for pcon06, but it may be extended upon request if there are no condo owner jobs waiting in the queue. The pcon06 queue contains multiple types of nodes purchased by different departments at various times, so the hardware configuration varies from node to node. Each node in the queue has a set of features assigned to it which describe its hardware. To select an appropriate node, pass a constraint (-C) option to the sbatch or srun command.

The pcon06-info.sh script lists the idle nodes in the pcon06 queue along with the list of constraints for each node.

pinnacle-l5:pwolinsk:~$ pcon06-info.sh 
  Idle pcon06 nodes 

  NodeName Constraint list
============================
    c1302: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1305: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1306: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1307: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1308: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1309: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1310: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1311: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1312: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1313: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1314: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1315: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1316: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1317: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1318: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1319: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1320: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1321: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1322: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1323: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1324: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1325: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1326: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1328: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1329: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1330: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1432: aja,0gpu,256gb,a7543,avx2,64c,amd
    c1618: jzhao77,0gpu,256gb,a7402,avx2,48c,amd
    c1716: yongwang,1v100,192gb,i6230,avx512,40c,intel
    c1719: mlbernha,0gpu,256gb,a7351,avx2,32c,amd
    c1720: mlbernha,0gpu,256gb,a7351,avx2,32c,amd
    c1913: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1915: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1916: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1917: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1918: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1919: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1920: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c2001: aimrc,4a100,1024gb,a7543,avx2,64c,amd
    c2002: aimrc,4a100,1024gb,a7543,avx2,64c,amd
    c2003: aimrc,4a100,1024gb,a7543,avx2,64c,amd
    c2004: aimrc,4a100,1024gb,a7543,avx2,64c,amd
    c2010: zhang,2a100,512gb,a7543,avx2,64c,amd
    c2011: harris,0gpu,1024gb,a7543,avx2,64c,amd
    c2101: csce,4a100,1024gb,a7543,avx2,64c,amd
    c2102: csce,4a100,1024gb,a7543,avx2,64c,amd
    c2103: csce,4a100,1024gb,a7543,avx2,64c,amd
    c2104: csce,4a100,1024gb,a7543,avx2,64c,amd
    c2105: harris,4a100,1024gb,a7543,avx2,64c,amd
    c2112: kmbefus,0gpu,1024gb,a7543,avx2,64c,amd
    c2113: fwang,0gpu,512gb,a7543,avx2,64c,amd
    c2114: fwang,0gpu,512gb,a7543,avx2,64c,amd
    c2115: fwang,0gpu,512gb,a7543,avx2,64c,amd
    c2116: fwang,0gpu,512gb,a7543,avx2,64c,amd
    c2118: jm217,1a100,1024gb,a7543,avx2,64c,amd
    c2402: kwalters,1a40,1024gb,a7543,avx2,64c,amd
    c2403: kwalters,1a40,1024gb,a7543,avx2,64c,amd
    c2404: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2405: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2406: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2407: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2408: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2409: kmbefus,0gpu,1024gb,a7543,avx2,64c,amd
    c2416: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2417: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2418: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2421: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c2422: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c2423: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c3101: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3103: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3104: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3107: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3108: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3109: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3110: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3111: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3114: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3115: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3116: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3118: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3119: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3120: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3121: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3122: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3123: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3124: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3125: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3126: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3127: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3128: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3129: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3130: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3131: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3132: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3133: pmillett,4k80,128gb,i2650v2,avx,16c,intel
    c3201: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3202: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3203: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3204: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3205: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3206: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3207: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3208: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3209: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3210: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3211: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3212: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3213: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3214: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3216: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3217: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3219: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3220: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3221: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3222: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3224: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3226: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3227: nair,0gpu,64gb,i2650v2,avx,16c,intel

example submit commands:

     srun   -p pcon06 -t 6:00:00 -n 16 -q comp -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' --pty /bin/bash
     sbatch -p pcon06 -t 6:00:00 -n 16 -q comp -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' <slurm_script>.slurm

pinnacle-l5:pwolinsk:~$ srun   -p pcon06 -t 6:00:00 -N 2 -n 16 -q comp -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' --pty /bin/bash
srun: job 338300 queued and waiting for resources
srun: job 338300 has been allocated resources
c3201:pwolinsk:~$ 
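
The same request can also be submitted non-interactively; a minimal batch-script equivalent of the srun command above is sketched below (the program name is a placeholder).

#!/bin/bash
#SBATCH --partition=pcon06
#SBATCH --qos=comp
#SBATCH --time=6:00:00
#SBATCH --nodes=2
#SBATCH --ntasks=16
#SBATCH --constraint=nair&0gpu&64gb&i2650v2&avx&16c&intel

# placeholder; replace with your own program
srun ./my_program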