=====Slurm Queues Pinnacle/Karpinski=====
See [[ equipment | Selecting Resources ]] for help on choosing the best node/queue for your work.

Updates:

<code>
tres288 queue added with 288 hour/12 day maximum
tres72 time limit changed to 288 hours, same as tres288, retained for existing scripts
csce-k2-72 queue added for new csce Pinnacle-2 nodes
</code>
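For example, a minimal batch script taking advantage of the new 288 hour limit might start like this (a sketch; the job name and program are placeholders):

<code>
#!/bin/bash
#SBATCH --job-name=long_run       # placeholder name
#SBATCH --partition=tres288       # the new 288 hr / 12 day queue
#SBATCH --time=288:00:00          # request up to the full 12 day limit
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32      # tres nodes have 32 cores

./my_program                      # placeholder for your executable
</code>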
  
Pinnacle queues or ''slurm'' "partitions" are:

<csv>
pinnacle partition,description,time limit,cores per node,number of nodes,other
cloud72,virtual machines/containers/single processor jobs, 72 hr, 32, 3,for non-intensive computing up to 4 cores
tres72, 64 GB nodes, 72hr, 32, 23, Trestles nodes with Pinnacle operating system
tres288, 64 GB nodes, 288hr, 32, 23, Trestles nodes with Pinnacle operating system
</csv>

<csv>
karpinski partition,description,time limit,cores per node,number of nodes
csce72,32 GB nodes, 72 hr,8, 18
csce-k2-72, 256 GB nodes, 72 hr, 64, 6
cscloud72,virtual machines/containers/single processor jobs, 72 hr,8, 18
</csv>
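A minimal sketch of an interactive session on one of the new ''csce-k2-72'' nodes, following the ''srun'' pattern used in the examples below (any qos flag required for your account is omitted here):

<code>
srun -p csce-k2-72 -t 72:00:00 -N 1 -n 64 --pty /bin/bash
</code>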

<csv>
pinnacle partition,description,time limit,number of nodes,other
condo,condo nodes, none,25, authorization and appropriate properties required
pcon06,public use of condo nodes,6 hr, 25, appropriate properties required
</csv>
Condo nodes require specification of a sufficient set of slurm properties. Property choices available are:
  
**condo/pcon06 jobs running on the wrong nodes through lack of specified properties will be canceled without notice**\\
**non-gpu jobs running on gpu nodes may be canceled without notice**\\

gpu or not: ''0gpu''/''1v100''/''2v100''/''1a100''/''4a100''\\
processor: ''i6130''/''a7351''/''i6128''\\
equivalently: ''192gb''/''256gb''/''768gb''\\
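For example, a condo batch job could select the 192 GB Intel 6130 non-gpu nodes by joining properties with ''&'' in the constraint flag, the same syntax used in the ''pcon06'' examples below (a sketch; ''condo'' authorization is assumed and any required qos flag is omitted):

<code>
#!/bin/bash
#SBATCH --partition=condo
#SBATCH --constraint='0gpu&192gb&i6130'   # a sufficient set of properties
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32

./my_program                              # placeholder for your executable
</code>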

=== Public Condo Queue - pcon06 ===

The condo nodes, which are reserved for priority access by the condo node owners, are also available for public use via the **''pcon06''** queue.  There is a 6 hour walltime limit for **''pcon06''**, but it may be extended upon request if there are no condo owner jobs waiting in the queue.  The **''pcon06''** queue contains a collection of node types purchased by different departments at various times, so the hardware configuration varies from node to node.  Each node in the queue has a set of features assigned to it which describe its hardware.  To select the appropriate node, slurm uses the **constraint** (''-C'') option of the **''sbatch''** and **''srun''** commands.

The **''pcon06-info.sh''** script lists the idle nodes in the **''pcon06''** queue along with the constraints available on each node:
<code>
pinnacle-l5:pwolinsk:~$ pcon06-info.sh 
  Idle pcon06 nodes 

  NodeName Constraint list
============================
    c1302: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1305: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1306: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1307: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1308: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1309: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1310: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel
    c1311: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1312: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1313: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1314: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1315: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1316: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1317: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1318: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1319: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1320: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1321: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1322: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1323: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1324: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1325: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1326: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1328: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1329: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1330: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel
    c1432: aja,0gpu,256gb,a7543,avx2,64c,amd
    c1618: jzhao77,0gpu,256gb,a7402,avx2,48c,amd
    c1716: yongwang,1v100,192gb,i6230,avx512,40c,intel
    c1719: mlbernha,0gpu,256gb,a7351,avx2,32c,amd
    c1720: mlbernha,0gpu,256gb,a7351,avx2,32c,amd
    c1913: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1915: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1916: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1917: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1918: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1919: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c1920: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c2001: aimrc,4a100,1024gb,a7543,avx2,64c,amd
    c2002: aimrc,4a100,1024gb,a7543,avx2,64c,amd
    c2003: aimrc,4a100,1024gb,a7543,avx2,64c,amd
    c2004: aimrc,4a100,1024gb,a7543,avx2,64c,amd
    c2010: zhang,2a100,512gb,a7543,avx2,64c,amd
    c2011: harris,0gpu,1024gb,a7543,avx2,64c,amd
    c2101: csce,4a100,1024gb,a7543,avx2,64c,amd
    c2102: csce,4a100,1024gb,a7543,avx2,64c,amd
    c2103: csce,4a100,1024gb,a7543,avx2,64c,amd
    c2104: csce,4a100,1024gb,a7543,avx2,64c,amd
    c2105: harris,4a100,1024gb,a7543,avx2,64c,amd
    c2112: kmbefus,0gpu,1024gb,a7543,avx2,64c,amd
    c2113: fwang,0gpu,512gb,a7543,avx2,64c,amd
    c2114: fwang,0gpu,512gb,a7543,avx2,64c,amd
    c2115: fwang,0gpu,512gb,a7543,avx2,64c,amd
    c2116: fwang,0gpu,512gb,a7543,avx2,64c,amd
    c2118: jm217,1a100,1024gb,a7543,avx2,64c,amd
    c2402: kwalters,1a40,1024gb,a7543,avx2,64c,amd
    c2403: kwalters,1a40,1024gb,a7543,avx2,64c,amd
    c2404: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2405: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2406: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2407: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2408: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2409: kmbefus,0gpu,1024gb,a7543,avx2,64c,amd
    c2416: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2417: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2418: kwalters,0gpu,256gb,a7543,avx2,64c,amd
    c2421: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c2422: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c2423: laurent,0gpu,256gb,a7543,avx2,64c,amd
    c3101: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3103: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3104: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3107: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3108: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3109: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3110: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3111: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3114: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3115: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3116: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3118: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3119: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3120: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3121: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3122: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3123: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3124: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3125: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3126: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3127: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3128: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3129: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3130: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3131: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3132: pmillett,0gpu,64gb,i2650v2,avx,16c,intel
    c3133: pmillett,4k80,128gb,i2650v2,avx,16c,intel
    c3201: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3202: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3203: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3204: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3205: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3206: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3207: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3208: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3209: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3210: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3211: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3212: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3213: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3214: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3216: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3217: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3219: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3220: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3221: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3222: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3224: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3226: nair,0gpu,64gb,i2650v2,avx,16c,intel
    c3227: nair,0gpu,64gb,i2650v2,avx,16c,intel

example submit commands:

     srun   -p pcon06 -t 6:00:00 -n 16 -q comp -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' --pty /bin/bash
     sbatch -p pcon06 -t 6:00:00 -n 16 -q comp -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' <slurm_script>.slurm

pinnacle-l5:pwolinsk:~$ srun   -p pcon06 -t 6:00:00 -N 2 -n 16 -q comp -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' --pty /bin/bash
srun: job 338300 queued and waiting for resources
srun: job 338300 has been allocated resources
c3201:pwolinsk:~$ 
</code>
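
The feature list can also be pulled directly from ''slurm'' with the standard ''sinfo'' command, which prints each idle ''pcon06'' node alongside the features it advertises:

<code>
sinfo -N -p pcon06 -t idle -o "%N %f"   # node name and feature (constraint) list
</code>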