slurm_queues [2022/01/17 17:39] root |
slurm_queues [2024/01/22 18:07] pwolinsk |
=====Slurm Queues Pinnacle/Karpinski===== | =====Slurm Queues Pinnacle/Karpinski===== |
See [[ equipment | Selecting Resources ]] for help on choosing the best node/queue for your work. | See [[ equipment | Selecting Resources ]] for help on choosing the best node/queue for your work. |
| |
| Updates: |
| |
| <code> |
| tres288 queue added with 288 hour/12 day maximum |
| tres72 time limit changed to 288 hours, same as tres288, retained for existing scripts |
| csce-k2-72 queue added for new csce Pinnacle-2 nodes |
| </code> |
| |
| |
Pinnacle queues or ''slurm'' "partitions" are: | Pinnacle queues or ''slurm'' "partitions" are: |
cloud72,virtual machines/containers/single processor jobs, 72 hr, 32, 3,for non-intensive computing up to 4 cores | cloud72,virtual machines/containers/single processor jobs, 72 hr, 32, 3,for non-intensive computing up to 4 cores |
tres72, 64 GB nodes, 72hr, 32, 23, Trestles nodes with Pinnacle operating system | tres72, 64 GB nodes, 72hr, 32, 23, Trestles nodes with Pinnacle operating system |
tres06, 64 GB nodes, 6hr, 32, 23, Trestles nodes with Pinnacle operating system | tres288, 64 GB nodes, 288hr, 32, 23, Trestles nodes with Pinnacle operating system |
</csv> | </csv> |
| |
<csv> | <csv> |
karpinski partition,description,time limit,cores per node,number of nodes | karpinski partition,description,time limit,cores per node,number of nodes |
csce72,32 GB nodes, 72 hr,8, 18 | csce72,32 GB nodes, 72 hr,8, 18 |
| csce-k2-72, 256 GB nodes, 72 hr, 64, 6 |
cscloud72,virtual machines/containers/single processor jobs, 72 hr,8, 18 | cscloud72,virtual machines/containers/single processor jobs, 72 hr,8, 18 |
</csv> | </csv> |
<csv> | <csv> |
pinnacle partition,description,time limit,number of nodes,other | pinnacle partition,description,time limit,number of nodes,other |
condo,condo nodes, none,25, authorization required | condo,condo nodes, none,25, authorization and appropriate properties required |
pcon06,public use of condo nodes,6 hr, 25, | pcon06,public use of condo nodes,6 hr, 25, appropriate properties required |
</csv> | </csv> |
Condo nodes require specification of a sufficient set of slurm properties. Property choices available are: | Condo nodes require specification of a sufficient set of slurm properties. Property choices available are: |
| |
gpu or not: ''0gpu''/''1v100''/''2v100''\\ | **condo/pcon06 jobs running on the wrong nodes through lack of specified properties will be canceled without notice**\\ |
| **non-gpu jobs running on gpu nodes may be canceled without notice**\\ |
| |
| gpu or not: ''0gpu''/''1v100''/''2v100''/''1a100''/''4a100''\\ |
processor: ''i6130''/''a7351''/''i6128''\\ | processor: ''i6130''/''a7351''/''i6128''\\ |
equivalently: ''192gb''/''256gb''/''768gb''\\ | equivalently: ''192gb''/''256gb''/''768gb''\\ |
$ | $ |
</code> | </code> |
| |
| === Public Condo Queue - pcon06 === |
| |
| The condo nodes, which are reserved for priority access by the condo node owners, are also available for public use via the **''pcon06''** queue. There is a 6 hour walltime limit for **''pcon06''**, but it may be extended upon request if no condo owner jobs are waiting in the queue. The **''pcon06''** queue contains multiple types of nodes purchased by different departments at various times, so the hardware configuration varies from node to node. Each node in the queue has a set of features assigned to it which describe its hardware. To select appropriate nodes, pass the required features with the constraint (''-C'') parameter of the **''sbatch''** and **''srun''** commands. |
| |
| |
| The **''pcon06-info.sh''** script lists the currently idle nodes in the **''pcon06''** queue along with the constraint list for each node. |
| |
| |
| <code> |
| pinnacle-l5:pwolinsk:~$ pcon06-info.sh |
| Idle pcon06 nodes |
| |
| NodeName Constraint list |
| ============================ |
| c1302: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel |
| c1305: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel |
| c1306: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel |
| c1307: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel |
| c1308: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel |
| c1309: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel |
| c1310: fwang,0gpu,nvme,384gb,i6230,avx512,40c,intel |
| c1311: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1312: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1313: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1314: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1315: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1316: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1317: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1318: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1319: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1320: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1321: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1322: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1323: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1324: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1325: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1326: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1328: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1329: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1330: fwang,0gpu,nvme,192gb,i6130,avx512,32c,intel |
| c1432: aja,0gpu,256gb,a7543,avx2,64c,amd |
| c1618: jzhao77,0gpu,256gb,a7402,avx2,48c,amd |
| c1716: yongwang,1v100,192gb,i6230,avx512,40c,intel |
| c1719: mlbernha,0gpu,256gb,a7351,avx2,32c,amd |
| c1720: mlbernha,0gpu,256gb,a7351,avx2,32c,amd |
| c1913: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c1915: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c1916: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c1917: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c1918: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c1919: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c1920: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c2001: aimrc,4a100,1024gb,a7543,avx2,64c,amd |
| c2002: aimrc,4a100,1024gb,a7543,avx2,64c,amd |
| c2003: aimrc,4a100,1024gb,a7543,avx2,64c,amd |
| c2004: aimrc,4a100,1024gb,a7543,avx2,64c,amd |
| c2010: zhang,2a100,512gb,a7543,avx2,64c,amd |
| c2011: harris,0gpu,1024gb,a7543,avx2,64c,amd |
| c2101: csce,4a100,1024gb,a7543,avx2,64c,amd |
| c2102: csce,4a100,1024gb,a7543,avx2,64c,amd |
| c2103: csce,4a100,1024gb,a7543,avx2,64c,amd |
| c2104: csce,4a100,1024gb,a7543,avx2,64c,amd |
| c2105: harris,4a100,1024gb,a7543,avx2,64c,amd |
| c2112: kmbefus,0gpu,1024gb,a7543,avx2,64c,amd |
| c2113: fwang,0gpu,512gb,a7543,avx2,64c,amd |
| c2114: fwang,0gpu,512gb,a7543,avx2,64c,amd |
| c2115: fwang,0gpu,512gb,a7543,avx2,64c,amd |
| c2116: fwang,0gpu,512gb,a7543,avx2,64c,amd |
| c2118: jm217,1a100,1024gb,a7543,avx2,64c,amd |
| c2402: kwalters,1a40,1024gb,a7543,avx2,64c,amd |
| c2403: kwalters,1a40,1024gb,a7543,avx2,64c,amd |
| c2404: kwalters,0gpu,256gb,a7543,avx2,64c,amd |
| c2405: kwalters,0gpu,256gb,a7543,avx2,64c,amd |
| c2406: kwalters,0gpu,256gb,a7543,avx2,64c,amd |
| c2407: kwalters,0gpu,256gb,a7543,avx2,64c,amd |
| c2408: kwalters,0gpu,256gb,a7543,avx2,64c,amd |
| c2409: kmbefus,0gpu,1024gb,a7543,avx2,64c,amd |
| c2416: kwalters,0gpu,256gb,a7543,avx2,64c,amd |
| c2417: kwalters,0gpu,256gb,a7543,avx2,64c,amd |
| c2418: kwalters,0gpu,256gb,a7543,avx2,64c,amd |
| c2421: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c2422: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c2423: laurent,0gpu,256gb,a7543,avx2,64c,amd |
| c3101: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3103: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3104: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3107: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3108: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3109: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3110: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3111: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3114: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3115: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3116: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3118: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3119: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3120: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3121: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3122: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3123: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3124: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3125: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3126: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3127: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3128: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3129: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3130: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3131: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3132: pmillett,0gpu,64gb,i2650v2,avx,16c,intel |
| c3133: pmillett,4k80,128gb,i2650v2,avx,16c,intel |
| c3201: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3202: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3203: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3204: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3205: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3206: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3207: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3208: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3209: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3210: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3211: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3212: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3213: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3214: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3216: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3217: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3219: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3220: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3221: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3222: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3224: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3226: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| c3227: nair,0gpu,64gb,i2650v2,avx,16c,intel |
| |
| example submit commands: |
| |
| srun -p pcon06 -t 6:00:00 -n 16 -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' --pty /bin/bash |
| sbatch -p pcon06 -t 6:00:00 -n 16 -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' <slurm_script>.slurm
| |
| pinnacle-l5:pwolinsk:~$ srun -p pcon06 -t 6:00:00 -N 2 -n 16 -C 'nair&0gpu&64gb&i2650v2&avx&16c&intel' --pty /bin/bash |
| srun: job 338300 queued and waiting for resources |
| srun: job 338300 has been allocated resources |
| c3201:pwolinsk:~$ |
| </code> |
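The example submit commands in the listing above join a node's comma-separated features with ''&'', which slurm treats as a logical AND in a ''-C'' constraint. The following is a minimal sketch of that conversion. The helper name ''features_to_constraint'' is hypothetical and not part of ''pcon06-info.sh''; the feature list is copied from the ''c3201'' entry above. Whether every feature must be given, or only a sufficient subset, depends on which nodes the job should match.

```shell
#!/bin/bash
# Hypothetical helper (not part of pcon06-info.sh): convert a comma-separated
# feature list, as printed by pcon06-info.sh, into an AND-ed slurm constraint.
features_to_constraint() {
    printf '%s\n' "$1" | tr ',' '&'
}

# Feature list copied from the c3201 line of the listing above.
features='nair,0gpu,64gb,i2650v2,avx,16c,intel'
constraint=$(features_to_constraint "$features")

# Print the resulting command for review instead of running it.
echo "srun -p pcon06 -t 6:00:00 -n 16 -C '$constraint' --pty /bin/bash"
```

Printing the command rather than executing it allows the constraint to be checked against the node list before submitting.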
| |
| |
| |
| |
| |
| |