Torque Queues Trestles/Razor

These Torque scheduler queues are deprecated: the Torque scheduler will be replaced by Slurm, and most Razor 12-core nodes will be retired.

Razor general-use queues are:

^ queue ^ CPU ^ memory/node ^ max PBS spec ^ max PBS time ^ notes ^ Maui partitions ^
| debug12core | 2x Intel X5670 2.93 GHz | 24GB | nodes=2:ppn=12 | walltime=0:0:30:00 | dedicated | rz |
| tiny12core | 2x Intel X5670 2.93 GHz | 24GB | nodes=24:ppn=12 | walltime=0:06:00:00 | node pool shared | rt/rm |
| med12core | 2x Intel X5670 2.93 GHz | 24GB | nodes=24:ppn=12 | walltime=3:00:00:00 | node pool shared | rm |
| debug16core | 2x Intel E5-2670 2.6 GHz | 32GB | nodes=2:ppn=16 | walltime=0:0:30:00 | dedicated | yd |
| tiny16core | 2x Intel E5-2670 2.6 GHz | 32GB | nodes=18:ppn=16 | walltime=0:06:00:00 | node pool shared | yt/ym |
| med16core | 2x Intel E5-2670 2.6 GHz | 32GB | nodes=18:ppn=16 | walltime=3:00:00:00 | node pool shared | ym |
| onenode16core | 2x Intel E5-2670 2.6 GHz | 32GB | nodes=1:ppn=16 | walltime=72:00:00 | dedicated/one node max per job | yl |
| gpu16core | 2x Intel E5-2630V3 2.4 GHz / 2x K40c | 64GB | nodes=1:ppn=16 | walltime=3:00:00:00 | gpu jobs only | gu |
| mem512GB64core | 4x AMD 6276 2.3 GHz | 512GB | nodes=2:ppn=64 | walltime=3:00:00:00 | >64GB shared memory only | yc |
| mem768GB32core | 4x Intel E5-4640 2.4 GHz | 768GB | nodes=2:ppn=32 | walltime=3:00:00:00 | >512GB shared memory only | yb |
| nebula | nebula cloud | | | | | |
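As an illustration, a minimal Torque batch script targeting the tiny16core queue might look like the following sketch (the job name, script file name, and ./myprogram are placeholders; the -q and -l values are taken from the table above):

#!/bin/bash
#PBS -N myjob
#PBS -q tiny16core
#PBS -l nodes=2:ppn=16
#PBS -l walltime=6:00:00
#PBS -j oe
# run from the directory the job was submitted from
cd $PBS_O_WORKDIR
./myprogram

It would be submitted with qsub myjob.pbs; the requested nodes/ppn and walltime must stay within the queue limits shown in the table.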

The Maui partition codes in the table can be used with the shownodes command to estimate which nodes are immediately available, as shown below. For example, if you want to use the gpu16core queue, this shows that resources are available now:

$ shownodes -l -n | grep Idle | grep gu
compute0804 Idle    24:24 gu 0.00

This doesn't guarantee that a job will start immediately, since the scheduler may be assembling idle nodes for a large job, but it is a good indication. Similarly, for a queue such as tiny16core that can draw from multiple Maui partitions, there are 22 idle nodes at this time. The following script counts the Idle nodes in the major node categories:

$ /share/apps/bin/idle_razor_nodes.sh
12-core n30m=1 n06h=35 n72h=34
16-core n30m=3 n06h=22 n72h=10 onenode=16 graphics=1
bigmem 512m=2 768m=2
condo xqian=4 sbarr=2 aja=0 laur=3 itza=2
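The per-partition counting done by this script can be approximated by hand. As a rough sketch, assuming the partition code is the fourth column of shownodes output (as in the gpu16core example above):

$ shownodes -l -n | grep Idle | awk '{print $4}' | sort | uniq -c

This prints the number of idle nodes in each Maui partition, which can then be matched against the partition column of the queue tables.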

The production queues with the largest node counts for most usage are {tiny/med}{12core/16core}. About 16 Razor 16-core nodes that have difficulty with multi-node MPI jobs are placed in the onenode16core queue for single-node, 16-core jobs.

Trestles general-use queues are listed below (the first three are the production queues with the largest node counts):

^ queue ^ CPU ^ memory/node ^ max PBS spec ^ max PBS time ^ notes ^ Maui partitions ^
| q30m32c | 4x AMD 6136 2.4 GHz | 64GB | nodes=128:ppn=32 | walltime=0:0:30:00 | node pool shared | tu/ts/tl |
| q06h32c | 4x AMD 6136 2.4 GHz | 64GB | nodes=128:ppn=32 | walltime=0:06:00:00 | node pool shared | ts/tl |
| q72h32c | 4x AMD 6136 2.4 GHz | 64GB | nodes=64:ppn=32 | walltime=3:00:00:00 | node pool shared | tl |
| nebula | nebula cloud | | | | | |

A similar script counts idle Trestles nodes:

$ /share/apps/bin/idle_trestles_nodes.sh
n30m=2 n06h=0 n72h=0
condo laur=8 agri=2 doug=0 itza=1
condo mill=23 millgpu=2 nair=27 nairphi=2
$

The production queues with the largest node counts for most usage are q06h32c/q72h32c.
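For a quick test on Trestles, an interactive job can be requested against one of these queues. A sketch, with the queue name and limits taken from the table above:

$ qsub -I -q q06h32c -l nodes=1:ppn=32,walltime=1:00:00

qsub -I returns a shell on the allocated node once the job starts.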

“node pool shared” on tiny/med or q30m/q06h/q72h means that these queues allocate jobs from a common pool of identical nodes, with some nodes dedicated to the shorter queues.

For a complete listing of all defined queues and their properties on each cluster, please use the qstat -q command.
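For example, with the queue names listed above (qstat -Q -f prints the full attribute list for a single queue):

$ qstat -q
$ qstat -Q -f tiny16core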
