====Torque Queues Trestles/Razor====
See [[ equipment | Selecting Resources ]] for help on choosing the best node/queue for your work.

These ''torque'' scheduler queues (a different set for each of the two ''torque'' instances) are deprecated, as the ''torque'' scheduler will be replaced by ''slurm'' and most Razor 12-core nodes will be retired.

**Razor** general use queues are listed below (a sample submission script follows the table).
<csv>
queue,CPU,memory/node,max PBS spec,max PBS time,notes,Maui partitions
debug12core,2x Intel X5670 2.93 GHz,24GB,nodes=2:ppn=12,walltime=0:0:30:00,dedicated,rz
tiny12core,2x Intel X5670 2.93 GHz,24GB,nodes=24:ppn=12,walltime=0:06:00:00,node pool shared,rt/rm
med12core,2x Intel X5670 2.93 GHz,24GB,nodes=24:ppn=12,walltime=3:00:00:00,node pool shared,rm
debug16core,2x Intel E5-2670 2.6 GHz,32GB,nodes=2:ppn=16,walltime=0:0:30:00,dedicated,yd
tiny16core,2x Intel E5-2670 2.6 GHz,32GB,nodes=18:ppn=16,walltime=0:06:00:00,node pool shared,yt/ym
med16core,2x Intel E5-2670 2.6 GHz,32GB,nodes=18:ppn=16,walltime=3:00:00:00,node pool shared,ym
onenode16core,2x Intel E5-2670 2.6 GHz,32GB,nodes=1:ppn=16,walltime=72:00:00,dedicated/one node max per job,yl
gpu16core,2x Intel E5-2630V3 2.4 GHz/2xK40c,64GB,nodes=1:ppn=16,walltime=3:00:00:00,gpu jobs only,gu
mem512GB64core,4x AMD 6276 2.3 GHz,512GB,nodes=2:ppn=64,walltime=3:00:00:00,>64GB shared memory only,yc
mem768GB32core,4x Intel E5-4640 2.4 GHz,768GB,nodes=2:ppn=32,walltime=3:00:00:00,>512 GB shared memory only,yb
nebula,nebula cloud,,,,,
</csv>
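The "max PBS spec" and "max PBS time" columns are the upper limits a job may request from each queue. As a minimal sketch (the job name, the executable, and the exact ''mpirun'' invocation are placeholders; adapt them to the MPI stack you actually load), a two-node job in ''tiny16core'' could be submitted with a script like:
<code>
#!/bin/bash
# Queue and resource limits come from the table above; the names below are placeholders.
#PBS -N razor_example
#PBS -q tiny16core
#PBS -l nodes=2:ppn=16,walltime=6:00:00
#PBS -j oe

cd $PBS_O_WORKDIR
# One MPI rank per requested core: 2 nodes x 16 cores = 32 ranks.
mpirun -np 32 -machinefile $PBS_NODEFILE ./my_mpi_program
</code>
Submit the script with ''qsub'' and check its status with ''qstat''.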
The Maui partitions in the table can be used with the ''shownodes'' command to estimate which nodes are immediately available, as shown below. For example, if you want to use the ''gpu16core'' queue, this shows that a node is available now:
<code>
$ shownodes -l -n | grep Idle | grep gu
compute0804 Idle    24:24 gu 0.00
</code>
This doesn't guarantee that a job will start immediately, as the scheduler may be assembling idle nodes for a large job, but it is a good indication.  Similarly, for a queue such as ''tiny16core'' that can draw from multiple Maui partitions, at this time there are 22 idle nodes.
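To reproduce that count (assuming the Maui partition appears as its own space-delimited field, as in the ''shownodes'' output above), match both ''tiny16core'' partitions and count the matching lines:
<code>
$ shownodes -l -n | grep Idle | grep -E ' (yt|ym) ' | wc -l
22
</code>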
The following script counts idle nodes in the major node categories:
<code>
$ /share/apps/bin/idle_razor_nodes.sh
12-core n30m=1 n06h=35 n72h=34
16-core n30m=3 n06h=22 n72h=10 onenode=16 graphics=1
bigmem 512m=2 768m=2
condo xqian=4 sbarr=2 aja=0 laur=3 itza=2
</code>

Production queues in quantity for most usage are {tiny/med}{12core/16core}.  About 16 Razor 16-core nodes that have difficulty with multi-node MPI jobs are in the ''onenode16core'' queue for single-node, 16-core jobs (the directives for such a job are sketched below).
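A minimal set of directives for such a job (the rest of the script would follow the Razor sketch above) might be:
<code>
#PBS -q onenode16core
#PBS -l nodes=1:ppn=16,walltime=72:00:00
</code>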
-   
**Trestles** general use queues are listed below (the first three queues are production queues in quantity).
<csv>
queue,CPU,memory/node,max PBS spec,max PBS time,notes,Maui partitions
q30m32c,4x AMD 6136 2.4 GHz,64GB,nodes=128:ppn=32,walltime=0:0:30:00,node pool shared,tu/ts/tl
q06h32c,4x AMD 6136 2.4 GHz,64GB,nodes=128:ppn=32,walltime=0:06:00:00,node pool shared,ts/tl
q72h32c,4x AMD 6136 2.4 GHz,64GB,nodes=64:ppn=32,walltime=3:00:00:00,node pool shared,tl
nebula,nebula cloud,,,,,
</csv>
The analogous script counts idle Trestles nodes:
<code>
$ /share/apps/bin/idle_trestles_nodes.sh
n30m=2 n06h=0 n72h=0
condo laur=8 agri=2 doug=0 itza=1
condo mill=23 millgpu=2 nair=27 nairphi=2
</code>

Production queues in quantity for most usage are ''q06h32c''/''q72h32c''.
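For example (values chosen to stay within the limits in the Trestles table), a 4-node, 48-hour job would change only the queue and resource directives from the Razor sketch above:
<code>
#PBS -q q72h32c
#PBS -l nodes=4:ppn=32,walltime=48:00:00
</code>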

"node pool shared" for the tiny/med or q30m/q06h/q72h queues means that those queues allocate jobs from a common pool of identical nodes, with some nodes dedicated to the shorter queues.

For a complete listing of all defined queues and their properties on each cluster, please use the ''**qstat -q**'' command.