queueing_system

Differences

This shows you the differences between two versions of the page.

queueing_system [2016/03/07 20:53]
pwolinsk
queueing_system [2017/04/19 15:31] (current)
pwolinsk
Line 5: Line 5:
   * An //**interactive job**// - a login shell is started on the first node assigned to the job.  The user, in turn, specifies the commands to execute at the command prompt.
  
-All compute nodes in the Razor and Trestles clusters are divided into groups of nodes called //**queues**//.  All nodes within each queue are identical.  The Queues differ from each other by the following factors:
+A //**compute node**// is an individual computer which can be used to execute jobs.  Compute nodes are grouped into //**queues**//.  All nodes assigned to a particular queue are identical.  The queues differ from each other by the following factors:
  
   * type of cpu and number of cores on each node
Line 13: Line 13:
   * walltime - the maximum amount of execution time for a single job
  
-[[queues|Queues]]
+=== Node to Queue Assignment ===
 +All compute nodes are divided into groups called partitions.  A node can belong to only one partition.  A queue is made up of a collection of partitions, and a given partition can be assigned to multiple queues.  As a result, most nodes are not exclusively assigned to a single queue but are shared between multiple queues.  This configuration improves queue flexibility, but it complicates the user's view of the queueing system: it is difficult to predict how many free nodes there are for a given queue.  To help determine the number of available nodes per queue, a script //**max_job_size**// is available:
 + 
 +<code>
 +tres-l1:pwolinsk:$ max_job_size
 +Maximum jobs size in number of nodes for immediate start per queue:
 +
 +        q30m32c:  26 nodes     (max in partition: 26  queue cap:  64)
 +        q06h32c:  26 nodes     (max in partition: 26  queue cap:  64)
 +        q72h32c:   8 nodes     (max in partition:  8  queue cap:  32)
 +      qcDouglas:   0 nodes     (max in partition:  0  queue cap:   1)
 +          qcABI:   0 nodes     (max in partition:  0  queue cap:   1)
 +         qcondo:   0 nodes     (max in partition:  0  queue cap:   1)
 +      qtraining:   2 nodes     (max in partition:  2  queue cap:  64)
 +tres-l1:pwolinsk:$
 +</code>
 + 
 +The output of the script above shows that a job requesting up to 26 nodes in the queue //**q06h32c**// should start immediately.
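The per-queue lines printed by //**max_job_size**// are regular enough to be read by a short script, for example to pick a queue before submitting a job. The following is a minimal sketch in Python, not part of the cluster tooling: the function names ''parse_max_job_size'' and ''queues_that_fit'' are hypothetical, and the line format is assumed to match the sample listing shown on this page.

```python
import re

# Sample text copied from the max_job_size listing on this page.
SAMPLE = """\
Maximum jobs size in number of nodes for immediate start per queue:

        q30m32c:  26 nodes     (max in partition: 26  queue cap:  64)
        q06h32c:  26 nodes     (max in partition: 26  queue cap:  64)
        q72h32c:   8 nodes     (max in partition:  8  queue cap:  32)
      qcDouglas:   0 nodes     (max in partition:  0  queue cap:   1)
          qcABI:   0 nodes     (max in partition:  0  queue cap:   1)
         qcondo:   0 nodes     (max in partition:  0  queue cap:   1)
      qtraining:   2 nodes     (max in partition:  2  queue cap:  64)
"""

# Assumed line layout: "<queue>: <n> nodes (max in partition: <p> queue cap: <c>)"
LINE_RE = re.compile(
    r"^\s*(?P<queue>\S+):\s+(?P<nodes>\d+) nodes\s+"
    r"\(max in partition:\s*(?P<part>\d+)\s+queue cap:\s*(?P<cap>\d+)\)\s*$"
)


def parse_max_job_size(text):
    """Return {queue name: {"nodes", "max_in_partition", "queue_cap"}}."""
    result = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:  # header, blank, and prompt lines are skipped
            result[match.group("queue")] = {
                "nodes": int(match.group("nodes")),
                "max_in_partition": int(match.group("part")),
                "queue_cap": int(match.group("cap")),
            }
    return result


def queues_that_fit(text, wanted_nodes):
    """List queues whose immediate-start size is at least wanted_nodes."""
    parsed = parse_max_job_size(text)
    return [name for name, info in parsed.items() if info["nodes"] >= wanted_nodes]


if __name__ == "__main__":
    print(queues_that_fit(SAMPLE, 8))
```

In practice the text would come from running the script on a login node rather than from a pasted sample; the numbers change as jobs start and finish, so the result is only a snapshot.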
 + 
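The node-to-queue assignment described in this section can be pictured as a small data model. The sketch below is illustrative only: the partition and queue names, node counts, and caps are made up, and the rule used to compute the immediate-start size (the largest free-node count among the queue's partitions, limited by the queue cap) is an assumption read from the "max in partition" and "queue cap" labels in the script output, not a description of how //**max_job_size**// is actually implemented.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Partition:
    name: str
    free_nodes: int  # idle nodes in this partition right now


@dataclass
class Queue:
    name: str
    partitions: List[str]  # names of the partitions assigned to this queue
    cap: int               # maximum nodes a single job may request


def immediate_job_size(queue: Queue, partitions: Dict[str, Partition]) -> int:
    """Assumed rule: best free-node count among the queue's partitions, limited by the cap."""
    best = max((partitions[p].free_nodes for p in queue.partitions), default=0)
    return min(best, queue.cap)


# Hypothetical layout: part_a is shared by both queues, part_b only by q_short.
partitions = {
    "part_a": Partition("part_a", free_nodes=8),
    "part_b": Partition("part_b", free_nodes=26),
}
q_short = Queue("q_short", ["part_a", "part_b"], cap=64)
q_long = Queue("q_long", ["part_a"], cap=32)

if __name__ == "__main__":
    for q in (q_short, q_long):
        print(q.name, immediate_job_size(q, partitions))
```

Because ''part_a'' belongs to both hypothetical queues, a job started through one queue reduces the free nodes seen by the other, which is why the free-node count per queue is hard to predict by hand.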
 +[[queues|Queues]] - summary of public queues
  
 [[batch|Batch Jobs]]
  
 [[interactive|Interactive Jobs]]
 +
 +[[condo queues|Condo Queues]]
 +
 +[[walltime extensions|Job Walltime Extensions]]
queueing_system.1457383995.txt.gz · Last modified: 2016/03/07 20:53 by pwolinsk