queueing_system

Differences

This shows you the differences between two versions of the page.

queueing_system [2016/03/07 20:54]
pwolinsk
queueing_system [2017/04/19 15:31] (current)
pwolinsk
Line 5: Line 5:
   * An //**interactive job**// - a login shell is started on the first node assigned to the job.  The user, in turn, specifies the commands to execute at the command prompt.
  
-All compute nodes in the Razor and Trestles clusters are divided into groups of nodes called //**queues**//.  All nodes assigned to a particular queue are identical.  The queues differ from each other by the following factors:
+A //**compute node**// is an individual computer which can be used to execute jobs.  Compute nodes are grouped into //**queues**//.  All nodes assigned to a particular queue are identical.  The queues differ from each other by the following factors:
  
   * type of cpu and number of cores on each node
Line 12: Line 12:
   * amount of memory
   * walltime - the maximum amount of execution time for a single job
 +
 +=== Node to Queue Assignment ===
 +All compute nodes are divided into groups called partitions.  A node can belong to only one partition.  A queue is made up of a collection of partitions, and a given partition can be assigned to multiple queues.  As a result, most nodes are not exclusively assigned to a single queue but are shared between multiple queues.  This configuration improves queue flexibility, but it complicates the user's view of the queueing system: it is difficult to predict how many free nodes a given queue has.  To help determine the number of available nodes per queue, a script //**max_job_size**// is available:
 +
 +<code>
 +tres-l1:pwolinsk:$ max_job_size
 +Maximum jobs size in number of nodes for immediate start per queue:
 +
 +        q30m32c:  26 nodes     (max in partition: 26  queue cap:  64)
 +        q06h32c:  26 nodes     (max in partition: 26  queue cap:  64)
 +        q72h32c:   8 nodes     (max in partition:  8  queue cap:  32)
 +      qcDouglas:   0 nodes     (max in partition:  0  queue cap:   1)
 +          qcABI:   0 nodes     (max in partition:  0  queue cap:   1)
 +         qcondo:   0 nodes     (max in partition:  0  queue cap:   1)
 +      qtraining:   2 nodes     (max in partition:  2  queue cap:  64)
 +tres-l1:pwolinsk:$
 +</code>
 +
 +The output of the script above shows that a job requesting up to 26 nodes in the queue //**q06h32c**// should start immediately.
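When choosing a queue for a job of a given size, it can help to read the ''max_job_size'' output programmatically. The following is a minimal Python sketch, not part of the cluster tooling: the function names ''parse_max_job_size'' and ''queues_that_fit'' are hypothetical, the line layout is assumed to match the sample session shown above, and the "max in partition" and "queue cap" fields are only stored as numbers, without assuming anything further about how the scheduler uses them.

```python
import re

# Sample output copied from the session shown above (prompt lines omitted).
SAMPLE = """\
Maximum jobs size in number of nodes for immediate start per queue:

        q30m32c:  26 nodes     (max in partition: 26  queue cap:  64)
        q06h32c:  26 nodes     (max in partition: 26  queue cap:  64)
        q72h32c:   8 nodes     (max in partition:  8  queue cap:  32)
      qcDouglas:   0 nodes     (max in partition:  0  queue cap:   1)
          qcABI:   0 nodes     (max in partition:  0  queue cap:   1)
         qcondo:   0 nodes     (max in partition:  0  queue cap:   1)
      qtraining:   2 nodes     (max in partition:  2  queue cap:  64)
"""

# Assumed line layout: "<queue>: <n> nodes (max in partition: <p> queue cap: <c>)"
LINE_RE = re.compile(
    r"^\s*(\S+):\s+(\d+) nodes\s+"
    r"\(max in partition:\s*(\d+)\s+queue cap:\s*(\d+)\)"
)


def parse_max_job_size(text):
    """Return {queue: {"free": n, "partition_max": p, "queue_cap": c}}."""
    queues = {}
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match is None:
            continue  # skip the header line and blank lines
        name, free, partition_max, queue_cap = match.groups()
        queues[name] = {
            "free": int(free),
            "partition_max": int(partition_max),
            "queue_cap": int(queue_cap),
        }
    return queues


def queues_that_fit(text, nodes):
    """List the queues whose immediate-start size is at least `nodes`."""
    parsed = parse_max_job_size(text)
    return [name for name, info in parsed.items() if info["free"] >= nodes]


if __name__ == "__main__":
    print(queues_that_fit(SAMPLE, 8))
```

On a login node, the text passed to these functions would come from running ''max_job_size'' itself rather than from the embedded sample; because nodes are shared between queues, the numbers are only a snapshot taken at the moment the script runs.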
  
 [[queues|Queues]] - summary of public queues
Line 18: Line 37:
  
 [[interactive|Interactive Jobs]]
 +
 +[[condo queues|Condo Queues]]
 +
 +[[walltime extensions|Job Walltime Extensions]]
queueing_system.1457384067.txt.gz · Last modified: 2016/03/07 20:54 by pwolinsk