====== sub_node_jobs ======
This schedules fairly efficiently but leaves a gap for jobs of 1/16 to 1/2 node scale, or 2 to 16 cores in these examples.  One of those fractional-node jobs can be run in the cloud partition without affecting it much, and multiples of these can fill the (small) cloud partition.  In particular, where there are thousands of small jobs to be run, it is important to do so efficiently.  Here are a couple of methods to run multiple fractional-node jobs efficiently using full compute nodes.
  
These require some intermediate-level scripting.  We encourage you to contact HPC support for scripting help, particularly for very large parameter sweeps.  In such cases, the layout of files in the file system and other factors can have a big influence on how easily the sweep completes, so it's best to test and plan ahead.
  
=== at now ===
This method is good for small jobs that consistently run at, for instance, 4 cores throughout their lifetime.  On this 32-core node, 8 of these can run concurrently with different parameters.  Each run is abstracted into a script to simplify the Slurm submit script.
  - the multiple sub-jobs should be relatively uniform in workload, so that one doesn't take 10 times as long as the others and end with a single core occupying a full compute node for a long period, which is what we are trying to avoid here
  - the ``wait`` at the end of the submit script is necessary to keep Slurm from ending the job prematurely: the backgrounded jobs return the console to Slurm, which then thinks the overall job is done, logs out, and either kills the background jobs or leaves zombies running, either of which is bad.
  - in this example, the 8 jobs running the same script with different parameters (1-8) simulate running 8 different sub-jobs with different data in the same directory.  If your process always uses the same input and output file names, the sub-jobs will need to run in different directories, which should be handled by the script called ``bench.sh`` here.  The 8 ``at now`` statements could also be 8 totally different scripts, though in that case it would be hard to ensure that they take about the same time to run.
  
<code>
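#!/bin/bash
#SBATCH --job-name=subnode
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32
#SBATCH --partition=comp72
#SBATCH --time=72:00:00

# Hedged sketch of a whole-node submit script for this method, not a verbatim
# site example: the #SBATCH values, the partition name comp72, and the script
# name bench.sh are assumptions taken from the surrounding text.  The text
# describes launching each sub-job with an "at now" statement; this sketch
# instead backgrounds each sub-job with "&" so that the final "wait" has
# child processes to wait on.

cd "$SLURM_SUBMIT_DIR"

# 8 concurrent 4-core sub-jobs on the 32-core node, each with its own parameter
./bench.sh 1 &
./bench.sh 2 &
./bench.sh 3 &
./bench.sh 4 &
./bench.sh 5 &
./bench.sh 6 &
./bench.sh 7 &
./bench.sh 8 &

# keep the Slurm job alive until every backgrounded sub-job has finished
wait
</code>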
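
The memory-limited case referred to below as the second example runs a fixed number of sub-jobs at a time with GNU parallel.  As a minimal sketch, assuming the same ``commands.txt`` command list and ``test.sh`` wrapper as the load-based example further down, and assuming each sub-job needs roughly 1/12 of the node's memory, it might look like this:

<code>
cat commands.txt | parallel --will-cite -q --jobs 12 --colsep '\s+' ./test.sh {}
</code>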
``at now`` and ``parallel --jobs ##`` are useful for fixed loads, or for loads defined by memory as in the second example.  Where the sub-jobs vary in load over their lifetime, a variation on GNU parallel can help run the compute node near maximum throughput.  This sometimes occurs in bioinformatics workflows where several programs of varying parallelism are chained together.  Sometimes the input files are small and memory is not a concern.  To get the best throughput, we try to maximize the amount of work done concurrently until the compute node approaches overload, at which point it starts spending time just switching between many processes instead of doing useful work.
  
This case is similar to the second example, except that ``--jobs 12`` (run exactly 12 jobs concurrently until the command list is finished) is replaced with ``--load 32`` (attempt to maintain a system load of 32 until the command list is finished).  System load is an abstraction derived from how many processes are asking for more time from the system (see the current load averages on a compute node with the ``uptime`` command).  For a CPU-driven load, the maximum-throughput value is usually equal to the number of physical cores in the system, so 32 for the ``comp72`` partitions of these examples.  Some testing is usually needed to find the best load value for a particular problem set; please contact HPC support for help.
  
<code>
cat commands.txt | parallel --will-cite -q --load 32 --colsep '\s+' ./test.sh {}
</code>
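
As a hypothetical illustration of the command list itself: each line of ``commands.txt`` describes one sub-job, with whitespace-separated parameters for ``test.sh``.  The sample names and values below are made up:

<code>
sample01 ref.fa reads01.fq
sample02 ref.fa reads02.fq
sample03 ref.fa reads03.fq
</code>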