resource_selection

  
This page details the ``public`` use partitions, those available to all researchers who are eligible to use AHPCC.  For these ``public`` partitions, every computer in a given partition is identical, so the partition itself specifies the configuration.  A **second page**, [[condo_nodes]], details the (currently 164) researcher-funded compute nodes, which can be accessed **either** (1) for dedicated use by those researchers in the ``condo`` partition, **or** (2) for limited-time ``public`` usage in the ``pcon06`` partition.  Both uses require some additional specification to select the correct nodes from the currently 30 node configurations.

==login nodes==

When you first log in to AHPCC, you will be placed on a ``login node`` (also called a ``frontend node`` or ``portal node``), depending on whether you connected with ``ssh`` or through the ``https`` OpenOnDemand portal.  All are low-powered and only suitable for starting graphical sessions, submitting jobs, editing, and viewing output. **Do not run any kind of computational task** on them.  Submit a portal job, a batch script, or an ``srun`` session to do intensive computing. Those sessions run via ``Slurm`` partitions on one or more of our approximately 400 compute nodes.
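As an illustration, a minimal batch script might look like the sketch below. The job name, core count, time limit, and ``./my_program`` are placeholders, not site requirements; ``comp72`` is one of the public partitions described below. The final line only syntax-checks the script; on the cluster you would submit it with ``sbatch`` instead.

```shell
# Write a minimal, hypothetical Slurm batch script (all values are examples).
cat > job.sh <<'EOF'
#!/bin/bash
#SBATCH --job-name=example
#SBATCH --partition=comp72
#SBATCH --nodes=1
#SBATCH --ntasks-per-node=32
#SBATCH --time=72:00:00
cd "$SLURM_SUBMIT_DIR"
./my_program          # placeholder for your executable
EOF

# On a login node you would submit it with:
#   sbatch job.sh
# or request an interactive session instead:
#   srun --partition=comp01 --nodes=1 --ntasks=32 --time=1:00:00 --pty bash

# Here we only verify the script parses as valid bash (#SBATCH lines are comments).
bash -n job.sh && echo "script syntax OK"
```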
  
 ==comp partitions==
  
The most popular (and hence the slowest starting) computing resources are the ``comp01``, ``comp06``, and ``comp72`` partitions, which are overlaid on mostly the same set of about 50 compute nodes and differ only by time limit. You can use the program shown below to search for idle nodes in every partition.  All nodes in the ``comp`` and ``cloud`` partitions are identical: dual Intel Gold 6130, no GPU, 32 cores, and 192 GB of memory.
  
If your code **does not use a GPU** and **either** (1) is **MPI or shared-memory parallel and able to use 32 cores**, or (2) **uses over 100 GB of main memory with 1 to 32 cores**, you **may use** the ``comp`` partitions. These partitions are popular and often full.  ``comp01``, ``comp06``, and ``comp72`` have time limits of 1, 6, and 72 hours respectively.  The advantage of the shorter time limits is that ``comp06`` has a higher queue priority than ``comp72``, and ``comp01`` a higher priority than ``comp06``, enabling jobs to start sooner.  The partition time limits are hard limits: ``comp01`` will terminate a job after 1 hour whether it has finished or not.
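A generic way to look for idle nodes is Slurm's ``sinfo`` (the cluster's own idle-node program may differ). The sketch below parses a canned ``sinfo``-style sample rather than live output, so the node counts are made up for illustration:

```shell
# On the cluster you would query live state with something like:
#   sinfo -o "%P %a %D %t" -t idle
# Here we parse a hypothetical sample so the idea is visible offline.
cat > sinfo_sample.txt <<'EOF'
PARTITION AVAIL NODES STATE
comp01 up 4 idle
comp06 up 1 idle
comp72 up 0 idle
cloud72 up 2 idle
EOF

# Print each partition that has at least one idle node, with its count.
awk 'NR>1 && $4=="idle" && $3>0 {print $1, $3}' sinfo_sample.txt
```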
 ``cloud72`` **is usually the best partition** for simple tasks such as compiling, moving files, and non-compute-intensive, non-parallel ``R``, ``matlab``, and ``python``.
  
  
 <code>
 ==himem partitions==
  
There are two small partitions of high-memory (768 GB) Intel computers, called ``himem06`` and ``himem72``. If your program can run on a 192 GB ``comp`` node, **use that instead**.  When running new programs, you can use ``himem`` once to find out what the memory usage is. The ``himem`` nodes have the same architecture as the ``comp`` nodes, but have only 24 cores at higher frequency, to better run poorly scaling codes that need large shared memory, such as bioinformatics programs and ``comsol``.
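After a trial run on ``himem``, Slurm's accounting tool ``sacct`` can report the job's peak memory, which tells you whether a 192 GB ``comp`` node would suffice. The sketch below parses a canned ``sacct``-style sample; the job id and MaxRSS value are made up:

```shell
# On the cluster you would check a finished job (123456 is a placeholder id):
#   sacct -j 123456 -o JobID,MaxRSS,Elapsed
# Canned sample output; sacct reports MaxRSS in KiB with a trailing "K".
cat > sacct_sample.txt <<'EOF'
JobID MaxRSS
123456.batch 62914560K
EOF

# Convert KiB to GiB and compare against the 192 GB comp-node memory.
awk 'NR>1 {gb=$2/1048576; printf "%.0f GiB peak -> %s\n", gb, (gb<192 ? "use comp" : "use himem")}' sacct_sample.txt
```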
    
 <code>
 ==gpu partitions==
  
If your code **uses an NVidia GPU**, you may use the ``gpu72``, ``agpu72``, or ``qgpu72`` partitions. The most numerous are the ``gpu72`` nodes, which are similar to the ``comp`` nodes except that each also has a single V100 GPU.
Non-GPU programs submitted to GPU partitions are subject to immediate cancellation.
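A GPU job requests the device explicitly with ``--gres``. The sketch below is a hypothetical ``gpu72`` script (the time limit and ``./my_gpu_program`` are placeholders; ``gpu:1`` matches the single V100 per node described above). The final line only syntax-checks the script:

```shell
# Hypothetical GPU batch script; submit with sbatch on the cluster.
cat > gpu_job.sh <<'EOF'
#!/bin/bash
#SBATCH --partition=gpu72
#SBATCH --nodes=1
#SBATCH --gres=gpu:1
#SBATCH --time=24:00:00
nvidia-smi              # confirm the GPU is visible before the real work
./my_gpu_program        # placeholder for your GPU executable
EOF

# Verify the script parses as valid bash without executing it.
bash -n gpu_job.sh && echo "gpu script syntax OK"
```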
  
 <code>
resource_selection.1706046211.txt.gz · Last modified: 2024/01/23 21:43 by root