This shows you the differences between two versions of the page.
singularity-apptainer [2022/11/08 23:14] root |
singularity-apptainer [2023/06/14 20:54] (current) root |
||
---|---|---|---|
Line 7: | Line 7: | ||
Apptainer/ | Apptainer/ | ||
+ | ==== Example 1 ==== | ||
+ | |||
+ | |||
For an example: on your workstation as root, using (nearly identical) docker or podman commands: | For an example: on your workstation as root, using (nearly identical) docker or podman commands: | ||
Line 114: | Line 117: | ||
We have versions of both Sylabs Singularity (most recently singularity/ | We have versions of both Sylabs Singularity (most recently singularity/ | ||
+ | |||
+ | ==== Example 2: NVidia nvcr.io ==== | ||
+ | |||
+ | A login is required for NVidia containers, such as [[https:// | ||
+ | Get an NVidia login (also useful for free GTC conferences, | ||
+ | |||
+ | On your workstation as root using docker: | ||
+ | |||
+ | < | ||
+ | $ docker login nvcr.io | ||
+ | Username: $oauthtoken | ||
+ | Password: | ||
+ | |||
+ | Login Succeeded | ||
+ | |||
+ | $ docker pull nvcr.io/ | ||
+ | $ docker save nvcr.io/ | ||
+ | $ scp / | ||
+ | </ | ||
+ | |||
+ | As " | ||
+ | < | ||
+ | $ cd / | ||
+ | $ singularity build hpc-benchmarks.sif docker-archive:// | ||
+ | </ | ||
+ | |||
+ | Then, for NVidia GPU work, get a GPU node with srun and use shell --nv; or continue on the cloud node (without --nv) for serial CPU computing: | ||
+ | |||
+ | < | ||
+ | $ singularity shell --nv --bind / | ||
+ | </ | ||
+ | |||
+ | For comparison, we'll run HPL on the CPUs of the same node on bare metal. At this memory size it reaches about 2.4 TFLOPS on this dual AMD 7543 node. | ||
+ | |||
+ | < | ||
+ | $ module load gcc/7.3.1 mkl/19.0.5 impi/19.0.5 | ||
+ | $ mpirun -np 16 -genv MKL_NUM_THREADS 4 ./xhpl | ||
+ | ================================================================================ | ||
+ | HPLinpack 2.3 -- High-Performance Linpack benchmark | ||
+ | Written by A. Petitet and R. Clint Whaley, | ||
+ | Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK | ||
+ | Modified by Julien Langou, University of Colorado Denver | ||
+ | ================================================================================ | ||
+ | |||
+ | An explanation of the input/ | ||
+ | T/V : Wall time / encoded variant. | ||
+ | N : The order of the coefficient matrix A. | ||
+ | NB : The partitioning blocking factor. | ||
+ | P : The number of process rows. | ||
+ | Q : The number of process columns. | ||
+ | Time : Time in seconds to solve the linear system. | ||
+ | Gflops : Rate of execution for solving the linear system. | ||
+ | |||
+ | The following parameter values will be used: | ||
+ | |||
+ | N : | ||
+ | NB : | ||
+ | PMAP : Column-major process mapping | ||
+ | P : | ||
+ | Q : | ||
+ | PFACT : | ||
+ | NBMIN : | ||
+ | NDIV : | ||
+ | RFACT : Left | ||
+ | BCAST : | ||
+ | DEPTH : | ||
+ | SWAP : Spread-roll (long) | ||
+ | L1 : no-transposed form | ||
+ | U : transposed form | ||
+ | EQUIL : yes | ||
+ | ALIGN : 8 double precision words | ||
+ | |||
+ | ================================================================================ | ||
+ | T/V N NB | ||
+ | -------------------------------------------------------------------------------- | ||
+ | WC32L2C4 | ||
+ | |||
+ | |||
+ | ================================================================================ | ||
+ | |||
+ | Finished | ||
+ | 1 tests completed without checking, | ||
+ | 0 tests skipped because of illegal input values. | ||
+ | -------------------------------------------------------------------------------- | ||
+ | |||
+ | End of Tests. | ||
+ | ================================================================================ | ||
+ | |||
+ | </ | ||
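
The launch line above starts 16 MPI ranks with 4 MKL threads each. Below is a minimal sketch of a sanity check that the rank count equals the HPL.dat process grid (P x Q) and that the total thread count matches the node's core count. The 4x4 grid and the 16/4 split come from this page; the 64-core figure (two 32-core sockets) and all variable names are assumptions for illustration only.

```shell
# Hypothetical check: does the MPI launch match the HPL process grid?
P=4                 # process rows in HPL.dat (4x4 grid in the CPU run)
Q=4                 # process columns in HPL.dat
THREADS=4           # MKL_NUM_THREADS per rank, as in the mpirun line
CORES=64            # assumption: 2 sockets x 32 cores on this node

RANKS=$((P * Q))            # value to pass to mpirun -np
USED=$((RANKS * THREADS))   # total threads the run will start
echo "ranks=$RANKS threads_total=$USED"

# Warn when the node would be oversubscribed or partly idle.
if [ "$USED" -ne "$CORES" ]; then
    echo "warning: $USED threads for $CORES cores" >&2
fi
```

With a 1x1 grid the same arithmetic gives a single rank, which is the layout used for the single-GPU run.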
+ | |||
+ | We'll copy this HPL.dat to the Apptainer session and change the MPI grid from 4x4 to 1x1. By this measure, a single NVidia A100 GPU is a little over 5 times as fast as two AMD 7543 CPUs. The GPU's 40 GB of memory is close to full, though the CPU memory could hold a larger problem. | ||
+ | |||
+ | < | ||
+ | Apptainer> | ||
+ | Apptainer> | ||
+ | Apptainer> | ||
+ | |||
+ | ================================================================================ | ||
+ | HPL-NVIDIA 23.5.0 | ||
+ | ================================================================================ | ||
+ | HPLinpack 2.1 -- High-Performance Linpack benchmark | ||
+ | Written by A. Petitet and R. Clint Whaley, | ||
+ | Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK | ||
+ | Modified by Julien Langou, University of Colorado Denver | ||
+ | ================================================================================ | ||
+ | |||
+ | An explanation of the input/ | ||
+ | T/V : Wall time / encoded variant. | ||
+ | N : The order of the coefficient matrix A. | ||
+ | NB : The partitioning blocking factor. | ||
+ | P : The number of process rows. | ||
+ | Q : The number of process columns. | ||
+ | Time : Time in seconds to solve the linear system. | ||
+ | Gflops : Rate of execution for solving the linear system. | ||
+ | |||
+ | The following parameter values will be used: | ||
+ | |||
+ | N : | ||
+ | NB : | ||
+ | PMAP : Column-major process mapping | ||
+ | P : | ||
+ | Q : | ||
+ | PFACT : | ||
+ | NBMIN : | ||
+ | NDIV : | ||
+ | RFACT : Left | ||
+ | BCAST : | ||
+ | DEPTH : | ||
+ | SWAP : Spread-roll (long) | ||
+ | L1 : no-transposed form | ||
+ | U : transposed form | ||
+ | EQUIL : yes | ||
+ | ALIGN : 8 double precision words | ||
+ | |||
+ | ================================================================================ | ||
+ | T/V N NB | ||
+ | -------------------------------------------------------------------------------- | ||
+ | WC02R2R4 | ||
+ | ================================================================================ | ||
+ | |||
+ | Finished | ||
+ | 1 tests completed without checking, | ||
+ | 0 tests skipped because of illegal input values. | ||
+ | -------------------------------------------------------------------------------- | ||
+ | |||
+ | End of Tests. | ||
+ | ================================================================================ | ||
+ | Apptainer> | ||
+ | </ | ||
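
Since the GPU's 40 GB of memory limits the problem size, a rough sketch of how the matrix order N relates to memory may help. HPL stores an N x N matrix of 8-byte doubles, so the matrix needs about 8 N^2 bytes. The helper name `hpl_max_n`, the 90% fill fraction, the block size of 256 and the use of decimal gigabytes are all assumptions for illustration; they are not the N and NB values used in the runs above.

```shell
# Hypothetical helper: largest N (a multiple of NB) whose N x N matrix
# of 8-byte doubles fits in a given fraction of memory.
# usage: hpl_max_n MEM_GB FRACTION NB
hpl_max_n() {
    awk -v gb="$1" -v frac="$2" -v nb="$3" 'BEGIN {
        n = int(sqrt(gb * 1e9 * frac / 8))   # 8 * N^2 <= usable bytes
        print n - (n % nb)                   # round down to a multiple of NB
    }'
}

# Illustrative values only: 40 GB device, 90% usable, NB = 256.
N=$(hpl_max_n 40 0.9 256)
echo "N=$N"
```

Treat the result as an upper bound for a starting guess; the benchmark needs some workspace beyond the matrix itself, so the usable fraction has to be tuned per system.
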
==References and Finding Prebuilt Containers== | ==References and Finding Prebuilt Containers== | ||
Line 124: | Line 276: | ||
[[https:// | [[https:// | ||
+ | |||
+ | [[https:// | ||
There are relatively small collections of Apptainer/ | There are relatively small collections of Apptainer/ |