singularity-apptainer
| singularity-apptainer [2022/11/09 19:28] – root | singularity-apptainer [2025/10/15 19:51] (current) – external edit 127.0.0.1 | ||
|---|---|---|---|
| Line 7: | Line 7: | ||
| Apptainer/ | Apptainer/ | ||
| + | ==== Example 1 ==== | ||
| + | |||
| + | |||
| For example, on your workstation as root, using docker or podman (the commands are nearly identical): | For example, on your workstation as root, using docker or podman (the commands are nearly identical): | ||
| Line 114: | Line 117: | ||
| We have versions of both Sylabs Singularity (most recently singularity/ | We have versions of both Sylabs Singularity (most recently singularity/ | ||
| + | |||
| + | ==== Example 2: NVIDIA nvcr.io ==== | ||
| + | |||
| + | A login is required for NVIDIA containers, such as [[https:// | ||
| + | Get an NVIDIA login (also useful for free GTC conferences, | ||
| + | |||
| + | On your workstation as root using docker: | ||
| + | |||
| + | < | ||
| + | $ docker login nvcr.io | ||
| + | Username: $oauthtoken | ||
| + | Password: | ||
| + | |||
| + | Login Succeeded | ||
| + | |||
| + | $ docker pull nvcr.io/ | ||
| + | $ docker save nvcr.io/ | ||
| + | $ scp / | ||
| + | </ | ||
| + | |||
| + | As " | ||
| + | < | ||
| + | $ cd / | ||
| + | $ singularity build hpc-benchmarks.sif docker-archive:// | ||
| + | </ | ||
| + | |||
| + | Then, for NVIDIA GPU work, get a GPU node with srun (and use shell --nv), or continue on the cloud node (without --nv) for serial CPU computing: | ||
| + | |||
| + | < | ||
| + | $ singularity shell --nv --bind / | ||
| + | </ | ||
| + | |||
| + | For comparison, we'll run HPL on the CPUs of the same node on bare metal. At this memory size it runs at about 2.4 TFLOPS on this dual AMD 7543 node. | ||
| + | |||
| + | < | ||
| + | $ module load gcc/7.3.1 mkl/19.0.5 impi/19.0.5 | ||
| + | $ mpirun -np 16 -genv MKL_NUM_THREADS 4 ./xhpl | ||
| + | ================================================================================ | ||
| + | HPLinpack 2.3 -- High-Performance Linpack benchmark | ||
| + | Written by A. Petitet and R. Clint Whaley, | ||
| + | Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK | ||
| + | Modified by Julien Langou, University of Colorado Denver | ||
| + | ================================================================================ | ||
| + | |||
| + | An explanation of the input/ | ||
| + | T/V : Wall time / encoded variant. | ||
| + | N : The order of the coefficient matrix A. | ||
| + | NB : The partitioning blocking factor. | ||
| + | P : The number of process rows. | ||
| + | Q : The number of process columns. | ||
| + | Time : Time in seconds to solve the linear system. | ||
| + | Gflops : Rate of execution for solving the linear system. | ||
| + | |||
| + | The following parameter values will be used: | ||
| + | |||
| + | N : | ||
| + | NB : | ||
| + | PMAP : Column-major process mapping | ||
| + | P : | ||
| + | Q : | ||
| + | PFACT : | ||
| + | NBMIN : | ||
| + | NDIV : | ||
| + | RFACT : Left | ||
| + | BCAST : | ||
| + | DEPTH : | ||
| + | SWAP : Spread-roll (long) | ||
| + | L1 : no-transposed form | ||
| + | U : transposed form | ||
| + | EQUIL : yes | ||
| + | ALIGN : 8 double precision words | ||
| + | |||
| + | ================================================================================ | ||
| + | T/V N NB | ||
| + | -------------------------------------------------------------------------------- | ||
| + | WC32L2C4 | ||
| + | |||
| + | |||
| + | ================================================================================ | ||
| + | |||
| + | Finished | ||
| + | 1 tests completed without checking, | ||
| + | 0 tests skipped because of illegal input values. | ||
| + | -------------------------------------------------------------------------------- | ||
| + | |||
| + | End of Tests. | ||
| + | ================================================================================ | ||
| + | |||
| + | </ | ||
| + | |||
| + | We'll copy this HPL.dat for use in Apptainer and change the MPI process grid from 4x4 to 1x1. By this measure, a single NVIDIA A100 GPU is a little over 5 times as fast as two AMD 7543 CPUs. The GPU's 40 GB of memory is close to full at this problem size, though the CPU memory could hold a larger problem. | ||
| + | |||
| + | < | ||
| + | Apptainer> | ||
| + | Apptainer> | ||
| + | Apptainer> | ||
| + | |||
| + | ================================================================================ | ||
| + | HPL-NVIDIA 23.5.0 | ||
| + | ================================================================================ | ||
| + | HPLinpack 2.1 -- High-Performance Linpack benchmark | ||
| + | Written by A. Petitet and R. Clint Whaley, | ||
| + | Modified by Piotr Luszczek, Innovative Computing Laboratory, UTK | ||
| + | Modified by Julien Langou, University of Colorado Denver | ||
| + | ================================================================================ | ||
| + | |||
| + | An explanation of the input/ | ||
| + | T/V : Wall time / encoded variant. | ||
| + | N : The order of the coefficient matrix A. | ||
| + | NB : The partitioning blocking factor. | ||
| + | P : The number of process rows. | ||
| + | Q : The number of process columns. | ||
| + | Time : Time in seconds to solve the linear system. | ||
| + | Gflops : Rate of execution for solving the linear system. | ||
| + | |||
| + | The following parameter values will be used: | ||
| + | |||
| + | N : | ||
| + | NB : | ||
| + | PMAP : Column-major process mapping | ||
| + | P : | ||
| + | Q : | ||
| + | PFACT : | ||
| + | NBMIN : | ||
| + | NDIV : | ||
| + | RFACT : Left | ||
| + | BCAST : | ||
| + | DEPTH : | ||
| + | SWAP : Spread-roll (long) | ||
| + | L1 : no-transposed form | ||
| + | U : transposed form | ||
| + | EQUIL : yes | ||
| + | ALIGN : 8 double precision words | ||
| + | |||
| + | ================================================================================ | ||
| + | T/V N NB | ||
| + | -------------------------------------------------------------------------------- | ||
| + | WC02R2R4 | ||
| + | ================================================================================ | ||
| + | |||
| + | Finished | ||
| + | 1 tests completed without checking, | ||
| + | 0 tests skipped because of illegal input values. | ||
| + | -------------------------------------------------------------------------------- | ||
| + | |||
| + | End of Tests. | ||
| + | ================================================================================ | ||
| + | Apptainer> | ||
| + | </ | ||
| ==References and Finding Prebuilt Containers== | ==References and Finding Prebuilt Containers== | ||
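The memory remark in Example 2 (the A100's 40 GB being close to full) follows from HPL storing an N x N matrix of 8-byte doubles. Below is a minimal sketch for estimating N from a memory budget, and for reproducing the Gflops figure that HPL prints from N and the wall time. The function names `hpl_max_n` and `hpl_gflops`, the 90% fill fraction, and the blocking factor of 256 are illustrative assumptions, not values taken from the runs above; decimal gigabytes are also assumed.

```shell
#!/bin/sh
# Hypothetical helpers for sizing an HPL run; not part of HPL or Apptainer.

# hpl_max_n MEM_GB FRACTION NB
# Largest N whose N x N double-precision matrix fits in FRACTION of MEM_GB,
# rounded down to a multiple of the blocking factor NB.
hpl_max_n() {
    awk -v gb="$1" -v frac="$2" -v nb="$3" 'BEGIN {
        bytes = gb * 1e9 * frac      # usable bytes (decimal GB assumed)
        n = int(sqrt(bytes / 8))     # 8 bytes per double
        n = n - (n % nb)             # align N to the blocking factor
        print n
    }'
}

# hpl_gflops N SECONDS
# Gflops from the standard HPL operation count, 2/3*N^3 + 3/2*N^2.
hpl_gflops() {
    awk -v n="$1" -v t="$2" 'BEGIN {
        flops = (2.0 / 3.0) * n * n * n + 1.5 * n * n
        printf "%.3f\n", flops / t / 1e9
    }'
}
```

For instance, `hpl_max_n 40 0.9 256` prints 67072, which would be a candidate N for a 40 GB GPU under these assumptions. The fraction leaves headroom for workspace and the runtime, so adjust it for the node in question.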
