Application Software

Locating and using software has been made a little more complicated by some decisions, reasonable at the time, made 50 years ago for Unix: /usr/local for applications, $PATH to find an executable, $LD_LIBRARY_PATH to find dynamic-link libraries, and the file ~/.cshrc to set up these variables. These environment variables continue today in Linux and Mac, while Windows combines the two PATH variables. Also important were software packages intended to be used as infrastructure for complete applications, which didn't need to be copied into the code of every project. These “shared libraries” such as MPI and FFTW were specified by source-code interfaces. At the time, nearly everyone used one computer and one compiler, so a source interface corresponded directly to one binary interface. Today there are many types of computers and compilers, and modern applications often define a binary interface, or “ABI”, to avoid compatibility issues.

Today on HPC systems, the MPI implementation must be heavily customized for each site and its network fabric. Almost all multi-node programs depend on MPI, and there are three popular implementations of MPI (Open MPI, MVAPICH2, and Intel MPI). There are about six compilers that are reasonably popular (GNU gcc, Intel proprietary, Intel open LLVM, Nvidia/PGI, AMD LLVM, and the not-installed stock LLVM). MVAPICH2 and Intel MPI are binary compatible (thus a de facto ABI). Probably all the LLVM implementations are binary compatible with each other (though that does not make Open MPI and MVAPICH2 compatible), so there are about 8 to 12 binary versions of MPI, not counting the updates of each that come out 2 or 3 times a year.

With thousands of applications (most of which have multiple versions), it's obviously impractical, if only because of name collisions, to put every executable in /usr/local/bin. It's also impractical (unless you use only one application) to semi-permanently set these variables in ~/.bashrc or ~/.cshrc. There are several ways to handle this.

Modules

Almost all HPC centers use “modules” software to help manage versioning. This was originally Environment Modules; at most centers, including this one, it has been replaced by an upward-compatible rewrite, Lmod. The primary use is to manage $PATH, $LD_LIBRARY_PATH, and other environment variables across a large number of applications. Unfortunately the name is easily confused with the unrelated packaged components of modular languages such as Python and R, e.g. Python modules.

Module command syntax for most uses is relatively simple: load/unload to invoke/remove a module, purge to unload all modules, list to show loaded modules, help, and spider to search. We share some examples below for our three sources of software and their module definitions (“modulefiles”). There is a complete list of modulefiles in the text file /share/apps/modulelist, which can be grepped.
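
A typical sequence using these commands looks like the following (the fftw module name and version are only illustrative; substitute a name found in /share/apps/modulelist):

$ grep -i fftw /share/apps/modulelist   # search the full list of modulefiles
$ module spider fftw                    # search modules known to Lmod
$ module load fftw/3.3.8                # load a specific version (hypothetical)
$ module list                           # show what is loaded
$ module purge                          # unload everything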

Locally written modulefiles

There are currently about 660 locally written modulefiles, some of which have a “smart” capability to select, from multiple software builds, the one compiled for the computer loading the module. This shows the first 5:

$ grep "/share/apps/modulefiles" /share/apps/modulelist | head -5
/share/apps/modulefiles/ abinit/8.0.8b
/share/apps/modulefiles/ abinit/8.4.4
/share/apps/modulefiles/ abinit/8.6.1
/share/apps/modulefiles/ abinit/8.6.1-QFDcc
/share/apps/modulefiles/ abinit/8.6.1-qFD-trestles
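
To use one of these, load it by name and version, for example one of the abinit builds listed above:

$ module load abinit/8.6.1
$ module list
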
OpenHPC modulefiles

At this writing there are 548 modulefiles from the OpenHPC distribution, which concentrates on mathematical software. Many of the packages are a bit dated, but when we update to Rocky 8 Linux we will be able to install some newer ones.

You can run particular packages like so. As an example, we'll find the newest version of petsc and the MPI and compiler modules that it needs. We then load those three, and some prerequisites are auto-loaded.

$ grep petsc /share/apps/modulelist

/opt/ohpc/pub/moduledeps/ gnu7-impi/petsc/3.9.1
/opt/ohpc/pub/moduledeps/ gnu7-mpich/petsc/3.9.1
/opt/ohpc/pub/moduledeps/ gnu7-mvapich2/petsc/3.9.1
/opt/ohpc/pub/moduledeps/ gnu7-openmpi3/petsc/3.9.1
/opt/ohpc/pub/moduledeps/ gnu7-openmpi/petsc/3.8.3
/opt/ohpc/pub/moduledeps/ gnu8-impi/petsc/3.12.0
/opt/ohpc/pub/moduledeps/ gnu8-mpich/petsc/3.12.0
/opt/ohpc/pub/moduledeps/ gnu8-mvapich2/petsc/3.12.0
/opt/ohpc/pub/moduledeps/ gnu8-openmpi3/petsc/3.12.0
/opt/ohpc/pub/moduledeps/ gnu-impi/petsc/3.8.3
/opt/ohpc/pub/moduledeps/ gnu-mpich/petsc/3.8.3
/opt/ohpc/pub/moduledeps/ gnu-mvapich2/petsc/3.8.3
/opt/ohpc/pub/moduledeps/ gnu-openmpi/petsc/3.8.3
/opt/ohpc/pub/moduledeps/ intel-impi/petsc/3.12.0
/opt/ohpc/pub/moduledeps/ intel-mpich/petsc/3.12.0
/opt/ohpc/pub/moduledeps/ intel-mvapich2/petsc/3.12.0
/opt/ohpc/pub/moduledeps/ intel-openmpi3/petsc/3.12.0
/opt/ohpc/pub/moduledeps/ intel-openmpi/petsc/3.8.3
/share/apps/modulefiles/ petsc/3.10.5
/share/apps/modulefiles/ petsc/3.11.3
/share/apps/modulefiles/ petsc/3.14.2
/share/apps/modulefiles/ petsc/3.16.4
/share/apps/modulefiles/ petsc/3.8.58

$ grep "gnu8/openmpi3/" /share/apps/modulelist

/opt/ohpc/pub/moduledeps/ gnu8/openmpi3/3.1.4

$ grep "gnu8/8" /share/apps/modulelist

/opt/ohpc/pub/modulefiles/ gnu8/8.3.0

$ module load gnu8/8.3.0 gnu8/openmpi3/3.1.4 gnu8-openmpi3/petsc/3.12.0
$ module list

Currently Loaded Modules:
  1) gnu8/8.3.0   2) gnu8/openmpi3/3.1.4   3) gnu8-openmpi3/phdf5/1.10.5   4) openblas/3.20-noomp   5) gnu8-openmpi3/scalapack/2.0.2   6) gnu8-openmpi3/petsc/3.12.0

It is likely that this will only work on one node, since it links back to the OpenHPC-supplied MPI, which is not customized for our network.
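
To see exactly what a module sets, and which mpirun ends up on your path, Lmod's show command and which are useful:

$ module show gnu8/openmpi3/3.1.4
$ which mpirun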

Spack modulefiles

There are currently 114 programs auto-installed using Spack. These will need to be duplicated for every node type in the cluster, which will take some time. We have begun mostly with bioinformatics programs, which have fewer MPI complications.

$ grep spack /share/apps/modulelist | head -5

/share/apps/spackmodulefiles/ gcc-11.2.1/SKYLAKEX/abinit/9.6.1
/share/apps/spackmodulefiles/ gcc-11.2.1/SKYLAKEX/abyss/2.3.1
/share/apps/spackmodulefiles/ gcc-11.2.1/SKYLAKEX/atompaw/4.2.0.1
/share/apps/spackmodulefiles/ gcc-11.2.1/SKYLAKEX/bamtools/2.5.2
/share/apps/spackmodulefiles/ gcc-11.2.1/SKYLAKEX/bcftools/1.14

As an example, we'll try to run abyss-pe in parallel as in Biowulf-abyss. The module turns out to call mpirun but does not provide it, so we will add some local modules. At this time, all the local Spack software is compiled with gcc 11.2.1 and openmpi 4.1.4 (though not with our builds of them), so we will use those modules. This does work, though we have not tested it on multiple nodes, since it is not clear which version of MPI is actually being linked.

$ module load gcc-11.2.1/SKYLAKEX/abyss/2.3.1
$ abyss-pe np=32 j=8 k=25 n=10 in='*fq' name=OutputPrefix

mpirun -np 32 ABYSS-P -k25 -q3    --coverage-hist=coverage.hist -s OutputPrefix-bubbles.fa  -o OutputPrefix-1.fa *fq 
bash: mpirun: command not found
make: *** [OutputPrefix-1.fa] Error 127

$ module load gcc/11.2.1 openmpi/4.1.4
$ time abyss-pe np=32 j=8 k=25 n=10 in='*fq' name=OutputPrefix

/share/apps/mpi/openmpi-4.1.4/cuda/gcc/bin/mpirun -np 32 ABYSS-P -k25 -q3    --coverage-hist=coverage.hist -s OutputPrefi-bubbles.fa  -o OutputPrefi-1.fa *fq 
ABySS 2.3.1
ABYSS-P -k25 -q3 --coverage-hist=coverage.hist -s OutputPrefi-bubbles.fa -o OutputPrefi-1.fa Thalassiosira-weissflogii_AJA159-02_0ppt_r8_BS-440_trimmed_filtered_1.fq Thalassiosira-weissflogii_AJA159-02_0ppt_r8_BS-440_trimmed_filtered_2.fq
Running on 32 processors
4: Running on host c1411
... etc.
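
The same module loads can go in a batch script. A minimal sketch, assuming the cluster runs Slurm (the job name, node, task, and time requests are placeholders; adjust for your job and queue):

#!/bin/bash
#SBATCH --job-name=abyss
#SBATCH --nodes=1
#SBATCH --ntasks=32
#SBATCH --time=06:00:00

module purge
module load gcc-11.2.1/SKYLAKEX/abyss/2.3.1 gcc/11.2.1 openmpi/4.1.4
abyss-pe np=32 j=8 k=25 n=10 in='*fq' name=OutputPrefix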

Most parallel programs require the selection of a compiler and an MPI version. We usually recommend the following compiler versions (usually select only one, with the exceptions noted below):

module load gcc/11.2.1    
#synonym gnu also works, latest gnu compiler from "Centos 7 Development Tools", enables gcc/g++/gfortran

module load intel/21.2.0
#synonym intelcompiler also works, both intel proprietary icc/icpc/ifort and intel llvm icx/icpx/ifx

module load nvhpc/22.7
#synonym PGI also works; the Nvidia/PGI compiler provides both nvc/nvc++/nvfortran and pgcc/pgc++/pgf77/pgf90/pgf95/pgfortran

module load aocc/3.0
#AMD llvm compiler clang/clang++/flang
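
To confirm which compiler a module has put on your path, a quick check like the following helps (shown for the gcc module; the others work the same way):

$ module load gcc/11.2.1
$ which gcc
$ gcc --version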

If you don't load any modules, there are some very old compilers built into Centos:

gcc --version
gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-44)

clang -v
clang version 3.4.2 (tags/RELEASE_34/dot2-final)
Target: x86_64-redhat-linux-gnu
Thread model: posix
Found candidate GCC installation: /bin/../lib/gcc/x86_64-redhat-linux/4.8.2
Found candidate GCC installation: /bin/../lib/gcc/x86_64-redhat-linux/4.8.5
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.2
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.5
Selected GCC installation: /bin/../lib/gcc/x86_64-redhat-linux/4.8.5

These compilers aren't recommended, but probably suffice for running non-parallel applications compiled with newer gcc.

We recommend the following MPI versions. Definitely select only one (though at runtime mvapich2 and impi should be equivalent):

openmpi/4.1.4
#with gcc, intel, nvhpc

mvapich2/2.3.7
#with gcc, intel

impi/17.0.7
#with gcc, intel

In combination we recommend the following, loading the compiler first and then the MPI module so that the correct libraries are selected:

module load { gcc/11.2.1 | intel/21.2.0 | nvhpc/22.7 } openmpi/4.1.4

module load { gcc/11.2.1 | intel/21.2.0 } mvapich2/2.3.7

module load { gcc/11.2.1 | intel/21.2.0 } impi/17.0.7
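
As a sanity check after loading a compiler and MPI pair, you can build and run a trivial MPI program (hello.c stands in for your own MPI source file):

$ module load gcc/11.2.1 openmpi/4.1.4
$ mpicc hello.c -o hello
$ mpirun -np 4 ./hello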

There are a few situations where you would want multiple compilers loaded (but the compiler and MPI modules loaded first determine which MPI build is used).

(1) Most C++ compilers use the GNU C++ headers and runtime libraries. For a program that uses a lot of relatively recent C++ (LAMMPS is one), you will want a recent gcc loaded to provide those libraries.

This works with the Intel proprietary icpc compiler:

module load intel/17.0.7 openmpi/4.1.4 gcc/11.2.1

If you don't add the third module, icpc will use the libraries from the default Centos g++ 4.8.5, which is quite old and probably can't compile LAMMPS at all.
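
You can check which GCC installation icpc has picked up; icpc -v reports the gcc version it is using for compatibility:

$ module load intel/17.0.7 openmpi/4.1.4 gcc/11.2.1
$ icpc -v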

(2) LLVM-based compilers (aocc/3.0.0 and the intel/21.2.0 icx) try to auto-find the g++ libraries but don't do it quite correctly:

module load aocc/3.0.0
clang++ -v
AMD clang version 12.0.0 (CLANG: AOCC_3.0.0-Build#78 2020_12_10) (based on LLVM Mirror.Version.12.0.0)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /opt/AMD/aocc-compiler-3.0.0/bin
Found candidate GCC installation: /opt/rh/devtoolset-7/root/usr/lib/gcc/x86_64-redhat-linux/7
Found candidate GCC installation: /opt/rh/devtoolset-8/root/usr/lib/gcc/x86_64-redhat-linux/8
Found candidate GCC installation: /opt/rh/devtoolset-9/root/usr/lib/gcc/x86_64-redhat-linux/9
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.2
Found candidate GCC installation: /usr/lib/gcc/x86_64-redhat-linux/4.8.5
Selected GCC installation: /opt/rh/devtoolset-9/root/usr/lib/gcc/x86_64-redhat-linux/9

So it picks the devtoolset-9 libraries in spite of devtoolset-10 and devtoolset-11 being available:

$ ls /opt/rh
devtoolset-10  devtoolset-11  devtoolset-3  devtoolset-7  devtoolset-8  devtoolset-9  

If devtoolset-9 (gcc 9.3.1) is new enough, then that's OK.
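
If you need a newer libstdc++ than devtoolset-9 provides, clang-based compilers accept a --gcc-toolchain flag pointing at a specific GCC installation. The path below follows the devtoolset layout shown above (verify it on your node), and myprog.cpp is a placeholder for your own source:

$ module load aocc/3.0.0
$ clang++ --gcc-toolchain=/opt/rh/devtoolset-11/root/usr -std=c++17 myprog.cpp -o myprog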

(3) Sometimes Intel MKL will link back to the Intel compiler's runtime when using Intel OpenMP instead of GNU OpenMP. This should work:

module load gcc/11.2.1 mkl/20.0.4 openmpi/4.1.4 intel/17.0.7
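
For example, linking the Intel-threaded MKL layer from gcc needs libiomp5, which comes from the Intel compiler module. A sketch, assuming the mkl module sets $MKLROOT and with myprog.c as a placeholder (Intel's MKL Link Line Advisor can confirm the exact libraries for your case):

$ module load gcc/11.2.1 mkl/20.0.4 openmpi/4.1.4 intel/17.0.7
$ gcc myprog.c -o myprog -m64 -I${MKLROOT}/include \
      -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core \
      -liomp5 -lpthread -lm -ldl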