====lammps====
For a reference Intel compilation, see
''/scrfs/apps/lammps/lammps-16Mar18/src/lmp_pinnacle-threaded''. The LAMMPS packages installed in this build are:
/scrfs/apps/lammps/lammps-16Mar18/src$ make ps | grep "YES:"
Installed YES: package CLASS2
Installed YES: package KSPACE
Installed YES: package MANYBODY
Installed YES: package MOLECULE
Installed YES: package USER-FEP
Installed YES: package USER-INTEL
Installed YES: package USER-MESO
Installed YES: package USER-MISC
Installed YES: package USER-MOLFILE
Installed YES: package USER-REAXC
Installed YES: package USER-SMD
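A minimal sketch of how the same package set could be enabled in a fresh source tree. It assumes the standard LAMMPS ''make yes-<package>'' and ''make <machine>'' targets and is meant to be run from the ''src'' directory. The exact steps used for the reference build are not recorded here, and some packages may need external libraries configured first. The function name ''build_commands'' is only illustrative.

```shell
#!/bin/sh
# Packages reported as installed by "make ps" in the listing above.
PACKAGES="CLASS2 KSPACE MANYBODY MOLECULE USER-FEP USER-INTEL USER-MESO USER-MISC USER-MOLFILE USER-REAXC USER-SMD"

# Print the make commands as a dry run (hypothetical helper, not part of LAMMPS).
build_commands() {
    for pkg in $PACKAGES; do
        # LAMMPS package targets are lowercase: yes-<package>
        echo "make yes-$(echo "$pkg" | tr 'A-Z' 'a-z')"
    done
    # The machine target matches MAKE/OPTIONS/Makefile.pinnacle-threaded
    echo "make pinnacle-threaded"
}

build_commands
```

Piping the output to ''sh'' from inside ''src'' would run the commands. Printing them first makes it easy to check the list against ''make ps''.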
The Makefile is a slight modification of ''Makefile.knl'', which is included with LAMMPS. It was compiled on Trestles so that the binary is upward compatible with all systems. For the Xeon Phis you would want to use the ''-xMIC-AVX512'' flag from the original.
/scrfs/apps/lammps/lammps-16Mar18/src/MAKE/OPTIONS$ diff Makefile.knl Makefile.pinnacle-threaded
10c10
< OPTFLAGS = -xMIC-AVX512 -O2 -fp-model fast=2 -no-prec-div -qoverride-limits
---
> OPTFLAGS = -xHOST -axsse4.2,AVX,CORE-AVX512 -O2 -fp-model fast=2 -no-prec-div -qoverride-limits
12c12,13
< -DLMP_INTEL_USELRT -DLMP_USE_MKL_RNG $(OPTFLAGS)
---
> -DLMP_INTEL_USELRT -DLMP_USE_MKL_RNG $(OPTFLAGS) \
> -I/scrfs/apps/intel/parallel_studio_xe_2019_update_4/compilers_and_libraries_2019.5.281/linux/tbb/include
18c19
< LIB = -ltbbmalloc
---
> LIB = -L$(MKLROOT)/lib/intel64/ -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -ltbbmalloc
33c34
< LMP_INC = -DLAMMPS_GZIP -DLAMMPS_JPEG
---
> LMP_INC = -DLAMMPS_GZIP -DLAMMPS_MEMALIGN=64 -DLAMMPS_JPEG
119c120
< cc -O -o $@ $<
---
> icc -O -o $@ $<
The best known run configuration on all (non-Phi) systems uses MPI ranks = physical cores/2 with two OpenMP threads per rank, like so for a 32-core Pinnacle node:
module load intel/19.0.5 impi/19.0.5 mkl/19.0.5
mpirun -np 16 -genv OMP_NUM_THREADS=2 /scrfs/apps/lammps/lammps-16Mar18/src/lmp_pinnacle-threaded -pk intel 0 omp 2 -sf intel out
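The cores/2 rule can be sketched as a small script that builds the command for a given node size. The helper name ''ranks_for_cores'' and the variable names are illustrative assumptions. The script prints the command instead of running it and leaves out the input and output arguments.

```shell
#!/bin/sh
# One MPI rank per two physical cores, as recommended above (hypothetical helper).
ranks_for_cores() {
    echo $(( $1 / 2 ))
}

CORES=32      # physical cores per node (32 on Pinnacle; set by hand here)
THREADS=2     # OpenMP threads per MPI rank
NP=$(ranks_for_cores "$CORES")
LMP=/scrfs/apps/lammps/lammps-16Mar18/src/lmp_pinnacle-threaded

# Print the command so it can be checked before it is run.
echo "mpirun -np $NP -genv OMP_NUM_THREADS=$THREADS $LMP -pk intel 0 omp $THREADS -sf intel"
```

With ''CORES=16'' the same rule gives ''np=8'', which matches the Razor entry in the benchmarks.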
Benchmarks for one node on each system:
^ System ^ Cores ^ np ^ Time ^
| Razor | 16 | 8 | 1:56 |
| Trestles | 32 | 16 | 2:02 |
| Pinnacle | 32 | 16 | 0:46 |