For a reference Intel compilation, see
/scrfs/apps/lammps/lammps-16Mar18/src/lmp_pinnacle-threaded.
The installed LAMMPS packages are:
    /scrfs/apps/lammps/lammps-16Mar18/src$ make ps | grep "YES:"
    Installed YES: package CLASS2
    Installed YES: package KSPACE
    Installed YES: package MANYBODY
    Installed YES: package MOLECULE
    Installed YES: package USER-FEP
    Installed YES: package USER-INTEL
    Installed YES: package USER-MESO
    Installed YES: package USER-MISC
    Installed YES: package USER-MOLFILE
    Installed YES: package USER-REAXC
    Installed YES: package USER-SMD
The Makefile is a slight modification of the Makefile.knl included with LAMMPS. The binary was compiled on Trestles so that it is upward compatible with all systems. For the Phis you would want to keep the -xMIC-AVX512 flag from the original.
    /scrfs/apps/lammps/lammps-16Mar18/src/MAKE/OPTIONS$ diff Makefile.knl Makefile.pinnacle-threaded
    10c10
    < OPTFLAGS = -xMIC-AVX512 -O2 -fp-model fast=2 -no-prec-div -qoverride-limits
    ---
    > OPTFLAGS = -xHOST -axsse4.2,AVX,CORE-AVX512 -O2 -fp-model fast=2 -no-prec-div -qoverride-limits
    12c12,13
    < -DLMP_INTEL_USELRT -DLMP_USE_MKL_RNG $(OPTFLAGS)
    ---
    > -DLMP_INTEL_USELRT -DLMP_USE_MKL_RNG $(OPTFLAGS) \
    > -I/scrfs/apps/intel/parallel_studio_xe_2019_update_4/compilers_and_libraries_2019.5.281/linux/tbb/include
    18c19
    < LIB = -ltbbmalloc
    ---
    > LIB = -L$(MKLROOT)/lib/intel64/ -lmkl_lapack95_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -ltbbmalloc
    33c34
    < LMP_INC = -DLAMMPS_GZIP -DLAMMPS_JPEG
    ---
    > LMP_INC = -DLAMMPS_GZIP -DLAMMPS_MEMALIGN=64 -DLAMMPS_JPEG
    119c120
    < cc -O -o $@ $<
    ---
    > icc -O -o $@ $<
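To rebuild something similar in a fresh source tree, the package list and Makefile name above suggest a build sequence along the following lines. This is a dry-run sketch that only prints the commands. It assumes the standard LAMMPS targets `make yes-<package>` and `make <machine>` (with Makefile.pinnacle-threaded present under src/MAKE/OPTIONS); the source does not state that the reference binary was built in exactly this way, and the helper name `print_build_cmds` is made up for illustration.

```shell
#!/bin/sh
# Packages reported as installed by `make ps` above.
PACKAGES="CLASS2 KSPACE MANYBODY MOLECULE USER-FEP USER-INTEL USER-MESO USER-MISC USER-MOLFILE USER-REAXC USER-SMD"

# Machine name, taken from the suffix of Makefile.pinnacle-threaded.
MACHINE="pinnacle-threaded"

# Hypothetical helper: print one build command per line (dry run only).
print_build_cmds() {
    for pkg in $PACKAGES; do
        # LAMMPS package targets are lower case, e.g. `make yes-user-intel`.
        printf 'make yes-%s\n' "$(printf '%s' "$pkg" | tr '[:upper:]' '[:lower:]')"
    done
    # Assumption: `make <machine>` picks up Makefile.<machine> from src/MAKE/OPTIONS.
    printf 'make %s\n' "$MACHINE"
}

print_build_cmds
```

To actually build, the printed commands would be run from the src directory, presumably with the Intel compiler and MKL modules loaded first.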
The best known run configuration on all (non-Phi) systems uses MPI processes = physical cores/2 with 2 OpenMP threads per process, like so for a 32-core Pinnacle node:
    module load intel/19.0.5 impi/19.0.5 mkl/19.0.5
    mpirun -np 16 -genv OMP_NUM_THREADS=2 /scrfs/apps/lammps/lammps-16Mar18/src/lmp_pinnacle-threaded -pk intel 0 omp 2 -sf intel <in.dpd >out
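The same command line can be derived for other node sizes from the physical core count. The sketch below is illustrative only: the variable names (`CORES`, `OMP`, `NP`, `CMD`) are assumptions, the core count is set by hand rather than detected, and the script prints the command instead of running it.

```shell
#!/bin/sh
# Physical cores per node; set by hand (32 on Pinnacle, 16 on Razor per this page).
CORES=32
# OpenMP threads per MPI process, as in the run line above.
OMP=2
# MPI processes = physical cores / 2.
NP=$((CORES / OMP))

LMP=/scrfs/apps/lammps/lammps-16Mar18/src/lmp_pinnacle-threaded

# Assemble the command as text only (dry run); in.dpd and out are the
# input and output files used in the example above.
CMD="mpirun -np $NP -genv OMP_NUM_THREADS=$OMP $LMP -pk intel 0 omp $OMP -sf intel <in.dpd >out"
echo "$CMD"
```

With `CORES=16` this gives `-np 8`, which matches the Razor entry in the benchmarks.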
Benchmarks across systems for one node are:

    System      Cores   np   Time
    Razor       16      8    1:56
    Trestles    32      16   2:02
    Pinnacle    32      16   0:46
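For a rough per-node comparison, the times can be converted to seconds and expressed as ratios. This sketch assumes the benchmark times are minutes:seconds of wall-clock time, which the page does not state; `to_seconds` is a helper written just for this example.

```shell
#!/bin/sh
# Assumption: benchmark times are M:SS wall-clock times.
# Hypothetical helper: convert M:SS to a whole number of seconds.
to_seconds() {
    printf '%s\n' "$1" | awk -F: '{ print $1 * 60 + $2 }'
}

RAZOR=$(to_seconds 1:56)
TRESTLES=$(to_seconds 2:02)
PINNACLE=$(to_seconds 0:46)

# Ratio of each older system's time to Pinnacle's time for one node.
awk -v r="$RAZOR" -v t="$TRESTLES" -v p="$PINNACLE" 'BEGIN {
    printf "Razor / Pinnacle:    %.2f\n", r / p
    printf "Trestles / Pinnacle: %.2f\n", t / p
}'
```

Under that assumption, a Pinnacle node finishes the benchmark roughly 2.5 times faster than a Razor or Trestles node.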