This shows you the differences between two versions of the page.
quantum_espresso [2020/12/30 21:09] root |
quantum_espresso [2022/06/20 18:32] root |
||
---|---|---|---|
Line 119: | Line 119: | ||
< | < | ||
OMP=" | OMP=" | ||
+ | VERSION=6.6 | ||
./ | ./ | ||
SCALAPACK_LIBS=" | SCALAPACK_LIBS=" | ||
Line 128: | Line 129: | ||
CFLAGS=" | CFLAGS=" | ||
--with-hdf5=/ | --with-hdf5=/ | ||
- | --enable-parallel $OMP --prefix=/ | + | --enable-parallel $OMP --prefix=/ |
</ | </ | ||
+ | ==Update 2022== | ||
+ | QE 6.8 and 7.1 are installed, with two versions compiled with the Intel compiler ("
+ | " | ||
+ | " | ||
+ | Both use '' | ||
+ | The '' | ||
+ | The " | ||
+ | '' | ||
+ | On AMD systems, explicitly set the following after the module load:
+ | < | ||
+ | export MKL_DEBUG_CPU_TYPE=0 | ||
+ | </ | ||
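To make the AMD-only setting harder to forget, a job script could apply it conditionally. The following is a minimal sketch, not the site's actual setup: the helper names `is_amd` and `cpu_vendor` are hypothetical, the vendor lookup assumes a Linux `/proc/cpuinfo`, and the value of `MKL_DEBUG_CPU_TYPE` is copied unchanged from the line above.

```shell
#!/bin/sh
# Hypothetical helper: succeeds when the given vendor string names an AMD CPU.
is_amd() {
    [ "$1" = "AuthenticAMD" ]
}

# Hypothetical helper: print the CPU vendor from /proc/cpuinfo (Linux only).
# Prints nothing if the file is missing or has no vendor_id line.
cpu_vendor() {
    awk -F': *' '/^vendor_id/ {print $2; exit}' /proc/cpuinfo 2>/dev/null
}

# Run this after the module load, as described above.
if is_amd "$(cpu_vendor)"; then
    # Value taken as-is from the instructions above.
    export MKL_DEBUG_CPU_TYPE=0
fi
```

On Intel nodes the variable is left untouched, so the same job script could be submitted to either kind of node.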
+ | |||
+ | A small pw.x input was used to allow some parameter sweeps. | ||
+ | < | ||
+ | Single Node Results | ||
+ | |||
+ | {OMP_NUM_THREADS=1|2} time mpirun -np {16|32|64} \ | ||
+ | / | ||
+ | -nk {1|4|8|16} <scf.in >log | ||
+ | |||
+ | System | ||
+ | |||
+ | Pinnacle I 32 Intel 6130 6.8 skylake | ||
+ | Pinnacle I 32 Intel 6130 6.8 skylake | ||
+ | Pinnacle I 32 Intel 6130 6.8 skylake | ||
+ | Pinnacle I 32 Intel 6130 6.8 skylake | ||
+ | |||
+ | Pinnacle I 32 Intel 6130 6.8 skylake | ||
+ | Pinnacle I 32 Intel 6130 6.8 bulldozer | ||
+ | |||
+ | Trestles | ||
+ | Trestles | ||
+ | Trestles | ||
+ | |||
+ | Pinnacle II 64 AMD 7543 6.8 bulldozer | ||
+ | Pinnacle II 64 AMD 7543 6.8 bulldozer | ||
+ | Pinnacle II 64 AMD 7543 6.8 bulldozer | ||
+ | Pinnacle II 64 AMD 7543 6.8 bulldozer | ||
+ | Pinnacle II 64 AMD 7543 7.1 bulldozer | ||
+ | </ | ||
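The command template above can be expanded into an explicit sweep. The following is a dry-run sketch that only prints the command lines. It enumerates the full grid of thread, rank, and pool counts from the template, which may be more combinations than were actually run. `PW_X` is a placeholder for the pw.x path, which is not reproduced here, and the per-run log file names are an assumption added so that runs do not overwrite each other.

```shell
#!/bin/sh
# Placeholder for the pw.x executable; set PW_X to the real path before use.
PW_X="${PW_X:-pw.x}"

# Print one command line per (threads, ranks, pools) combination.
sweep_commands() {
    for threads in 1 2; do
        for ranks in 16 32 64; do
            for pools in 1 4 8 16; do
                # Log name is hypothetical: one file per combination.
                echo "OMP_NUM_THREADS=$threads time mpirun -np $ranks $PW_X -nk $pools <scf.in >log.t$threads.np$ranks.nk$pools"
            done
        done
    done
}

sweep_commands
```

Review the printed lines and drop any combination that would oversubscribe the node (ranks times threads exceeding the core count) before running them.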
+ | |||
+ | Conclusions for this sample program: | ||
+ | |||
+ | '' | ||
+ | |||
+ | QE 7.1 is slightly slower than 6.8 on Pinnacle I & II and slightly faster on Trestles. | ||
+ | |||
+ | '' | ||
+ | |||
+ | The bulldozer version runs on Intel but is significantly slower than the skylake version. | ||
+ | |||
+ | The uncrowded Trestles system performs relatively well on QE, provided the job fits in its 64 GB of memory.