This shows you the differences between two versions of the page.
optimization [2023/03/09 22:07] root T |
optimization [2023/03/09 22:16] (current) root |
||
---|---|---|---|
Line 8: | Line 8: | ||
* Intel proprietary: | * Intel proprietary: | ||
- | |||
* Intel oneAPI Clang/LLVM based: icx/ | * Intel oneAPI Clang/LLVM based: icx/ | ||
- | |||
* AMD Clang/LLVM based: clang/ | * AMD Clang/LLVM based: clang/ | ||
- | |||
* NVIDIA PGI based: pgcc/ | * NVIDIA PGI based: pgcc/ | ||
- | |||
* GNU: gcc/ | * GNU: gcc/ | ||
- | |||
* Also base Clang/LLVM is available, but not necessary with two optimized versions | * Also base Clang/LLVM is available, but not necessary with two optimized versions | ||
Line 51: | Line 46: | ||
== OpenMP == | == OpenMP == | ||
- | Automatic parallelization is usually not very good, so good performance requires directives in the code | + | Automatic parallelization is usually not very good, so good performance requires directives in the code. A compiler option is generally also needed to enable OpenMP. |
* icc -qopenmp -parallel | * icc -qopenmp -parallel | ||
Line 66: | Line 61: | ||
These include | These include | ||
- | * BLAS and LAPACK: | + | * BLAS and LAPACK: Intel MKL, AMD AOCL, OpenBLAS |
- | * FFT; | + | * FFT: FFTW, MKL, AOCL |
- | * Solvers: | + | * Solvers: |
+ | * Random Numbers: AOCL, MKL | ||
== MPI Versions ==
- | * Intel MPI: usually the easiest as it has run-time interfaces | + | * Intel MPI: usually the easiest as it has run-time interfaces |
- | * Open MPI: often the fastest, must be compiled with the compiler | + | * Open MPI: often the fastest, must be compiled with the compiler |
- | * MVAPICH (MPICH for InfiniBand): | + | * MVAPICH (MPICH for InfiniBand): |
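
The OpenMP hunk above says that good performance needs directives in the code and that a compiler option is generally required to enable them. A minimal sketch of what that can look like in C follows. The function names `openmp_enabled` and `parallel_sum` are hypothetical and chosen only for illustration; they are not taken from the page.

```c
#include <assert.h>
#include <stddef.h>

#ifdef _OPENMP
#include <omp.h>   /* only available when OpenMP is enabled at compile time */
#endif

/* Hypothetical helper: returns 1 if this file was compiled with OpenMP
 * enabled (the compiler then defines _OPENMP), otherwise 0. */
int openmp_enabled(void) {
#ifdef _OPENMP
    return 1;
#else
    return 0;
#endif
}

/* Hypothetical example: sums x[0..n-1].
 * With the OpenMP option the loop is split across threads and the partial
 * sums are combined by the reduction clause. Without the option the
 * directive is ignored and the loop runs serially. */
double parallel_sum(const double *x, long n) {
    assert(x != NULL || n == 0);
    double total = 0.0;
    #pragma omp parallel for reduction(+:total)
    for (long i = 0; i < n; i++) {
        total += x[i];
    }
    return total;
}
```

The same source builds with or without OpenMP, which is why the compiler option matters: with the Intel compiler listed above that would be `icc -qopenmp`, and with GNU it is `gcc -fopenmp`. Without the option, the directive has no effect and `openmp_enabled()` returns 0.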