This shows you the differences between two versions of the page.
|
optimization [2023/03/09 22:07] root T |
optimization [2023/03/09 22:16] (current) root |
||
|---|---|---|---|
| Line 8: | Line 8: | ||
| * Intel proprietary: | * Intel proprietary: | ||
| - | |||
| * Intel oneAPI Clang/LLVM based: icx/ | * Intel oneAPI Clang/LLVM based: icx/ | ||
| - | |||
| * AMD Clang/LLVM based: clang/ | * AMD Clang/LLVM based: clang/ | ||
| - | |||
| * NVidia PGI based: pgcc/ | * NVidia PGI based: pgcc/ | ||
| - | |||
| * GNU: gcc/ | * GNU: gcc/ | ||
| - | |||
| * Also base Clang/LLVM is available, but not necessary with two optimized versions | * Also base Clang/LLVM is available, but not necessary with two optimized versions | ||
| Line 51: | Line 46: | ||
| == OpenMP == | == OpenMP == | ||
| - | The automated parallelization is not usually very good, so it requires directives in the code for good performance | + | Automatic parallelization is usually not very effective, so good performance requires OpenMP directives in the code. In addition, a compiler option is generally necessary to enable OpenMP. |
| * icc -qopenmp -parallel | * icc -qopenmp -parallel | ||
| Line 66: | Line 61: | ||
| These include | These include | ||
| - | * BLAS and LAPACK: | + | * BLAS and LAPACK: Intel MKL, AMD AOCL, OpenBLAS |
| - | * FFT; | + | * FFT: FFTW, MKL, AOCL |
| - | * Solvers: | + | * Solvers: |
| + | * Random Numbers: AOCL, MKL | ||
| ==MPI Versions== | ==MPI Versions== | ||
| - | * Intel MPI: usually the easiest as it has run-time interfaces | + | * Intel MPI: usually the easiest as it has run-time interfaces |
| - | * Open MPI: often the fastest, must be compiled with the compiler | + | * Open MPI: often the fastest, must be compiled with the compiler |
| - | * MVAPICH: (MPICH for Infiniband): | + | * MVAPICH: (MPICH for Infiniband): |
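
As an illustration of the OpenMP note in the revised text (directives in the code, plus a compiler option to enable them), here is a minimal sketch in C. The function name `parallel_sum` is hypothetical and not part of the wiki page; it only shows the general shape of a directive-based loop.

```c
#include <stddef.h>

/* Hypothetical example: sum an array using an OpenMP reduction.
   If the compiler is not given an OpenMP option, the pragma is
   ignored and the loop simply runs serially. */
double parallel_sum(const double *x, size_t n)
{
    double total = 0.0;
    long i;

    /* Each thread accumulates a private partial sum; OpenMP combines
       the partial sums into total when the loop finishes. */
    #pragma omp parallel for reduction(+:total)
    for (i = 0; i < (long)n; i++) {
        total += x[i];
    }
    return total;
}
```

With GCC the directive takes effect when compiling with `-fopenmp`; for the Intel compiler the page lists `-qopenmp`. Without such an option the same source still compiles and produces the same result, only single-threaded.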