**This is an old revision of the document!**

Optimization

How to make your code run faster. Here we focus on compiling someone else's code on Linux for scientific computing. Writing your own code expands the problem considerably; for that, you might check the free textbooks and supplemental material at https://theartofhpc.com/.

Around 2015 this was a simpler exercise. One compiler (Intel proprietary) was the best in most situations. Now there are five or six compilers, each with somewhat different options. There are three major MPI variants, each of which can work with each compiler. And you usually need to do at least a little custom compiling for each type of hardware that you plan to run on. Here are the major factors in making your code faster.

Compilers
  • Intel proprietary: icc/icpc/ifort
  • Intel oneAPI Clang/LLVM based: icx/icpx/ifx
  • AMD Clang/LLVM based: clang/clang++/flang
  • NVidia PGI based: pgcc/pgc++/pgf90
  • GNU: gcc/g++/gfortran
  • Base Clang/LLVM is also available, but it is not necessary given the two optimized versions above
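As a quick check of which of the compilers listed above are available in the current shell, something like the following sketch may help. The helper name check_compiler is hypothetical, and only the C driver of each family is probed. On many clusters a compiler appears on PATH only after the corresponding environment module is loaded, so "not found" does not necessarily mean it is not installed.

```shell
#!/bin/sh
# Hypothetical helper: report whether a compiler driver is on PATH.
check_compiler() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found"
  fi
}

# C drivers from the list above; the C++ and Fortran drivers can be
# checked the same way.
for cc in icc icx clang pgcc gcc; do
  check_compiler "$cc"
done
```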

For each of these you need to find the right options to enable your compute hardware. The most important options are:

  • Optimization level: usually one of
      • -O0: no optimization, for debugging
      • -O1: light optimization, for fast compiles
      • -O2: more optimization
      • -O3: more optimization
      • -Ofast: usually -O3 with reduced numerical precision
  • Set the target architecture, with examples for AHPCC. The similar generations of Intel E5 processors are mostly distinguished by their floating-point instruction sets: nehalem (SSE4.2), sandybridge/ivybridge (AVX), haswell/broadwell (AVX2).
      • icc -x{sandybridge|ivybridge|haswell|skylake-avx512|HOST (compile host)}, limited options for AMD
      • icx -x{mostly the same as icc}
      • clang -march=znver{1|2|3|4}, limited options for Intel
      • pgcc -tp={bulldozer|sandybridge|ivybridge|haswell|skylake|zen|zen2|zen3|native (compile host)}
      • gcc -march={bdver1|nehalem|sandybridge|ivybridge|haswell|skylake-avx512|znver1|znver2|znver3|native}
      • gcc -mtune={bdver1|nehalem|sandybridge|haswell|skylake-avx512|znver1|znver2|znver3}
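The correspondence between Intel generations and floating-point instruction sets described above can be turned into a rough lookup. The following is only a sketch under stated assumptions: pick_march is a hypothetical helper, the mapping covers only the Intel generations named above (an AMD node reporting avx2 would also be mapped to haswell, where a znver value would normally be preferable), the fallback to native is an arbitrary choice, and mycode.c is a placeholder file name. It reads the flags line of /proc/cpuinfo on the node where it runs and prints a candidate gcc command rather than executing it.

```shell
#!/bin/sh
# Hypothetical helper: map CPU feature flags (as listed on the "flags"
# line of /proc/cpuinfo) to a gcc -march value for Intel generations.
pick_march() {
  flags=" $1 "   # pad with spaces so whole-word matches work
  case "$flags" in
    *" avx512f "*) echo "skylake-avx512" ;;
    *" avx2 "*)    echo "haswell" ;;
    *" avx "*)     echo "sandybridge" ;;
    *" sse4_2 "*)  echo "nehalem" ;;
    *)             echo "native" ;;   # assumed fallback: let gcc decide
  esac
}

# Read the flags of the current node if /proc/cpuinfo is available.
if [ -r /proc/cpuinfo ]; then
  cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo)
else
  cpu_flags=""
fi

# Print (do not run) a candidate compile line; mycode.c is a placeholder.
echo "gcc -O2 -march=$(pick_march "$cpu_flags") -c mycode.c"
```

Run this on a compute node of the target type rather than on the login node, since the two may have different processors. That difference is also why native (compile host) is only safe when compiling on the same hardware the code will run on.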


optimization.1678397696.txt.gz · Last modified: 2023/03/09 21:34 by root