Compiling for Intel with Intel Composer XE, MKL, and Intel MPI
Upcoming Changes to the Intel Compiler Suite
Intel is transitioning its compiler suite to an LLVM-based suite called oneAPI. The Composer XE suite (and related tools) will move to a "Legacy Support" state by 2023 (C/C++ compilers) or 2024 (Fortran compiler).[1]
Intel oneAPI is installed on Picotte. Please see: Intel oneAPI
General Notes
- Static linking is not possible because Red Hat does not distribute a static libm (standard math library)
Motivation
Intel Compilers + MKL can produce executables which run significantly faster on Intel CPUs than those produced by GCC. For example, see the metrics reported here, which compare the linear algebra performance of MKL against ATLAS (Automatically Tuned Linear Algebra Software).
Versions Available
The cluster management vendor Bright Computing provides the Intel Composer suite as multiple modules:
[juser@proteusi01 ~]$ module avail intel
----------------------------------------------------- /cm/shared/modulefiles -----------------------------------------------------
intel/compiler/64/14.0/2013_sp1.3.174 intel-cluster-checker/2.1.2 intel-mpi/64/4.1.1/036
intel/ipp/64/8.1/2013_sp1.3.174 intel-cluster-runtime/ia32/3.6 intel-mpi/mic/4.1.1/036
intel/mkl/64/11.1/2013_sp1.3.174 intel-cluster-runtime/intel64/3.6 intel-tbb-oss/ia32/42_20140601oss
intel/sourcechecker/64/14.0/2013_sp1.3.174 intel-cluster-runtime/mic/3.6 intel-tbb-oss/intel64/42_20140601oss
intel/tbb/32/4.2/2013_sp1.3.174 intel-itac/8.1.3/037
intel/tbb/64/4.2/2013_sp1.3.174 intel-mpi/32/4.1.1/036
---------------------------------------------------- /mnt/HA/opt/modulefiles -----------------------------------------------------
intel/composerxe/2013.3.174 intel/composerxe/2015.1.133 intel/composerxe/2016.0.109 intel/composerxe/current
The modules under /cm/shared/modulefiles are provided by Bright. The modules under /mnt/HA/opt/modulefiles are locally-installed.
For convenience, use the locally-installed modules.
Intel Composer XE
In all the versions described below, all associated packages (MKL, TBB, IPP) are loaded with a single module.
Intel Composer XE on Picotte
Version 2020u4
Use the modulefile:
intel/composerxe/2020
Current minor version is 2020u4.
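For example, to load the suite and confirm which compiler versions it provides (icc and ifort are the C/C++ and Fortran driver commands supplied by the suite):
module load intel/composerxe/2020
icc --version
ifort --version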
Intel Composer XE on Proteus
Version 2013
Intel Composer XE is a suite of tools including compilers, parallel debugger, optimized libraries, the Math Kernel Library, and tools for profiling and tuning applications.[2]
[juser@proteusi01 ~]$ module load intel/composerxe/2013.3.174
With Composer XE 2013.3.174, MKL 11.1 is installed.
Version 2015
Version 2015 is also installed, with all components loaded by a single module:
[juser@proteusi01 ~]$ module load intel/composerxe/2015.1.133
With Composer XE 2015.1.133, MKL 11.2 is installed.
Version 2016
Version 2016 is installed, with all components loaded by a single module:
[juser@proteusi01 ~]$ module load intel/composerxe/2016.0.109
With Composer XE 2016.0.109, MKL 11.3 is installed.
Optimization Flags
Please see Hardware for details on what hardware-specific optimizations may be used.
- 2015-04-15: -xHost may be used, since the CPU architecture of proteusi01 is identical to that of all Intel compute nodes
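As a minimal sketch (the source file names are placeholders), a hardware-targeted build with the Intel compilers might look like:
icc -O2 -xHost -o myprog myprog.c
ifort -O2 -xHost -o myprog myprog.f90
The -xHost flag targets the highest instruction set supported by the build host, which is safe here because proteusi01 matches the Intel compute nodes.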
Intel Math Kernel Library (MKL)
For best performance on Intel CPUs, do not use generic linear algebra libraries (BLAS, LAPACK). Instead, use the MKL.[3][4]
- MKL 11.1 is installed with Composer XE 2013
- MKL 11.2 is installed with Composer XE 2015
- MKL 11.3 is installed with Composer XE 2016
The installations on Proteus also include interfaces for BLAS95, LAPACK95, FFTW2 (double precision), and FFTW3 (double precision).
Choice of Integer Size
The MKL offers the choice of standard 32-bit integers (denoted LP64) or 64-bit integers (denoted ILP64).[5][6] The installations on Proteus default to 32-bit integers (LP64).
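As an illustrative sketch (not a complete link line; generate the exact flags with the Link Line Advisor for your MKL version), selecting ILP64 from Fortran means compiling with 8-byte default integers and linking the ILP64 interface layer:
ifort -i8 -I${MKLROOT}/include myprog.f90 -L${MKLROOT}/lib/intel64 -lmkl_intel_ilp64 -lmkl_sequential -lmkl_core -lpthread -lm
For LP64 (the default), drop -i8 and link -lmkl_intel_lp64 instead; C code uses -DMKL_ILP64 in place of -i8.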
Interfaces for BLAS95, LAPACK95, FFTW2, and FFTW3
The interfaces for BLAS95, LAPACK95, FFTW2, and FFTW3 are available, as well. They are provided as static library files, compiled locally against the MKL, and are located in the directory $MKLROOT/lib/intel64. The library files themselves are:
libmkl_blas95_lp64.a
libmkl_blas95_ilp64.a
libmkl_lapack95_lp64.a
libmkl_lapack95_ilp64.a
libfftw3xf_intel.a
libfftw3x_cdft_ilp64.a
libfftw3x_cdft_lp64.a
libfftw3xc_intel.a
libfftw2xf_single_intel.a
libfftw2xf_double_intel.a
libfftw2x_cdft_DOUBLE_lp64.a
libfftw2x_cdft_SINGLE_lp64.a
libfftw2xc_single_intel.a
libfftw2xc_double_intel.a
As these are not part of the base MKL libraries, the Link Line Advisor will not generate link flags for these libraries. You should manually include them in your link line, e.g.
-L$MKLROOT/lib/intel64 -lmkl_blas95_lp64 -lfftw3xc_intel
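For instance, a Fortran build using the BLAS95 interface with the sequential MKL might be linked as follows (a sketch only; the module-file include path shown, ${MKLROOT}/include/intel64/lp64, is the usual location but should be verified for the loaded MKL version):
ifort -I${MKLROOT}/include/intel64/lp64 myprog.f90 -L${MKLROOT}/lib/intel64 -lmkl_blas95_lp64 -lmkl_intel_lp64 -lmkl_sequential -lmkl_core -lpthread -lm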
Compiling Numpy and Scipy with MKL
Intel has instructions on using Intel Compilers + MKL to compile Numpy and Scipy:
https://software.intel.com/en-us/articles/numpyscipy-with-intel-mkl
They also include comparative performance numbers (against ATLAS).
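The essential step in those instructions is pointing NumPy's site.cfg at the MKL before building. A sketch, assuming the intel/composerxe module is loaded and site.cfg is written in the NumPy source directory (the paths are placeholders; substitute the location given by $MKLROOT):
cat > site.cfg <<EOF
[mkl]
library_dirs = /path/to/mkl/lib/intel64
include_dirs = /path/to/mkl/include
mkl_libs = mkl_rt
lapack_libs =
EOF
NumPy and SciPy are then built with the Intel compilers as described in the linked article.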
MKL Link Line Advisor
Linking against the MKL can be complicated: consult the MKL User's Guide for detailed documentation.[7] The MKL Link Line Advisor web-based tool will generate the proper compilation options to compile and link against the MKL:[8]
http://software.intel.com/en-us/articles/intel-mkl-link-line-advisor
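For reference, the Advisor's output for a common case (Intel compilers, dynamic linking, LP64 interface, OpenMP threading) resembles the following; treat it as a sketch and regenerate it for the exact MKL version in use:
Compile options: -I${MKLROOT}/include
Link line: -L${MKLROOT}/lib/intel64 -lmkl_intel_lp64 -lmkl_intel_thread -lmkl_core -liomp5 -lpthread -lm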
MPI implementation
Intel MPI
NOTE: As of 2015-01-01, we do not have a license for Intel MPI. Please use MVAPICH2 or Open MPI (the latter is recommended).
IN PROGRESS
Intel MPI is Intel's implementation of MPI-2.[9][10] It is available via the module:
[juser@proteusi01 ~]$ module load intel-mpi/64
The compiler commands are:
- mpiicc (note the two letters "i")
- mpiifort
See the Intel MPI for Linux Getting Started Guide.[11] Also see the article on Message Passing Interface.
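Were Intel MPI licensed and its module loaded, a basic build and test run would be a sketch along these lines (hello.c and the rank count are placeholders):
mpiicc -o hello hello.c
mpirun -np 4 ./hello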
Open MPI
For the 2013 version, use:
proteus-openmpi/intel/64/1.8.1-mlnx-ofed
For the 2015 version, use:
proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
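For example, pairing the 2015 compilers with their matching Open MPI build (the source file name is a placeholder; mpicc and mpifort are the Open MPI wrapper commands, which invoke icc and ifort here):
module load intel/composerxe/2015.1.133
module load proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
mpicc -O2 -xHost -o hello hello.c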
Hybrid MPI-OpenMP
Intel MPI supports hybrid MPI-OpenMP code.[12]
- Use the thread-safe MPI library by passing the compiler option -mt_mpi
- Set the environment variable I_MPI_PIN_DOMAIN to "omp": export I_MPI_PIN_DOMAIN=omp. This sets the pinning domain size to be equal to the value given by the environment variable OMP_NUM_THREADS. If OMP_NUM_THREADS is not set, Intel MPI will assume all cores are to be used. (A full example appears below.)
NOTE: Grid Engine may assign only some of the cores in a node to any given MPI domain.
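Putting these pieces together for an Intel MPI run, a sketch might be (the source name, rank count, and thread count are illustrative; -openmp is the OpenMP flag for compilers of this era, -qopenmp on newer releases):
mpiicc -openmp -mt_mpi -o hybrid hybrid.c
export OMP_NUM_THREADS=8
export I_MPI_PIN_DOMAIN=omp
mpirun -np 4 ./hybrid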
Recommended Combination for Proteus
This is the combination of Intel compilers/libraries and MPI implementation that we recommend:
intel/composerxe/2015.1.133
proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
This combination supports hybrid OpenMP-MPI code, though performance improvement of hybrid code over MPI-only may be small.
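With those two modules loaded, a hybrid build and run is a sketch along these lines (source name, thread count, and rank count are placeholders; -openmp is the Intel OpenMP flag for this compiler generation):
mpicc -O2 -xHost -openmp -o hybrid hybrid.c
export OMP_NUM_THREADS=4
mpirun -np 8 ./hybrid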
See Also
- Compiling with GCC
- For a concrete example of using the Intel 2015 + OpenMPI toolchain, see Compiling LAMMPS
References
[1] Intel oneAPI Product Fact Sheet (Dec 2020)
[2] Intel Composer XE information website
[3] Intel MKL information website
[4] Intel MKL 11.1 Reference Manual
[5] Intel Math Kernel Library for Linux OS User's Guide: Using the ILP64 Interface vs. LP64 Interface
[6] Intel Math Kernel Library for Linux OS User's Guide: Support for ILP64 Programming
[7] Intel Math Kernel Library for Linux OS User's Guide
[8] Intel MKL Link Line Advisor
[9] Intel MPI 4.1 Reference Manual
[10] Intel MPI Reference Manual - Interoperability with OpenMP* API
[11] IntelMPIforLinuxGettingStarted.pdf
[12] Intel Developer Zone - Hybrid applications: Intel MPI Library and OpenMP