Compiling for AMD with Open64
Hardware♯
- AMD nodes: ac[01-09]n[01-02]
- CPU: AMD Opteron™ 6378, Piledriver microarchitecture, Abu Dhabi core (Family 15h)[1] -- supports SSE, SSE2, SSE3, AVX, FMA3, FMA4
Compilation Flags♯
- Consult the AMD Open64 documentation.[2][3]
- -march=bdver2 -- targets the Piledriver (Family 15h) cores[4][5]
  - Flags for specific CPU features may be used instead, e.g. "-mfma4"
- Other possible flags:
  - -mso -- optimize for multicore scalability
  - -apo -- automatic parallelization
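As a minimal sketch of how these flags combine on a command line, the snippet below assembles a flag string for the Opteron 6378 nodes. It assumes the Open64 C driver is invoked as `opencc`, and it uses `-O3` and a hypothetical source file `hello.c`; none of these appear on this page. The compile step runs only if `opencc` is on the PATH and the source file exists.

```shell
#!/bin/sh
# Sketch: assemble Open64 flags for the Piledriver (bdver2) nodes.
# Assumptions: the driver name "opencc", the optimization level -O3, and
# the source file hello.c are illustrative and not taken from this page.

CC=opencc
ARCH_FLAGS="-march=bdver2"      # target the Family 15h Piledriver cores
EXTRA_FLAGS="-mso"              # optional: optimize for multicore scalability
CFLAGS="-O3 $ARCH_FLAGS $EXTRA_FLAGS"

BUILD_CMD="$CC $CFLAGS -o hello hello.c"
echo "$BUILD_CMD"

# Only attempt the build where the compiler and the source are present.
if command -v "$CC" >/dev/null 2>&1 && [ -f hello.c ]; then
    $BUILD_CMD
fi
```

Swapping `-mso` for `-apo`, or appending `-mfma4`, only requires changing `EXTRA_FLAGS`.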
Math Libraries♯
- Use the AMD Core Math Library (ACML).[6]
- Select an appropriate acml module; for Proteus, use acml/open64/fma4:
  - acml/open64/64 -- base ACML library, 32-bit integers
  - acml/open64/fma4 -- ACML with FMA4 (Fused Multiply Accumulate v. 4), 32-bit integers
  - acml/open64/mp/64 -- ACML with OpenMP[7], 32-bit integers
  - acml/open64/mp/fma4 -- ACML with OpenMP + FMA4, 32-bit integers
  - acml/open64-int64/64 -- base ACML library, 64-bit integers
  - acml/open64-int64/fma4 -- ACML with FMA4, 64-bit integers
  - acml/open64-int64/mp/64 -- ACML with OpenMP, 64-bit integers
  - acml/open64-int64/mp/fma4 -- ACML with OpenMP + FMA4, 64-bit integers
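The module names above follow a regular pattern, which can be sketched as a small shell helper. The function name `acml_module` and its arguments are hypothetical; the `module load` call assumes the Environment Modules `module` command is defined in the current shell, and is skipped otherwise.

```shell
#!/bin/sh
# Sketch: build an ACML module name from the naming pattern listed above.
# acml_module is a hypothetical helper, not part of ACML or Proteus.
#   $1 = integer size: 32 or 64
#   $2 = "mp" for the OpenMP build, anything else for the serial build
#   $3 = "fma4" for the FMA4 build, anything else for the base build
acml_module() {
    name="acml/open64"
    [ "$1" = "64" ] && name="${name}-int64"
    [ "$2" = "mp" ] && name="${name}/mp"
    if [ "$3" = "fma4" ]; then
        name="${name}/fma4"
    else
        name="${name}/64"
    fi
    echo "$name"
}

# Recommended choice for Proteus: serial, FMA4, 32-bit integers.
ACML_MODULE=$(acml_module 32 serial fma4)
echo "$ACML_MODULE"

# Load it only where the Environment Modules command exists.
if command -v module >/dev/null 2>&1; then
    module load "$ACML_MODULE"
fi
```

For instance, `acml_module 64 mp fma4` yields the OpenMP + FMA4 build with 64-bit integers.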
MPI Implementations♯
See the Message Passing Interface article for implementations and versions available on Proteus.
MVAPICH2♯
Refer to the official MVAPICH2 website for documentation.
Module:
proteus-mvapich2/open64
Environment variables:
MPI_HOME=/mnt/HA/opt/mvapich2/open64/64/1.9-mlnx-ofed
MPI_LDFLAGS=-L/mnt/HA/opt/mvapich2/open64/64/1.9-mlnx-ofed/lib
MPI_LIBDIR=/mnt/HA/opt/mvapich2/open64/64/1.9-mlnx-ofed/lib
MPI_LIBS=-lmpich -lmpl -libmad -libumad -libverbs -ldl -lrt -lnuma -lrdmacm -lm -lpthread
MPI_RUN=/mnt/HA/opt/mvapich2/open64/64/1.9-mlnx-ofed/bin/mpirun_rsh
Example makefile settings:
CPPFLAGS=-I$(MPI_HOME)/include
LDFLAGS=$(MPI_LDFLAGS) $(MPI_LIBS)
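The same variables can also be used directly on a command line. The sketch below assumes the proteus-mvapich2/open64 module has already set `MPI_HOME`, `MPI_LDFLAGS`, and `MPI_LIBS` as listed above; the `opencc` driver name, the helper `mpi_build_cmd`, and the file names are illustrative assumptions, not part of the module.

```shell
#!/bin/sh
# Sketch: compose a compile-and-link line from the module's MPI_* variables.
# mpi_build_cmd is a hypothetical helper; opencc and the file names are assumed.
#   $1 = C source file, $2 = output executable
mpi_build_cmd() {
    # Headers come from MPI_HOME; libraries follow the source file so the
    # linker can resolve the MPI symbols it references.
    echo "opencc -march=bdver2 -I${MPI_HOME}/include $1 -o $2 ${MPI_LDFLAGS} ${MPI_LIBS}"
}

if [ -n "${MPI_HOME:-}" ]; then
    mpi_build_cmd mpi_hello.c mpi_hello
else
    echo "MPI_HOME is not set; load proteus-mvapich2/open64 first" >&2
fi
```

The helper only prints the command, so it can be inspected before it is run.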
OpenMPI♯
Module:
proteus-openmpi/open64
Guidance on Building Specific Applications♯
AMD has produced a series of guides on building some widely used scientific software: Apps & Libraries Built by x86 Open64.
References♯
[1] New-Bulldozer-and-Piledriver-Instructions.pdf
[2] AMD x86 Open64 Compiler Suite web site
[3] CompilerOptQuickRef-63004300.pdf
[4] AMD Developer Central: Developer Guides & Manuals
[5] Phoronix - FX-8350 Piledriver Tuning On AMD's Open64 Compiler
[6] AMD Core Math Library website
[7] OpenMP® official website (NB: OpenMP is an API for shared-memory multiprocessing. It is distinct from OpenMPI, which is an implementation of the MPI-2 standard.)