Installed Versions♯

Picotte♯

Only FFTW3 is available. Version 3.3.10 fixes a longstanding bug. Use one of these modulefiles:

fftw3/gcc/3.3.10
fftw3/intel/2020/3.3.9
fftw3/intel/2020/3.3.10

NOTE

Intel MKL provides an FFTW3-compatible interface. The interface wrappers are in $MKLROOT/interfaces, which is available once the modulefile intel/composerxe/2020u4 is loaded.
CUDA provides a GPU-based FFT library, libcufft.

Proteus (OBSOLETE)♯

There are versions of FFTW2 and FFTW3 already installed on Proteus. Use one of these modulefiles:

proteus-fftw2/amd/gcc/2.1.5                   proteus-fftw2/open64/64/float/2.1.5
proteus-fftw2/gcc/64/double/2.1.5             proteus-fftw3/amd/gcc/64/3.3.3
proteus-fftw2/gcc/64/float/2.1.5              proteus-fftw3/gcc/64/3.3.3
proteus-fftw2/mvapich2/open64/64/double/2.1.5 proteus-fftw3/intel/gcc/64/3.3.3
proteus-fftw2/mvapich2/open64/64/float/2.1.5  proteus-fftw3/open64/64/3.3.3
proteus-fftw2/open64/64/double/2.1.5

When compiling against these installed versions, include the following flags, where "n" is either "2" or "3":

-I$(FFTWnINCLUDE) -L$(FFTWnDIR)
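As a rough illustration of how these flags fit into a full compile line, the sketch below assembles one for FFTW3. It assumes that the loaded modulefile sets FFTW3INCLUDE and FFTW3DIR (as the flags above imply), that GCC is the compiler, and that the library is linked with -lfftw3 -lm. The helper name fftw3_compile_cmd and the source file name are hypothetical.

```shell
# Hypothetical helper: print (do not run) a compile line for a C source using FFTW3.
# Assumption: FFTW3INCLUDE and FFTW3DIR are set by the loaded modulefile.
fftw3_compile_cmd() {
    src="$1"
    out="${src%.c}"          # executable name: source name without the .c suffix
    if [ -z "${FFTW3INCLUDE:-}" ] || [ -z "${FFTW3DIR:-}" ]; then
        echo "FFTW3INCLUDE and FFTW3DIR must be set (load a modulefile first)" >&2
        return 1
    fi
    echo "gcc -O2 -o $out $src -I$FFTW3INCLUDE -L$FFTW3DIR -lfftw3 -lm"
}

# Example: fftw3_compile_cmd fft_demo.c
```

Printing the command instead of running it makes it easy to inspect the paths before compiling.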

General Notes♯

Using GCC to compile a version which runs on both the Intel and AMD nodes results in poor performance on both platforms. Some testing has indicated that, depending on the specific compiler options used, FFTW3 may run 4 times slower on an Intel node than on an AMD node. This is despite the fact that the Intel nodes are intrinsically faster.
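One way to avoid the portable build is to pick a modulefile by CPU vendor. The sketch below is illustrative only: it assumes that the "amd" and "intel" FFTW3 modulefiles listed above are builds tuned for the respective node types (inferred from their names), and it falls back to the generic GCC build otherwise. The function name fftw3_module_for_vendor is hypothetical.

```shell
# Hypothetical helper: map a CPU vendor string to one of the Proteus FFTW3 modulefiles.
# Assumption: the amd/ and intel/ modulefiles are the vendor-specific builds.
fftw3_module_for_vendor() {
    case "$1" in
        AuthenticAMD) echo "proteus-fftw3/amd/gcc/64/3.3.3" ;;
        GenuineIntel) echo "proteus-fftw3/intel/gcc/64/3.3.3" ;;
        *)            echo "proteus-fftw3/gcc/64/3.3.3" ;;   # generic fallback
    esac
}

# Usage on a Linux node (reads the vendor string from /proc/cpuinfo):
#   vendor=$(awk -F': ' '/^vendor_id/ {print $2; exit}' /proc/cpuinfo)
#   module load "$(fftw3_module_for_vendor "$vendor")"
```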

Using FFTW2/3 in Intel MKL♯

If you are using the Intel compilers, FFTW2- and FFTW3-compatible interfaces are included in the Math Kernel Library (MKL). You may have to specify a slightly different include path:

-I$MKLROOT/fftw

or use a slightly different include directive in your source:

#include <fftw/fftwX.h> // where "X" = 2 or 3

See details in the MKL Reference Manual.
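Because the exact header location can differ between MKL releases, it may help to search for it before settling on an include path. The sketch below looks for fftw3.h under a given directory and prints the directory that contains it. The function name find_mkl_fftw_include is hypothetical, and the layout under $MKLROOT is not assumed.

```shell
# Hypothetical helper: print the directory under a given root that contains fftw3.h.
# Returns non-zero if no such header is found.
find_mkl_fftw_include() {
    root="$1"
    hdr=$(find "$root" -name fftw3.h -print 2>/dev/null | head -n 1)
    if [ -n "$hdr" ]; then
        dirname "$hdr"
    else
        return 1
    fi
}

# Example: inc=$(find_mkl_fftw_include "$MKLROOT") && echo "-I$inc"
```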

FFTW 2♯

Current version: fftw-2.1.5

GCC double precision♯

Environment♯

F77 = gfortran

Configure♯

./configure --prefix=/mnt/HA/opt/fftw2/gcc/64/double/2.1.5 --enable-threads --disable-mpi --enable-fortran --disable-single --enable-type-prefix --enable-shared

Modify config.status♯

s,@CFLAGS@,-O3 -mavx -msse4.2 -mfpmath=sse -fomit-frame-pointer -fno-schedule-insns -fschedule-insns2 -malign-double -fstrict-aliasing -pthread ,;t t

GCC single precision♯

TBA

Open64 double precision♯

Open64 single precision♯

NAMD uses FFTW 2.1.5 single precision.

Environment♯

CC = opencc
F77 = openf90

Configure♯

./configure --prefix=/mnt/HA/opt/fftw2/open64/64/single/2.1.5 --enable-threads --disable-mpi --enable-fortran \
    --enable-float --enable-type-prefix --enable-shared

Modify config.status♯

s,@CFLAGS@,-O3 -march=bdver2 -mtune=bdver2 -fomit-frame-pointer -fno-schedule-insns -fschedule-insns2 -fstrict-aliasing -pthread,;t t

Then, execute config.status to generate makefiles:

[juser@proteusa01 fftw-2.1.5]$ sh config.status
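The manual edit of config.status described above can also be scripted. The following is a minimal sketch, assuming config.status contains a single line of the form "s,@CFLAGS@,...,;t t" and that the new flags contain no '|', '&', or backslash characters. The function name set_cflags_subst is hypothetical.

```shell
# Hypothetical helper: rewrite the @CFLAGS@ substitution line in config.status.
# A backup of the original file is kept as <file>.bak.
set_cflags_subst() {
    file="$1"
    flags="$2"
    # '|' is used as the sed delimiter because the target line itself contains commas.
    sed -i.bak "s|^s,@CFLAGS@,.*,;t t\$|s,@CFLAGS@,${flags},;t t|" "$file"
}

# Example, followed by regenerating the makefiles:
#   set_cflags_subst config.status "-O3 -march=bdver2 -mtune=bdver2 -pthread"
#   sh config.status
```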

FFTW 3♯

Current version: fftw-3.3.3

GCC♯

Environment♯

F77 = gfortran

Configure♯

This configures double-precision for AMD:

./configure --prefix=/mnt/HA/opt/fftw3/amd/gcc/64/3.3.3 --enable-threads --enable-openmp \
    --disable-mpi \
    --enable-shared --enable-static \
    --enable-sse2 --enable-avx --enable-fma \
    --enable-fortran --disable-single > Configure.out 2>&1 &

NOTE

On Proteus, only the AMD CPUs support the FMA instruction, so use --enable-fma only for the AMD build.
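To decide which SIMD-related options apply on a given node, the CPU flag list can be inspected before running configure. The sketch below maps a flag list (in the format of the "flags" line in /proc/cpuinfo) to the configure options used above. Treating the "fma" or "fma4" CPU flags as the condition for --enable-fma is an assumption that follows the note above, and the function name fftw_simd_opts is hypothetical.

```shell
# Hypothetical helper: print the SIMD-related configure options supported by a CPU,
# given its space-separated flag list.
fftw_simd_opts() {
    flags=" $1 "     # pad with spaces so that whole words can be matched
    opts=""
    case "$flags" in *" sse2 "*) opts="$opts --enable-sse2" ;; esac
    case "$flags" in *" avx "*)  opts="$opts --enable-avx"  ;; esac
    case "$flags" in *" fma "*|*" fma4 "*) opts="$opts --enable-fma" ;; esac
    echo "${opts# }" # drop the leading space
}

# Usage on a Linux node:
#   cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
#   fftw_simd_opts "$cpu_flags"
```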

Modify config.status♯

Modify the CFLAGS in config.status:

CFLAGS='-O3 -march=native -fomit-frame-pointer -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math'

S["CFLAGS"]="-O3 -march=native -fomit-frame-pointer -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math"

S["SSE2_CFLAGS"]="-msse2"

D["FFTW_CC"]=" \"/cm/shared/apps/gcc/4.8.1/bin/gcc -std=gnu99 -O3 -march=native -fomit-frame-pointer -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math\""

Then, execute it to regenerate Makefiles:

$ sh config.status
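Since the same flags must be entered in three places, a typo in one of them is easy to miss. The sketch below checks that a given flag appears in all three entries of config.status. It assumes the entries begin with CFLAGS=', S["CFLAGS"]= and D["FFTW_CC"]= as shown above, and the function name check_flag_everywhere is hypothetical.

```shell
# Hypothetical helper: report which of the three config.status entries lack a flag.
# Returns 0 if the flag is present in all three, 1 otherwise.
check_flag_everywhere() {
    file="$1"
    flag="$2"
    status=0
    for key in "CFLAGS='" 'S["CFLAGS"]=' 'D["FFTW_CC"]='; do
        # Select the entry by fixed-string match, then look for the flag in it.
        if ! grep -F -- "$key" "$file" | grep -q -F -- "$flag"; then
            echo "missing $flag in: $key"
            status=1
        fi
    done
    return $status
}

# Example: check_flag_everywhere config.status -ffast-math
```

The check is a plain substring match, so it is a quick sanity test, not a full parser of the flag list.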

Build and Test♯

Be sure to run the test suite ("make check") on both Intel and AMD nodes to confirm that the compiled libraries work on both platforms.

Open64 for AMD, Single Precision♯

Environment♯

[juser@proteusa01]$ module unload gcc
[juser@proteusa01]$ module load open64 acml/open64/mp/fma4
[juser@proteusa01]$ module list
Currently Loaded Modulefiles:
  1) shared   2) proteus   3) sge/univa   4) open64/4.5.2.1   5) acml/open64/mp/fma4/5.3.1

Environment variables:

CC = opencc
F77 = openf95

Configure♯

./configure --prefix=/mnt/HA/opt/fftw3/open64/64/3.3.3 --enable-shared --enable-static \
    --disable-mpi --enable-openmp --enable-threads --enable-fortran \
    --enable-fma --enable-sse2 \
    --enable-single > Configure.out 2>&1 &

NB

AVX support (i.e. AVX assembler routines) is broken with Open64.
Use --disable-single to build double precision only.
The serial version is always built.

Modify config.status♯

CFLAGS='-O3 -mso -march=bdver2 -fomit-frame-pointer -align64 -fstrict-aliasing -fno-schedule-insns -ffast-math'

S["SSE2_CFLAGS"]="-msse2"

S["CFLAGS"]="-O3 -mso -march=bdver2 -fomit-frame-pointer -align64 -fstrict-aliasing -fno-schedule-insns -ffast-math"

D["FFTW_CC"]=" \"opencc -std=gnu99 -O3 -mso -march=bdver2 -fomit-frame-pointer -align64 -fstrict-aliasing -fno-schedule-insns -ffast-math\""

Open64 for AMD, Double Precision♯

...