Installed Versions♯
Picotte♯
Only FFTW3 is available. Version 3.3.10 fixes a longstanding bug. Use one of these modulefiles:
fftw3/gcc/3.3.10
fftw3/intel/2020/3.3.9
fftw3/intel/2020/3.3.10
NOTE
- Intel MKL provides an FFTW3-compatible interface. The interface files are in $MKLROOT/interfaces, available when the modulefile intel/composerxe/2020u4 is loaded.
- CUDA provides a GPU-based FFT library, libcufft.
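To see which FFTW interface directories a given MKL installation ships, something like the following sketch may help. It assumes only that MKLROOT is set once the Intel modulefile is loaded; the helper name list_mkl_fftw_interfaces is hypothetical, and the directory names found will depend on the MKL version.

```shell
# list_mkl_fftw_interfaces: print the FFTW interface directories under
# $MKLROOT/interfaces. Hypothetical helper; assumes MKLROOT is set,
# e.g. after loading the Intel modulefile.
list_mkl_fftw_interfaces() {
    if [ -z "${MKLROOT:-}" ]; then
        echo "MKLROOT is not set; load the Intel modulefile first" >&2
        return 1
    fi
    found=1
    for d in "$MKLROOT"/interfaces/fftw*; do
        # If the glob matched nothing, $d is the literal pattern and -d fails.
        if [ -d "$d" ]; then
            printf '%s\n' "$d"
            found=0
        fi
    done
    return "$found"
}
```

The function returns non-zero when MKLROOT is unset or no matching directory exists, so it can be used as a guard in a build script.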
Proteus (OBSOLETE)♯
There are versions of FFTW2 and FFTW3 already installed on Proteus. Use one of these modulefiles:
proteus-fftw2/amd/gcc/2.1.5
proteus-fftw2/gcc/64/double/2.1.5
proteus-fftw2/gcc/64/float/2.1.5
proteus-fftw2/mvapich2/open64/64/double/2.1.5
proteus-fftw2/mvapich2/open64/64/float/2.1.5
proteus-fftw2/open64/64/double/2.1.5
proteus-fftw2/open64/64/float/2.1.5
proteus-fftw3/amd/gcc/64/3.3.3
proteus-fftw3/gcc/64/3.3.3
proteus-fftw3/intel/gcc/64/3.3.3
proteus-fftw3/open64/64/3.3.3
When compiling against these installed versions, include the following flags, where "n" is either "2" or "3":
-I$(FFTWnINCLUDE)
-L$(FFTWnDIR)
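As a rough illustration of how these flags expand, the sketch below builds the -I/-L pair for a given major version. It assumes the modulefiles export FFTWnINCLUDE and FFTWnDIR as environment variables, as the flags above imply; the helper name fftw_flags is hypothetical.

```shell
# fftw_flags N: print the -I/-L flags for FFTW major version N (2 or 3).
# Assumes FFTW<N>INCLUDE and FFTW<N>DIR are set in the environment,
# e.g. by one of the modulefiles. Hypothetical helper.
fftw_flags() {
    n=$1
    case "$n" in
        2|3) ;;
        *) echo "usage: fftw_flags 2|3" >&2; return 1 ;;
    esac
    # n is validated above, so the eval only ever reads FFTW2* or FFTW3*.
    eval "inc=\${FFTW${n}INCLUDE:-}"
    eval "dir=\${FFTW${n}DIR:-}"
    if [ -z "$inc" ] || [ -z "$dir" ]; then
        echo "FFTW${n}INCLUDE or FFTW${n}DIR is not set; load a modulefile first" >&2
        return 1
    fi
    printf '%s\n' "-I$inc -L$dir"
}
```

A compile line could then look like `gcc $(fftw_flags 3) prog.c -lfftw3 -lm`, where prog.c is a placeholder; the library flag still has to be chosen to match the precision and version being linked.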
General Notes♯
Using GCC to compile a single version that runs on both the Intel and AMD nodes results in poor performance on both platforms. Some testing has indicated that, depending on the specific compiler options used, FFTW3 may run 4 times slower on an Intel node than on an AMD node, even though the Intel nodes are intrinsically faster.
Using FFTW2/3 in Intel MKL♯
If you are using the Intel compilers, FFTW2 and FFTW3 interfaces are included in the Math Kernel Library (MKL). You may have to specify a slightly different include path:
-I$MKLROOT/fftw
or use a slightly different include directive in your source:
#include <fftw/fftwX.h> // where "X" = 2 or 3
See details in the MKL Reference Manual.
FFTW 2♯
Current version: fftw-2.1.5
GCC double precision♯
Environment♯
F77 = gfortran
Configure♯
./configure --prefix=/mnt/HA/opt/fftw2/gcc/64/double/2.1.5 --enable-threads --disable-mpi --enable-fortran --disable-single --enable-type-prefix --enable-shared
Modify config.status♯
s,@CFLAGS@,-O3 -mavx -msse4.2 -mfpmath=sse -fomit-frame-pointer -fno-schedule-insns -fschedule-insns2 -malign-double -fstrict-aliasing -pthread ,;t t
GCC single precision♯
TBA
Open64 double precision♯
Open64 single precision♯
NAMD uses FFTW 2.1.5 single precision.
Environment♯
CC = opencc
F77 = openf90
Configure♯
./configure --prefix=/mnt/HA/opt/fftw2/open64/64/single/2.1.5 --enable-threads --disable-mpi --enable-fortran \
--enable-float --enable-type-prefix --enable-shared
Modify config.status♯
s,@CFLAGS@,-O3 -march=bdver2 -mtune=bdver2 -fomit-frame-pointer -fno-schedule-insns -fschedule-insns2 -fstrict-aliasing -pthread,;t t
Then, execute config.status to generate makefiles:
[juser@proteusa01 fftw-2.1.5]$ sh config.status
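The config.status edit described above can also be scripted rather than done by hand. The following is a minimal sketch, assuming config.status contains a single line of the form "s,@CFLAGS@,...,;t t" as shown; the helper name set_cflags_subst is hypothetical.

```shell
# set_cflags_subst FILE FLAGS: replace the flags in the
# "s,@CFLAGS@,...,;t t" line of a sed-style config.status.
# Hypothetical helper. FLAGS must not contain "|", "&", "," or
# backslashes, which are special to sed or to config.status itself.
set_cflags_subst() {
    file=$1
    flags=$2
    tmp="$file.tmp.$$"
    sed "s|^s,@CFLAGS@,.*,;t t\$|s,@CFLAGS@,$flags,;t t|" "$file" > "$tmp" || return 1
    # Copy the contents back so the original file's permissions are kept.
    cat "$tmp" > "$file" && rm -f "$tmp"
}
```

For example, `set_cflags_subst config.status "-O3 -march=bdver2 -pthread"` followed by `sh config.status` would regenerate the makefiles with the new flags.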
FFTW 3♯
Current version: fftw-3.3.3
GCC♯
Environment♯
F77 = gfortran
Configure♯
This configures double-precision for AMD:
./configure --prefix=/mnt/HA/opt/fftw3/amd/gcc/64/3.3.3 --enable-threads --enable-openmp \
--disable-mpi \
--enable-shared --enable-static \
--enable-sse2 --enable-avx --enable-fma \
--enable-fortran --disable-single > Configure.out 2>&1 &
NOTE
- On Proteus, only the AMD CPUs support the FMA instruction
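Which SIMD features a given node actually supports can be checked from its CPU flags before choosing configure options. This is a minimal sketch, assuming a Linux node whose /proc/cpuinfo has a "flags" line; the helper name cpu_has_flag is hypothetical. In that listing, "fma" and "fma4" are separate flags, so the match must be exact.

```shell
# cpu_has_flag FLAG [CPUINFO]: succeed if FLAG is listed in the first
# "flags" line of CPUINFO (default /proc/cpuinfo). Hypothetical helper.
cpu_has_flag() {
    flag=$1
    cpuinfo=${2:-/proc/cpuinfo}
    # Take the first "flags" line, split it into one word per line,
    # and look for an exact match (so "fma" does not match "fma4").
    sed -n '/^flags/{p;q;}' "$cpuinfo" | tr ' \t' '\n\n' | grep -qxF -- "$flag"
}

# Report on the features named in the configure line above.
if [ -r /proc/cpuinfo ]; then
    for f in sse2 avx fma; do
        if cpu_has_flag "$f"; then echo "$f: yes"; else echo "$f: no"; fi
    done
fi
```

Running this on one node of each type gives a quick indication of whether a single build can use the same options on both.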
Modify config.status♯
Modify the CFLAGS in config.status:
CFLAGS='-O3 -march=native -fomit-frame-pointer -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math'
S["CFLAGS"]="-O3 -march=native -fomit-frame-pointer -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math"
S["SSE2_CFLAGS"]="-msse2"
D["FFTW_CC"]=" \"/cm/shared/apps/gcc/4.8.1/bin/gcc -std=gnu99 -O3 -march=native -fomit-frame-pointer -malign-double -fstrict-aliasing -fno-schedule-insns -ffast-math\""
Then, execute it to regenerate Makefiles:
$ sh config.status
Build and Test♯
Be sure to run "make check" on both Intel and AMD nodes to confirm that the compiled libraries work on both platforms.
Open64 for AMD, Single Precision♯
Environment♯
[juser@proteusa01]$ module unload gcc
[juser@proteusa01]$ module load open64 acml/open64/mp/fma4
[juser@proteusa01]$ module list
Currently Loaded Modulefiles:
1) shared 2) proteus 3) sge/univa 4) open64/4.5.2.1 5) acml/open64/mp/fma4/5.3.1
Environment variables:
CC = opencc
F77 = openf95
Configure♯
./configure --prefix=/mnt/HA/opt/fftw3/open64/64/3.3.3 --enable-shared --enable-static \
--disable-mpi --enable-openmp --enable-threads --enable-fortran \
--enable-fma --enable-sse2 \
--enable-single > Configure.out 2>&1 &
NB
- AVX support (i.e. AVX assembler routines) is broken with Open64.
- Use --disable-single to build double precision only.
- The serial version is always built.
Modify config.status♯
CFLAGS='-O3 -mso -march=bdver2 -fomit-frame-pointer -align64 -fstrict-aliasing -fno-schedule-insns -ffast-math'
S["SSE2_CFLAGS"]="-msse2"
S["CFLAGS"]="-O3 -mso -march=bdver2 -fomit-frame-pointer -align64 -fstrict-aliasing -fno-schedule-insns -ffast-math"
D["FFTW_CC"]=" \"opencc -std=gnu99 -O3 -mso -march=bdver2 -fomit-frame-pointer -align64 -fstrict-aliasing -fno-schedule-insns -ffast-math\""
Open64 for AMD, Double Precision♯
...