Compiling VASP
VASP is the Vienna Ab-initio Simulation Package[1] for performing ab-initio quantum mechanical molecular dynamics simulations.
NOTE 2017-Dec-02 The makefiles previously found here do not produce a correctly running executable. The current versions linked below have been tested and found to work correctly.
Installed Version
There is no generally installed version as VASP is licensed to research groups directly.
General Guidelines
There are some general guidelines:
- https://www.nsc.liu.se/~pla/vaspstatus/
- http://kestrel.isa-geek.net/ryumei/tone/?VASPIntelFortran (In Japanese)
- official guide:
Information about compiling on Proteus is given in the sections below.
About the makefiles below:
- Makefiles require a TAB character at the start of each rule's action line. Copying and pasting from this page will not preserve the tabs and will not work; download or clone the files instead.
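To check that a downloaded makefile still contains real TAB characters, render the invisible characters; for example (filename is illustrative):

# lines that begin with a TAB are the rule action lines
grep -Pn '^\t' makefile.vasp5lib.proteus
# alternatively, cat -A displays each TAB as ^I
cat -A makefile.vasp5lib.proteus | less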
Memory Issues
If the preprocessor flag -Davoidalloc is given, VASP allocates memory on the stack[2] as opposed to using dynamic memory on the heap.[3] A common problem that arises is that the system limit on stack size causes VASP to crash.[4]
There are two ways to deal with this:
- Modify the source code, adding a new function/subroutine which increases the stack size limit at run time on all nodes involved in a computation. NB Adding a limit or ulimit statement in the job script does not work because that statement is only executed on the "master" node of a multi-node MPI job; the default (lower) limit remains in effect on all the worker nodes. (See the ulimit example after this list.)
- Alternatively, delete the "-Davoidalloc" option from the makefile. This compiles in code that allocates memory on the heap instead.
Add a new file stacksize.c to the source
This file defines a C function which increases the stack limit at run time:
#include <sys/time.h>
#include <sys/resource.h>
#include <stdio.h>

/* NOTE there is an underscore at the end of the function name:
   this is the symbol the Fortran compiler generates for
   "CALL stacksize()" (see the note on name mangling below) */
void stacksize_()
{
    int res;
    struct rlimit rlim;

    /* report the soft (cur) and hard (max) stack limits before the change */
    getrlimit(RLIMIT_STACK, &rlim);
    printf("Before: cur=%d,hard=%d\n", (int)rlim.rlim_cur, (int)rlim.rlim_max);

    /* raise both limits to unlimited; raising the hard limit can fail
       (res != 0) for unprivileged processes if it is currently finite */
    rlim.rlim_cur = RLIM_INFINITY;
    rlim.rlim_max = RLIM_INFINITY;
    res = setrlimit(RLIMIT_STACK, &rlim);

    /* re-read and report the limits to confirm the change */
    getrlimit(RLIMIT_STACK, &rlim);
    printf("After: res=%d,cur=%d,hard=%d\n", res, (int)rlim.rlim_cur, (int)rlim.rlim_max);
}
Add stacksize.o to the variable SOURCE
In the Makefile, add "stacksize.o" to the end of the line defining the "SOURCE" variable.
Add call to stacksize() in main.F
In the file main.F, after the section
!===========================================
! initialise / set constants and parameters ...
!===========================================
add the call to the function stacksize() defined above -- NB there is no underscore at the end of the function name here:
CALL stacksize()
The underscore convention is due to Fortran's name mangling:[5] the Fortran compiler appends a trailing underscore to external symbol names, so the Fortran statement CALL stacksize() resolves to the C symbol stacksize_.
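One way to see the mangled name is to compile the C file and inspect the object's symbols (illustrative commands, run with the Intel compiler module loaded):

icc -c stacksize.c
nm stacksize.o | grep -i stacksize
# expected output (the address may differ):
# 0000000000000000 T stacksize_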
Compiling with Intel Composer XE 2015 + MKL + OpenMPI 1.8.1 - multi-node MPI parallel
It seems that only the MPI-sequential configuration works, i.e. one cannot have both MPI and OpenMP enabled. "MPI-sequential" refers to the settings in the Intel MKL Link Line Advisor: select the "Sequential" threading layer and the appropriate MPI library (Open MPI here; see the full settings below).
- The original makefiles have been modified to use newer pattern rules rather than suffix rules.
- We do not use the ALLOC-avoiding option (-Davoidalloc, described under Memory Issues above), which avoids having to add a new source file and modify main.F. No testing has been done to see how this affects speed.
NOTE 2017-Dec-02 For the main VASP 5.3 application, please use makefile.linux_ifc_proteus from the GitHub repository linked below. This supersedes the note of 2017-Dec-01, which stated that the makefiles here did not generate a correctly-running VASP and that an update was coming shortly.
NOTE Copying and pasting the makefile contents from this wiki page will result in a broken makefile, because makefiles require a TAB character at the start of each "action" line. Instead, fork or clone the code from GitHub:
https://github.com/prehensilecode/vasp_makefiles_proteus
Environment
The following modules must be loaded to build VASP 5.3.5 using these makefiles and instructions:
1) shared
2) proteus
3) gcc/4.8.1
4) sge/univa
5) intel/composerxe/2015.1.133
6) proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
Intel Link Line Advisor Settings
These are the settings used in the Link Line Advisor to generate the proper link lines (an example of the generated output is shown after the list):
- Intel product: Intel(R) MKL 11.2
- OS: Linux
- Xeon Phi Coprocessor: None
- Compiler: Intel(R) Fortran
- Architecture: Intel(R) 64
- Dynamic or static linking: Static
- Interface layer: 32-bit integer
- Threading layer: Sequential
- Cluster library: ScaLAPACK (BLACS required)
- MPI Library: Open MPI
- Fortran 95 interfaces: BLAS95 and LAPACK95
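With these settings, the Advisor's output should resemble the following sketch; this is an assumption based on MKL 11.2 conventions, so regenerate the lines in the Advisor rather than copying them verbatim. Link line:

${MKLROOT}/lib/intel64/libmkl_blas95_lp64.a \
${MKLROOT}/lib/intel64/libmkl_lapack95_lp64.a \
${MKLROOT}/lib/intel64/libmkl_scalapack_lp64.a \
-Wl,--start-group \
${MKLROOT}/lib/intel64/libmkl_intel_lp64.a \
${MKLROOT}/lib/intel64/libmkl_core.a \
${MKLROOT}/lib/intel64/libmkl_sequential.a \
-Wl,--end-group \
${MKLROOT}/lib/intel64/libmkl_blacs_openmpi_lp64.a -lpthread -lm

Compiler options:

-I${MKLROOT}/include/intel64/lp64 -I${MKLROOT}/include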
Makefiles
The makefiles referred to below are available at github: https://github.com/prehensilecode/vasp_makefiles_proteus
Use:
- makefile.vasp5lib.proteus -- to compile vasp.5.lib
- makefile.linux_ifc_proteus -- to compile the VASP application itself
Makefile for vasp.5.lib using MPI sequential
Put this file in the directory "vasp.5.lib", and then type:
make -f makefile.vasp5lib.proteus
in that directory.
Makefile for vasp executable using MPI sequential
As a starting point, use makefile.linux_ifc_P4.
Save this file in the directory "vasp.5.3" and then type:
make -f makefile.linux_ifc_proteus
Parallel make (e.g. make -j) will not work; a serial build may take up to an hour.
Example job script snippet
Since Open MPI has Grid Engine integration, the number of processes need not be specified on the mpirun command line; it is obtained from the environment set up by Grid Engine.
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
#$ -pe fixed16 128
#$ -l vendor=intel
...
. /etc/profile.d/modules.sh
module load shared
module load proteus
module load sge/univa
module load gcc/4.8.1
module load intel/composerxe/2015.1.133
module load proteus-openmpi/intel/2015/1.8.1-mlnx-ofed
### this job script assumes that it lives in the same directory as the inputs
export VASPEXE=/mnt/HA/groups/myrsrchGrp/bin/vasp
### do not specify no. of processes: openmpi integrates with Grid Engine and
### pulls that information from the environment
$MPI_RUN $VASPEXE
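Assuming the script above is saved as vasp.sh (the name is arbitrary) in the directory containing the INCAR, POSCAR, POTCAR, and KPOINTS input files, submit it with:

qsub vasp.sh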
Benchmarks
This is a set of non-rigorous benchmarks.
- These were done on a non-quiescent cluster, i.e. other jobs were running at the same time.
- The distribution of each job across nodes was not restricted to stay within a single physical InfiniBand network switch. As the IB network of Proteus has a 2:1 over-subscription design, crossing switch boundaries would adversely affect computations that have large amounts of communication.
- Based on the results below, it is recommended to use no more than 128 slots for a computation.
Test Suite
A third party test suite, authored by Peter Larsson at Linköping University, is available: https://www.nsc.liu.se/~pla/vasptest/
While the official VASP documentation[6] mentions a test suite, the link leads to a nonexistent article.
See Also
- Peter Larsson's guide on building VASP 4.6
- How to build VASP-4.6.36 and VASP-5.2.8 on Intel Westmere with Infiniband network, by Jordi Blasco at HPC Knowledge Portal
References
[2] Stack-based memory allocation article on Wikipedia
[3] Heap (programming) article on Wikipedia
[5] Name mangling article on Wikipedia
[6] VASP Manual