
Compiling kraken

Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.[1]

Installed Versions

Proteus has kraken 0.10.4-beta and 1.0 installed. Load the appropriate modulefile:

kraken/gcc/0.10.4-beta
kraken/gcc/1.0

Documentation

A local copy of documentation is available for kraken 1.0:

$KRAKEN_HOME/doc/MANUAL.html

Usage Notes

Read the documentation. Pay special attention to these points:

  1. The disk used to store the database should be locally-attached storage. Storing the database on a network filesystem (NFS) partition can make Kraken run very slowly, or stop it completely. As NFS accesses are much slower than local disk accesses, both preloading and database building will be slowed by use of NFS. Note: use of Lustre has not been tested, but it may provide enough performance for Kraken to work.
  2. To run efficiently, Kraken requires enough free memory to hold the database in RAM. ... The default database size is 74 GB (as of Dec. 2013).
  3. Kraken uses OpenMP shared-memory parallelism.

To deal with point 1, you must use local scratch, whose path is given by the environment variable TMPDIR in every job. See the section Staging Work to Local Scratch in Writing Job Scripts.
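Staging the database to local scratch can be sketched as follows. This is a minimal sketch, assuming TMPDIR points at node-local storage inside the job. The function name stage_db, the database path, and the input file name in the comments are hypothetical and only for illustration.

```shell
#!/bin/bash
# Hypothetical helper: copy a Kraken database directory into local scratch
# and print the path of the staged copy.
# Assumption: TMPDIR is set by the scheduler to a node-local directory.
stage_db() {
    local src="${1%/}"                      # strip a trailing slash, if any
    local scratch="${2:-${TMPDIR:-/tmp}}"   # optional second argument overrides TMPDIR
    local dest="${scratch}/$(basename "${src}")"

    if [ ! -d "${src}" ]; then
        echo "stage_db: no such directory: ${src}" >&2
        return 1
    fi

    # Copy the whole database directory into scratch
    cp -R "${src}" "${scratch}/" || return 1
    echo "${dest}"
}

# Possible use inside a job script (paths are placeholders):
#   DB=$(stage_db /path/to/kraken_db) || exit 1
#   kraken --db "${DB}" --threads "${NSLOTS}" seqs.fa
```

The copy takes time proportional to the database size, so it is worth doing once per job rather than once per input file.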

To deal with point 2, you must request enough memory, via the m_mem_free and h_vmem resources, to hold the database.

To deal with point 3, you must request the shm parallel environment with an appropriate number of slots.
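The memory and slot requests for points 2 and 3 are linked. The sketch below works out a per-slot memory figure. It assumes a Grid Engine style scheduler in which m_mem_free and h_vmem are counted per slot; check the scheduler configuration before relying on that. The function name per_slot_gb and the directive values in the comments are illustrative only.

```shell
#!/bin/bash
# Hypothetical helper: per-slot memory (whole GB, rounded up) such that
# slots * per-slot memory is at least the database size.
# Assumption: memory requests are multiplied by the number of slots.
per_slot_gb() {
    local db_gb="$1"
    local slots="$2"
    echo $(( (db_gb + slots - 1) / slots ))
}

# Example: the 74 GB default database spread over 16 slots.
per_slot_gb 74 16    # prints 5

# Matching job script directives might then look like this
# (values illustrative; h_vmem is set a little above m_mem_free):
#   #$ -pe shm 16
#   #$ -l m_mem_free=5G
#   #$ -l h_vmem=6G
```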

Download

[juser@proteusa01 src]$ git clone https://github.com/DerrickWood/kraken.git

Build and Install

Requires libcrispr and crass.

[juser@proteusa01 src]$ cd kraken
[juser@proteusa01 kraken]$ export KRAKEN_DIR=${HOME}/kraken
[juser@proteusa01 kraken]$ ./install_kraken.sh ${KRAKEN_DIR}

Edit your ~/.bashrc, and modify the PATH:

export PATH=${HOME}/kraken:${PATH}

Check

[juser@proteusa01 kraken]$ which kraken
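The same check can be scripted, for example at the top of a job script, so that a missing PATH entry fails early. This is a minimal sketch; the function name check_on_path is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: report whether a command is found on PATH.
# Returns 0 and prints its location if found, 1 otherwise.
check_on_path() {
    local tool="$1"
    if command -v "${tool}" >/dev/null 2>&1; then
        echo "found: $(command -v "${tool}")"
    else
        echo "not found on PATH: ${tool}" >&2
        return 1
    fi
}

# Possible use:
#   check_on_path kraken || exit 1
```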

References

[1] Kraken official webpage