Compiling kraken
Kraken is a system for assigning taxonomic labels to short DNA sequences, usually obtained through metagenomic studies.[1]
Installed Versions
Proteus has Kraken 0.10.4-beta and 1.0 installed. Load the appropriate modulefile:
kraken/gcc/0.10.4-beta
kraken/gcc/1.0
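As a minimal sketch of loading one of these modulefiles, the helper below wraps `module load` so that it fails with a readable message when the `module` command is unavailable (for example, in a shell that has not initialised the modules system). The function name `load_kraken` is hypothetical and used only for illustration; on the cluster, the plain `module load kraken/gcc/1.0` is all that is needed.

```shell
# load_kraken VERSION: load the matching kraken modulefile (defaults to 1.0).
# Hypothetical helper; the modulefile names come from the list above.
load_kraken() {
    local version="${1:-1.0}"
    if ! type module >/dev/null 2>&1; then
        echo "module command not found in this shell" >&2
        return 1
    fi
    module load "kraken/gcc/${version}"
}

load_kraken 1.0 || true
```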
Documentation
A local copy of documentation is available for kraken 1.0:
$KRAKEN_HOME/doc/MANUAL.html
Usage Notes
Read the documentation. Pay special attention to these points:
- The disk used to store the database should be locally-attached storage. Storing the database on a network filesystem (NFS) partition can cause Kraken's operation to be very slow, or to be stopped completely. As NFS accesses are much slower than local disk accesses, both preloading and database building will be slowed by use of NFS. NB: Use of Lustre has not been tested, but it may provide enough performance for Kraken to work.
- To run efficiently, Kraken requires enough free memory to hold the database in RAM. ... The default database size is 74 GB (as of Dec. 2013).
- Kraken uses OpenMP shared-memory parallelism.
To deal with point 1, you must use local scratch, given by the environment variable TMPDIR, in every job. See Writing Job Scripts#Staging Work to Local Scratch.
To deal with point 2, you must request enough m_mem_free and h_vmem.
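As a rough illustration of sizing the memory request, the sketch below divides the database size evenly across slots and rounds up. It assumes that memory resources are requested per slot, which is common in Grid Engine configurations but should be confirmed against the cluster's job script documentation. The function name `per_slot_gb` and the slot count of 16 are hypothetical; 74 GB is the default database size quoted above.

```shell
# per_slot_gb TOTAL_GB SLOTS: smallest whole number of GB per slot
# such that SLOTS * result >= TOTAL_GB (integer ceiling division).
per_slot_gb() {
    local total="$1" slots="$2"
    echo $(( (total + slots - 1) / slots ))
}

per_slot_gb 74 16   # prints 5
```

In practice the request should leave some headroom above this minimum, since Kraken needs memory for more than the database itself.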
To deal with point 3, you must request the shm parallel environment with an appropriate number of slots.
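The three points can be combined in a single job script. The following is a sketch only, assuming a Grid Engine-style scheduler (suggested by the resource names m_mem_free and h_vmem) that sets NSLOTS and TMPDIR for each job. The slot count, memory values, database path, and input file name are placeholders, and `stage_db` is a hypothetical helper. The `--db`, `--threads`, `--preload`, and `--output` options are described in the Kraken manual.

```shell
#!/bin/bash
#$ -S /bin/bash
#$ -cwd
# Point 3: shared-memory parallel environment; 16 slots is a placeholder.
#$ -pe shm 16
# Point 2: placeholder memory requests; size them for your database.
#$ -l m_mem_free=6G
#$ -l h_vmem=8G

# Hypothetical locations; replace with your own database and reads.
DB_SRC="${DB_SRC:-$HOME/kraken_db/mydb}"
READS="${READS:-$PWD/reads.fa}"

# Fall back to safe defaults when run outside the scheduler.
SCRATCH="${TMPDIR:-/tmp}"
THREADS="${NSLOTS:-1}"

# stage_db SRC DEST: copy database directory SRC into DEST (point 1)
# and print the path of the staged copy.
stage_db() {
    local src="$1" dest="$2"
    [ -d "$src" ] || { echo "no such database directory: $src" >&2; return 1; }
    mkdir -p "$dest" && cp -r "$src" "$dest"/ && printf '%s\n' "$dest/$(basename "$src")"
}

if command -v kraken >/dev/null 2>&1 && [ -d "$DB_SRC" ] && [ -f "$READS" ]; then
    DB_LOCAL="$(stage_db "$DB_SRC" "$SCRATCH")"
    kraken --db "$DB_LOCAL" --threads "$THREADS" --preload \
           --output "$SCRATCH/kraken.out" "$READS"
    # Copy results back before local scratch is cleaned up.
    cp "$SCRATCH/kraken.out" "$PWD"/
else
    echo "kraken, database, or reads not found; nothing to run" >&2
fi
```

Copying the database at the start of every job costs time, but it keeps all of Kraken's database reads on local disk rather than on the network filesystem.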
Download
[juser@proteusa01 src]$ git clone https://github.com/DerrickWood/kraken.git
Build and Install
[juser@proteusa01 src]$ cd kraken
[juser@proteusa01 kraken]$ export KRAKEN_DIR=${HOME}/kraken
[juser@proteusa01 kraken]$ ./install_kraken.sh ${KRAKEN_DIR}
Edit your ~/.bashrc, and modify the PATH:
export PATH=${HOME}/kraken:${PATH}
Check
[juser@proteusa01 kraken]$ which kraken
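With the PATH setting above, the command should resolve to a file under ${HOME}/kraken. A slightly more defensive version of the same check might look like the sketch below; `check_kraken` is a hypothetical helper name, not part of Kraken.

```shell
# check_kraken: report where kraken resolves on PATH, or fail with a hint.
check_kraken() {
    local path
    if path="$(command -v kraken)"; then
        echo "kraken found at: $path"
    else
        echo "kraken not found on PATH; re-check the PATH line in ~/.bashrc" >&2
        return 1
    fi
}

check_kraken || true
```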