VSEARCH
VSEARCH is an open and free 64-bit multithreaded tool for processing metagenomic sequences, including searching, clustering, chimera detection, dereplication, sorting, masking and shuffling.[1] It is an alternative to USEARCH by Robert C. Edgar.[2]
Installed Versions♯
VSEARCH 1.1.3, 1.10.0, and 2.7.1 are installed on Proteus. Use one of the modules:
vsearch/gcc/1.1.3
vsearch/gcc/1.10.0
vsearch/gcc/2.7.1
Each module also provides a man page. To view it (after loading one of the modulefiles above):
[juser@proteusa01 ~]$ man vsearch
The full documentation, plus tests, is available in $VSEARCHDIR.
Using VSEARCH with Multiple Threads♯
When submitting a job, request the "shm" resource with the number of slots equal to the number of threads required:
#$ -pe shm 16
Then, when running vsearch, specify the number of threads; otherwise it will try to use all CPU cores:
vsearch --threads $NSLOTS ...
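The two pieces above can be combined into a single job script. The following is a minimal sketch, not a site-supplied template: the file names (queries.fasta, db.fasta, hits.b6), the 0.97 identity threshold, and the "-cwd" directive are assumptions chosen for illustration. Only "-pe shm" and "--threads $NSLOTS" come from the instructions above.

```shell
#!/bin/bash
#$ -cwd
#$ -pe shm 16

# Grid Engine sets NSLOTS to the number of slots granted by "-pe shm".
# Fall back to 1 thread if the script is run outside a job.
THREADS="${NSLOTS:-1}"

# Hypothetical input/output names and identity threshold, for illustration.
run_search() {
    vsearch --threads "$THREADS" \
            --usearch_global queries.fasta \
            --db db.fasta \
            --id 0.97 \
            --blast6out hits.b6
}

echo "Using $THREADS thread(s)"

# Run only when vsearch is on PATH, i.e. after one of the modules is loaded.
if command -v vsearch >/dev/null 2>&1; then
    run_search
else
    echo "vsearch not found; load a vsearch module first" >&2
fi
```

If the module is not already loaded in the job environment, add a "module load" line for one of the installed versions before the call to run_search.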
Compiling 1.1.3♯
The distribution provides three makefiles for three build variants: a "plain" one (Makefile), one that can read and write gzipped data (Makefile.ZLIB), and one that can read and write bzip2-compressed data (Makefile.BZLIB). This article covers only the "plain" version for now.
Download a tarball and expand it. Note that the distribution tarball includes executables of many other versions for many other machines. You can delete them.
Edit the Makefile and Compile♯
Modify src/Makefile according to this diff:
--- Makefile.orig 2015-03-18 05:58:00.000000000 -0400
+++ Makefile 2015-08-10 22:42:28.962047475 -0400
@@ -20,14 +20,16 @@
# Profiling
#PROFILING=-g -fprofile-arcs -ftest-coverage
#PROFILING=-g -pg
-PROFILING=-g
+#PROFILING=-g
+PROFILING=
# Compiler warnings
WARN=-Wall -Wsign-compare
#WARN=-Weverything
CXX=g++
-CXXFLAGS=-O3 -msse2 -mtune=core2 -Icityhash $(WARN) $(PROFILING)
+#CXXFLAGS=-O3 -msse2 -mtune=core2 -Icityhash $(WARN) $(PROFILING)
+CXXFLAGS=-O3 -march=corei7-avx -Icityhash $(WARN) $(PROFILING)
LINKFLAGS=$(PROFILING)
LIBS=-lpthread
@@ -64,4 +66,4 @@
$(CXX) $(CXXFLAGS) -mssse3 -DSSSE3 -c -o $@ $<
cpu_sse2.o : cpu.cc $(DEPS)
- $(CXX) $(CXXFLAGS) -c -o $@ $<
+ $(CXX) $(CXXFLAGS) -msse2 -c -o $@ $<
And compile it:
[juser@proteusa01 vsearch-1.1.3]$ make -j 8 | tee Make.out
This creates the executable named vsearch.
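Because the modified CXXFLAGS use -march=corei7-avx, the resulting binary may use AVX instructions and can fail with an illegal-instruction error on a CPU that lacks them. A quick check, sketched here on the assumption of a Linux node with /proc/cpuinfo (the helper name cpu_has_flag is made up for this example), is to look for the avx flag on the machine where vsearch will run:

```shell
# Succeeds if the first "flags" line in /proc/cpuinfo lists the given flag.
# Returns non-zero if the flag is absent or /proc/cpuinfo is unavailable.
cpu_has_flag() {
    grep -m1 '^flags' /proc/cpuinfo 2>/dev/null | grep -qw "$1"
}

if cpu_has_flag avx; then
    echo "avx: supported"
else
    echo "avx: not reported; consider a more conservative -march"
fi
```

Run it on a compute node rather than only on the login node, since the two may not have the same CPU model.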
Test♯
This test runs the just-built vsearch in src/. The results of a run using the version installed on Proteus are included below:
[juser@proteusa01 eval]$ ./eval.sh v
Creating random test set
Running search
vsearch v1.1.3_linux_x86_64, 63.0GB RAM, 16 cores
https://github.com/torognes/vsearch
Reading file ./db.fsa 100%
52576347 nt in 380472 seqs, min 32, max 1875, avg 138
WARNING: 447 sequences shorter than 32 nucleotides discarded.
Indexing sequences 100%
Masking 100%
Counting unique k-mers 100%
Creating index of unique k-mers 100%
Searching 100%
Matching query sequences: 208000 of 208500 (99.76%)
318.97user 0.68system 0:24.15elapsed 1323%CPU (0avgtext+0avgdata 715288maxresident)k
0inputs+37864outputs (0major+62379minor)pagefaults 0swaps
Results
Total queries: 208500
True positives: 193500
False positives: 14500
False negatives: 15000
Recall: 92.81%
Precision: 93.03%
False negative rate: 7.19%
False discovery rate: 6.97%
F-score: 92.92%
Matched (id>=70%;cov>=90%): 154700
Matched percentage: 74.20%
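The summary rates in the output above are consistent with the standard definitions computed from the true positive, false positive and false negative counts. The sketch below recomputes them with awk. It illustrates the arithmetic and is not taken from eval.sh itself; the function name metrics is made up for this example.

```shell
# Derive the summary rates from true positives, false positives and
# false negatives, using the standard definitions.
metrics() {
    awk -v tpos="$1" -v fpos="$2" -v fneg="$3" 'BEGIN {
        recall    = tpos / (tpos + fneg)
        precision = tpos / (tpos + fpos)
        fscore    = 2 * precision * recall / (precision + recall)
        printf "Recall: %.2f%%\n",               100 * recall
        printf "Precision: %.2f%%\n",            100 * precision
        printf "False negative rate: %.2f%%\n",  100 * (1 - recall)
        printf "False discovery rate: %.2f%%\n", 100 * (1 - precision)
        printf "F-score: %.2f%%\n",              100 * fscore
    }'
}

# Counts copied from the output above.
metrics 193500 14500 15000
```

With these counts the function prints 92.81%, 93.03%, 7.19%, 6.97% and 92.92%, matching the reported values.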
Install♯
Copy the executable src/vsearch to any convenient location in your PATH.
Compiling 1.10.0♯
Version 1.10.0 has a configure script. Despite what the configure options imply, the resulting executable is not linked with bzip2 or zlib.
Configure♯
./configure CXXFLAGS="-O3 -march=corei7-avx -mtune=corei7-avx" CFLAGS="-O3 -march=corei7-avx -mtune=corei7-avx" --prefix=/mnt/HA/opt/vsearch/gcc/1.10.0 --enable-bzip2 --enable-zlib
Compiling 2.7.1♯
Please see the official instructions. Despite the instructions, sudo/root access is not required: you just have to set the prefix to a non-privileged location at the configure step.
Pre-configure♯
module load autoconf automake
./autogen.sh
Configure♯
./configure CXXFLAGS="-O3" CFLAGS="-O3" --prefix=/mnt/HA/groups/myrsrchGrp --enable-bzip2 --enable-zlib
Install♯
make -j 12 install
Then set your PATH and MANPATH appropriately.
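As a sketch of the environment setup, assuming the prefix from the configure step above and the usual autotools layout (executables in bin, man pages in share/man; check the actual install tree if it differs):

```shell
# Must match the --prefix passed to ./configure.
PREFIX=/mnt/HA/groups/myrsrchGrp

# Put the newly installed vsearch ahead of any other copy on PATH.
export PATH="$PREFIX/bin:$PATH"

# Assumed man page location under the prefix. If MANPATH was unset, the
# trailing colon keeps the system default man directories searchable.
export MANPATH="$PREFIX/share/man:${MANPATH:-}"
```

Adding these lines to a shell startup file such as ~/.bashrc makes the setting persistent across logins.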