Skip to content

Trinity for RNA-Seq De novo Assembly

Trinity is software for RNA-Seq De novo Assembly.[1]

Installed Version

A Singularity container for Trinity is installed on Picotte.[2]

Load the appropriate modulefile:

trinity/2.14.0

Using

The Singularity image is installed in the path

$TRINITYDIR/Trinity.simg

Example command:

singularity exec --bind=`pwd`:/tmp -e $TRINITYDIR/Trinity.simg \
          Trinity \
          --seqType fq \
          --left /tmp/reads.left.fq.gz  \
          --right /tmp/reads.right.fq.gz \
          --max_memory 1G --CPU 4 \
          --output /tmp/trinity_out_dir

Example job script using the sample data distributed with the Trinity source code, trinityrnaseq/sample_data/test_Trinity_Assembly:

#!/bin/bash
#SBATCH --nodes=1
#SBATCH --cpus-per-task=8
#SBATCH --time=1:00:00
#SBATCH --mem=4G

module load trinity

# NOTE
# this example assumes
# 1) the data files reads.left.fq.gz and reads.right.fq.gz are in the same directory as this job script
# 2) there is an output directory "trinity_out_dir" in the same directory as this job script

singularity exec --bind=`pwd`:/tmp -e $TRINITYDIR/Trinity.simg \
          Trinity \
          --seqType fq \
          --left /tmp/reads.left.fq.gz  \
          --right /tmp/reads.right.fq.gz \
          --max_memory 4G \
          --CPU $SLURM_CPUS_PER_TASK \
          --output /tmp/trinity_out_dir

Truncated output from the example:

/ifs/opt/src/trinityrnaseq/sample_data/test_Trinity_Assembly

     ______  ____   ____  ____   ____  ______  __ __
    |      ||    \ |    ||    \ |    ||      ||  |  |
    |      ||  D  ) |  | |  _  | |  | |      ||  |  |
    |_|  |_||    /  |  | |  |  | |  | |_|  |_||  ~  |
      |  |  |    \  |  | |  |  | |  |   |  |  |___, |
      |  |  |  .  \ |  | |  |  | |  |   |  |  |     |
      |__|  |__|_||____||__|__||____|  |__|  |____/

    Trinity-v2.14.0

Left read files: $VAR1 = [
          '/tmp/reads.left.fq.gz'
        ];
Right read files: $VAR1 = [
          '/tmp/reads.right.fq.gz'
        ];
Trinity version: Trinity-v2.14.0
-currently using the latest production release of Trinity.

Tuesday, November 8, 2022: 12:18:09 CMD: java -Xmx64m -XX:ParallelGCThreads=2  -jar /usr/local/bin/util/support_scripts/ExitTester.jar 0
Tuesday, November 8, 2022: 12:18:09 CMD: java -Xmx4g -XX:ParallelGCThreads=2  -jar /usr/local/bin/util/support_scripts/ExitTester.jar 1
Tuesday, November 8, 2022: 12:18:09 CMD: mkdir -p /tmp/trinity_out_dir/chrysalis

----------------------------------------------------------------------------------
-------------- Trinity Phase 1: Clustering of RNA-Seq Reads  ---------------------
----------------------------------------------------------------------------------

---------------------------------------------------------------
------------ In silico Read Normalization ---------------------
-- (Removing Excess Reads Beyond 200 Coverage --
---------------------------------------------------------------

# running normalization on reads: $VAR1 = [
          [
            '/tmp/reads.left.fq.gz'
          ],
          [
            '/tmp/reads.right.fq.gz'
          ]
        ];

Tuesday, November 8, 2022: 12:18:09 CMD: /usr/local/bin/util/insilico_read_normalization.pl --seqType fq --JM 4G  --max_cov 200 --min_cov 1 --CPU 8 --output /tmp/trinity_out_dir/insilico_read_normalization --max_CV 10000  --left /tmp/reads.left.fq.gz --right /tmp/reads.right.fq.gz --pairs_together  --PARALLEL_STATS
-prepping seqs
Converting input files. (both directions in parallel)CMD: seqtk-trinity seq -A -R 1  <(gunzip -c /tmp/reads.left.fq.gz) >> left.fa

... [other output clipped] ...

All commands completed successfully. :-)

** Harvesting all assembled transcripts into a single multi-fasta file...

Tuesday, November 8, 2022: 12:18:38 CMD: find /tmp/trinity_out_dir/read_partitions/ -name '*inity.fasta'  | /usr/local/bin/util/support_scripts/partitioned_trinity_aggregator.pl --token_prefix TRINITY_DN --output_prefix /tmp/trinity_out_dir/Trinity.tmp
* [Tue Nov  8 12:18:39 2022] Running CMD: /usr/local/bin/util/support_scripts/salmon_runner.pl Trinity.tmp.fasta /tmp/trinity_out_dir/both.fa 8
* [Tue Nov  8 12:18:40 2022] Running CMD: /usr/local/bin/util/support_scripts/filter_transcripts_require_min_cov.pl Trinity.tmp.fasta /tmp/trinity_out_dir/both.fa salmon_outdir/quant.sf 2 > /tmp/trinity_out_dir.Trinity.fasta
Tuesday, November 8, 2022: 12:18:40 CMD: /usr/local/bin/util/support_scripts/get_Trinity_gene_to_trans_map.pl /tmp/trinity_out_dir.Trinity.fasta > /tmp/trinity_out_dir.Trinity.fasta.gene_trans_map

#############################################################################
Finished.  Final Trinity assemblies are written to /tmp/trinity_out_dir.Trinity.fasta
#############################################################################

Grid Config

Currently unavailable on Picotte.

OBSOLETE - Grid Config

There is a default grid config file given by the environment variable TRINITYGRIDCONF, which may or may not work for your application. See the Trinity documentation for details. The default grid config file is as follows:

#--------------------------------------------------------------------------------------------
# grid type:
grid=SGE
# template for a grid submission
cmd=qsub -cwd -j y -l h_rt=2:00:00
# number of grid submissions to be maintained at steady state by the Trinity submission system
max_nodes=8
# number of commands that are batched into a single grid submission job.
cmds_per_node=1
#--------------------------------------------------------------------------------------------

References

[1] Trinity RNA-Seq

[2] Trinity Documentation - Running Trinity using Singularity