TopHat

TopHat is a fast splice junction mapper for RNA-Seq reads. TopHat is a collaborative effort among Daehwan Kim and Steven Salzberg in the Center for Computational Biology at Johns Hopkins University, and Cole Trapnell in the Genome Sciences Department at the University of Washington. You can find more information on TopHat here.

Environment Setup

To work with TopHat, we need to load the modules for TopHat and its dependencies. The easiest way to do this is to add the lines below to the .bashrc file in your $HOME directory.

module load bowtie2/2.2.3
module load gcc/4.8.2
module load boost/1.57.0
module load tophat/2.1.1
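
After opening a new shell (so that .bashrc is read again), it can be worth confirming that the modules actually put the tools on your PATH. The sketch below is only an illustration: missing_tools is a hypothetical helper name, not part of TopHat or the module system, and it checks only that the commands can be found, not which versions they are.

```shell
#!/bin/bash
# Hypothetical helper: print every command name given as an argument
# that cannot be found on PATH.
missing_tools() {
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}

# Check the commands used on this page.
missing=$(missing_tools bowtie2 bowtie2-build tophat)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names print on one line.
    echo "Not found on PATH:" $missing
else
    echo "All TopHat tools found."
fi
```

If any name is reported as missing, check the module lines in .bashrc and log in again.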

In your $HOME directory, create a directory from which you will run the TopHat jobs.

razor-l2:jokinsey:~$ mkdir TOPHAT-JOBS

Copy the sample data from Salmon, which contains the reads and reference that we will use to run TopHat, into this directory.

razor-l1:jokinsey:~/TOPHAT-JOBS$ cp /share/apps/bioinformatics/salmon/Salmon-0.8.2_linux_x86_64/sample_data.tgz .
razor-l1:jokinsey:~/TOPHAT-JOBS$ tar -xzf sample_data.tgz

Once you have the sample data, use Bowtie2 to build an index that we can use to run TopHat.

razor-l1:jokinsey:~/TOPHAT-JOBS$ cd sample_data
razor-l1:jokinsey:~/TOPHAT-JOBS/sample_data$ bowtie2-build transcripts.fasta transcripts

This reference is small, so we can build the index interactively; for a larger reference we would submit a job to build the index. Your sample_data folder should now contain your reference transcripts.fasta, the index files, and your reads reads_1.fastq and reads_2.fastq.
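
To confirm that the index build finished, you can check for the index files by name. This is a minimal sketch, assuming the six .bt2 files that bowtie2-build writes for a small index; missing_index_files is a hypothetical helper, and the prefix transcripts matches the bowtie2-build command above. Run it from inside the sample_data folder.

```shell
#!/bin/bash
# Hypothetical helper: list the Bowtie2 index files that are missing
# for the given index prefix (small-index suffixes assumed).
missing_index_files() {
    local prefix="$1" suffix
    for suffix in 1.bt2 2.bt2 3.bt2 4.bt2 rev.1.bt2 rev.2.bt2; do
        [ -f "$prefix.$suffix" ] || echo "$prefix.$suffix"
    done
}

missing=$(missing_index_files transcripts)
if [ -z "$missing" ]; then
    echo "Index is complete."
else
    echo "Missing index files:"
    echo "$missing"
fi
```

A very large reference produces files ending in .bt2l instead, which this sketch does not look for.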

Example Job

To run the example job, create a PBS file named tophat.pbs in your TOPHAT-JOBS directory with the information below.

#!/bin/bash
#PBS -N TopHat
#PBS -q tiny12core
#PBS -j oe
#PBS -o tophat.$PBS_JOBID
#PBS -l nodes=1:ppn=12
#PBS -l walltime=0:05:00

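# Start in the directory the job was submitted from
cd $PBS_O_WORKDIR

# Stage the sample data in the job's scratch directory and work there
cp -r sample_data /scratch/$PBS_JOBID
cd /scratch/$PBS_JOBID

# Align the reads against the Bowtie2 index; output goes to ./tophat_out by default
tophat sample_data/transcripts sample_data/reads_1.fastq,sample_data/reads_2.fastq

# Copy everything back to a per-job directory under the submission directory
mkdir $PBS_O_WORKDIR/tophat.$PBS_JOBID
cp -r * $PBS_O_WORKDIR/tophat.$PBS_JOBID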
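
The script requests 12 cores (ppn=12), but TopHat runs on a single thread unless told otherwise with -p (--num-threads). Also, files joined by a comma are treated as more reads of the same kind, whereas paired-end mates are passed as two separate arguments. The following is a minimal sketch of an alternative command line, assuming reads_1.fastq and reads_2.fastq are paired-end mates (this page does not say so explicitly). tophat_cmd is a hypothetical helper that only prints the command, so it can be inspected before being pasted into tophat.pbs.

```shell
#!/bin/bash
# Hypothetical helper: print (not run) a paired-end TopHat command line.
# Arguments: index prefix, mate 1 file, mate 2 file, thread count (default 1).
tophat_cmd() {
    local index="$1" mate1="$2" mate2="$3" threads="${4:-1}"
    # -p sets the number of threads; -o names the output directory.
    echo "tophat -p $threads -o tophat_out $index $mate1 $mate2"
}

tophat_cmd sample_data/transcripts sample_data/reads_1.fastq sample_data/reads_2.fastq 12
```

The thread count of 12 is chosen here only to match the ppn value in the script; whether the extra threads help on such a small sample is not something this page measures.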

Now all that's left to do is submit the job.

razor-l1:jokinsey:~/TOPHAT-JOBS$ qsub tophat.pbs

You should see the output in the directory tophat.$PBS_JOBID/tophat_out. You can find information on how to interpret the output here.
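
A quick way to tell whether the run completed is to look for the main result files. This is a minimal sketch, assuming the file names that TopHat 2 normally writes to its output directory; missing_outputs is a hypothetical helper, and the directory to check is passed as the first argument (it defaults to tophat_out in the current directory).

```shell
#!/bin/bash
# Hypothetical helper: list the expected TopHat result files that are
# absent from the given output directory.
missing_outputs() {
    local dir="$1" name
    for name in accepted_hits.bam junctions.bed insertions.bed \
                deletions.bed align_summary.txt; do
        [ -e "$dir/$name" ] || echo "$name"
    done
}

out_dir="${1:-tophat_out}"
missing=$(missing_outputs "$out_dir")
if [ -z "$missing" ]; then
    echo "All expected files are present in $out_dir."
else
    echo "Missing from $out_dir:"
    echo "$missing"
fi
```

If files are missing, the job log named by the -o line in tophat.pbs is the first place to look for errors.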