TopHat is a fast splice junction mapper for RNA-Seq reads. TopHat is a collaborative effort among Daehwan Kim and Steven Salzberg in the Center for Computational Biology at Johns Hopkins University, and Cole Trapnell in the Genome Sciences Department at the University of Washington. You can find more information on TopHat here.
To work with TopHat, we need to load the modules for TopHat and its dependencies. The easiest way to do this is to add the lines below to the .bashrc file in your $HOME directory.
module load bowtie2/2.2.3
module load gcc/4.8.2
module load boost/1.57.0
module load tophat/2.1.1
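After opening a new shell, it can be worth confirming that the modules actually put the tools on your PATH. The following is a minimal sketch of such a check; the helper name missing_tools is hypothetical, and it assumes the modules provide executables named bowtie2, bowtie2-build, and tophat.

```shell
# Hypothetical helper: print each command from the argument list that
# cannot be found on PATH. Prints nothing when every command is available.
missing_tools() {
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "$tool"
        fi
    done
}
```

Calling `missing_tools bowtie2 bowtie2-build tophat` should print nothing if the module lines took effect; any name it prints points to a module that did not load.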
In your $HOME directory, create another directory from which you will run the TopHat jobs.
razor-l2:jokinsey:~$ mkdir TOPHAT-JOBS
Copy the sample data from Salmon, which contains the reads and reference that we will use to run TopHat.
razor-l1:jokinsey:~/TOPHAT-JOBS/tophat.3583803.sched$ cp /share/apps/bioinformatics/salmon/Salmon-0.8.2_linux_x86_64/sample_data.tgz .
razor-l1:jokinsey:~/TOPHAT-JOBS/tophat.3583803.sched$ tar -xzf sample_data.tgz
Once you have the sample data, use Bowtie2 to build an index that we can use to run TopHat.
razor-l1:jokinsey:~/TOPHAT-JOBS/tophat.3583803.sched$ cd sample_data
razor-l1:jokinsey:~/TOPHAT-JOBS/tophat.3583803.sched$ bowtie2-build transcripts.fasta transcripts
This is a small sample reference; for a larger one we would submit a job to build the index. Your sample_data folder should now contain your reference transcripts.fasta, the index files, and your reads reads_1.fastq and reads_2.fastq.
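Before submitting a job, a quick check that all of these files are in place can save a failed run. The sketch below is one way to do it; the helper name missing_inputs is hypothetical, and it assumes the index prefix transcripts used above and the six-file small (.bt2) index layout that bowtie2-build writes for a reference of this size.

```shell
# Hypothetical helper: print each expected input file that is missing
# from the given directory. Prints nothing when everything is present.
# Assumes the index prefix "transcripts" and a small (.bt2) Bowtie2 index.
missing_inputs() {
    dir=$1
    for f in transcripts.fasta reads_1.fastq reads_2.fastq \
             transcripts.1.bt2 transcripts.2.bt2 \
             transcripts.3.bt2 transcripts.4.bt2 \
             transcripts.rev.1.bt2 transcripts.rev.2.bt2; do
        [ -e "$dir/$f" ] || echo "$f"
    done
}
```

Running `missing_inputs sample_data` from the directory that holds sample_data lists anything that still needs to be copied or built.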
To run the example job, create a PBS file named tophat.pbs with the information below.
#!/bin/bash
#PBS -N TopHat
#PBS -q tiny12core
#PBS -j oe
#PBS -o tophat.$PBS_JOBID
#PBS -l nodes=1:ppn=12
#PBS -l walltime=0:05:00
cd $PBS_O_WORKDIR
cp -r sample_data /scratch/$PBS_JOBID
cd /scratch/$PBS_JOBID
tophat sample_data/transcripts sample_data/reads_1.fastq,sample_data/reads_2.fastq
mkdir $PBS_O_WORKDIR/tophat.$PBS_JOBID
cp -r * $PBS_O_WORKDIR/tophat.$PBS_JOBID
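The script above requests 12 cores, but TopHat uses a single thread unless the -p option is given. The sketch below shows how a command line with explicit thread count and output directory could be assembled; it only prints the command and does not run it. The helper name tophat_cmd is hypothetical, while -p (threads) and -o (output directory) are standard TopHat options. It joins the read files with commas, as the script above does, which TopHat treats as several single-end files; for paired-end mapping the two files would instead be separated by a space.

```shell
# Hypothetical helper: print (not run) a tophat command line.
# Arguments: <threads> <output_dir> <index_prefix> <reads_file>...
tophat_cmd() {
    threads=$1
    outdir=$2
    index=$3
    shift 3
    # Join the remaining arguments (read files) with commas.
    reads=$(IFS=,; echo "$*")
    echo "tophat -p $threads -o $outdir $index $reads"
}
```

For example, `tophat_cmd 12 tophat_out sample_data/transcripts sample_data/reads_1.fastq sample_data/reads_2.fastq` prints a command that matches the one in the script, with `-p 12 -o tophat_out` added.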
Now all that's left to do is submit the job.
razor-l1:jokinsey:~/TOPHAT-JOBS$ qsub tophat.pbs
You should see the output in the directory tophat.$PBS_JOBID/tophat_out. You can find information on how to interpret the output here.
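As a first sanity check on a finished run, you can count the splice junctions TopHat reported. The sketch below assumes the usual TopHat 2 output layout, in which the output directory contains a junctions.bed file whose first line is a track header; the helper name count_junctions is hypothetical.

```shell
# Hypothetical helper: count junction records in a TopHat junctions.bed
# file, skipping the "track" header line and any blank lines.
count_junctions() {
    awk '!/^track/ && NF { n++ } END { print n + 0 }' "$1"
}
```

It would be called with the path to junctions.bed inside the output directory described above. A count of 0 is not necessarily an error for this data set, since the sample reference consists of transcript sequences rather than a genome.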