mpiblast

Differences

This shows you the differences between two versions of the page.

mpiblast [2017/09/12 20:18]
jokinsey
mpiblast [2017/10/10 18:07] (current)
jokinsey Changed PBS script to run in /scratch
Line 1: Line 1:
-==== mpiBlast ====
+==== mpiBLAST ====
 +{{ :mpiblast.png?nolink&200|}}
  
 mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. It takes advantage of shared parallel computing resources such as a cluster, which gives it access to more resources than NCBI BLAST, which can only take advantage of shared-memory multiprocessors (SMPs).
Line 7: Line 8:
 ==== Environment Setup ====
  
 +Edit your ''$HOME/.bashrc'' file so that it loads these modules.
 +
 +<code>
 +module load gcc/4.5.0
 +module load openmpi/1.5.1
 +module load mpiblast/1.6.0
 +</code>
 +
 +You may have to log out and log back in for the modules to load. You can check with the command ''module list''; the loaded modules should also be displayed on login.
 +
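The check described above can be scripted. The following is a minimal sketch, not part of the original page: ''check_modules'' is a hypothetical helper that takes the text printed by ''module list'' and prints any of the three required modules that are missing. It assumes ''module list'' writes to standard error, as it does on many installations, hence the ''2>&1'' in the usage comment.

```shell
#!/bin/bash
# Modules this page asks you to load (taken from the .bashrc lines above).
required="gcc/4.5.0 openmpi/1.5.1 mpiblast/1.6.0"

# check_modules is a hypothetical helper, not a cluster-provided command.
# $1: the text printed by `module list`.
# Prints the required modules that do not appear in that text (empty if none).
check_modules() {
    loaded="$1"
    missing=""
    for m in $required; do
        case "$loaded" in
            *"$m"*) ;;                      # module name found in the listing
            *) missing="$missing $m" ;;     # remember what is absent
        esac
    done
    printf '%s\n' "${missing# }"            # drop the leading space
}

# Usage on the cluster (assumes `module list` prints to stderr):
#   check_modules "$(module list 2>&1)"
```

An empty result means all three modules are loaded.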
 +Make a directory to contain the FASTA database that will be fragmented. Download the database and decompress it.
 +
 +<code>
 +razor-l1:jokinsey:~$ mkdir db
 +razor-l1:jokinsey:~$ cd db
 +razor-l1:jokinsey:~/db$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/mito.nt.gz
 +razor-l1:jokinsey:~/db$ gunzip mito.nt.gz
 +</code>
 +
 +Create a ''$HOME/.ncbirc'' file with these values. The ''Shared'' path tells mpiBLAST where to find the database on shared storage, and ''Local'' names a node-local scratch directory it can use for fragment copies.
 +
 +<code>
 +[mpiBLAST]
 +Shared=/home/YourUserName/db
 +Local=/local_scratch/YourUserName
 +</code>
 +
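If you prefer not to edit the file by hand, the configuration can be written from the shell. This is a sketch under assumptions: ''write_ncbirc'' is a hypothetical helper, and it assumes your home directory is ''/home/<user>'' and your local scratch is ''/local_scratch/<user>'', matching the template above.

```shell
#!/bin/bash
# write_ncbirc is a hypothetical helper, not part of mpiBLAST.
# $1: path of the config file to write (normally "$HOME/.ncbirc").
# $2: user name to substitute for YourUserName in the template.
# Note: this overwrites the target file.
write_ncbirc() {
    rc_file="$1"
    user_name="$2"
    cat > "$rc_file" <<EOF
[mpiBLAST]
Shared=/home/$user_name/db
Local=/local_scratch/$user_name
EOF
}

# Usage on the cluster:
#   write_ncbirc "$HOME/.ncbirc" "$USER"
```

Check the result with ''cat $HOME/.ncbirc'' before formatting the database.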
 +Format the database for parallel use by splitting it into fragments, one per processor. We will be using a node with 12 processors, so we pass the option ''%%--nfrags=12%%''.
 +
 +<code>
 +razor-l1:jokinsey:~$ mpiformatdb -i ~/db/mito.nt --nfrags=12
 +Reading input file
 +Done, read 2605891 lines
 +Database type unspecified, assuming nucleotide
 +Breaking mito.nt into 12 fragments
 +Executing: formatdb -i /home/jokinsey/db/mito.nt -p F -N 12 -o T
 +Created 12 fragments.
 +<<< Please make sure the formatted database fragments are placed in /home/jokinsey/db/ before executing mpiblast. >>>
 +</code>
 +
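The fragment count above follows the one-fragment-per-processor rule used on this page. As a small illustration, the sketch below derives that number from a PBS resource request of the form ''nodes=N:ppn=M'', the same form used in the job script in the Example Job section. ''procs_from_request'' is a hypothetical helper and handles only this simple form.

```shell
#!/bin/bash
# procs_from_request is a hypothetical helper, not a PBS or mpiBLAST command.
# $1: a resource request of the form nodes=N:ppn=M
# Prints N * M, the total number of processors requested.
procs_from_request() {
    req="$1"
    nodes=${req#nodes=}     # strip the leading "nodes="
    nodes=${nodes%%:*}      # keep everything before the first ":"
    ppn=${req##*ppn=}       # strip everything up to and including "ppn="
    ppn=${ppn%%:*}          # drop any trailing ":..." properties
    echo $((nodes * ppn))
}

# Usage on the cluster, following this page's rule of one fragment per processor:
#   mpiformatdb -i ~/db/mito.nt --nfrags="$(procs_from_request nodes=1:ppn=12)"
```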
 +==== Example Job ====
 +
 +Create a directory ''$HOME/SCHED_PLACE'' to store the PBS schedule files. Create another directory to store the PBS input scripts, along with the FASTA input and output files.
 +
 +<code>
 +razor-l1:jokinsey:~$ mkdir SCHED_PLACE
 +razor-l1:jokinsey:~$ mkdir testing
 +razor-l1:jokinsey:~$ cd testing
 +</code>
 +
 +Create a FASTA query file containing the sequence to search for, and name it ''input''.
 +
 +<code>
 +>gi|45238842|gb|AY563103.1| Homo sapiens interleukin 2 receptor, alpha (IL2RA) gene, complete cds
 +GTCCATCTCAGAACCAAGAGTTGGGCCTCTTATTTACCAGAAAAATTGTGGGGGCTTTGTGATATGGCTT
 +TAAAAAAATCTTGTAATTGCCAGGCGTGGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGGT
 +GGGTGAATCGCCTAAGGTCAGGAGTTCGAGACCAGCCTGACCAACATGGTGAAACTCCGTCTCTACTAAA
 +AATACAAAAACTAGCTGGATGTGGTGACGCGTGCCTGTAATCCTAGCTACTCAGGAGGCTGACGCAGGAG
 +AATCACTTGAACCTGGGAGGCAGAGGTTGCAGTGAGCCAAGATTGTGCCATTGCGCTCCAAAAAAAAAAA
 +AAAAAAGACATTAACATAAATTTAAATATTTTATAATGACAATCCACATTAACTACTTAAAGCATAAGCT
 +ATTTTCCAGGAGAGGCAGCAAGTGCATTCTACTCCCATGCCCAAGAAGAAAGGAGCGTGACTTTGGTGGG
 +AGTACTAGGAGTTTCTACTGGAGCACTTGCCCGCAGAGTGAGAAACGTTCCTAGAGAGGAAGTTATACCT
 +GCTGTGGAATTTAAGAGAATCTTGTCATATTTTGACAAGTTTTTTGAGATGGAAGTCTCACTCTGTCGCC
 +</code>
 +
 +Create a PBS input script and name it ''mpiBlastTest.pbs''.
 +
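 +<code>
 +#!/bin/bash
 +#PBS -N MPIBLAST
 +#PBS -q tiny12core
 +#PBS -j oe
 +#PBS -m abe
 +#PBS -M jokinsey@uark.edu
 +#PBS -o MPIBLAST.$PBS_JOBID
 +#PBS -l nodes=1:ppn=12
 +#PBS -l walltime=02:00:00
 +
 +# Start in the directory the job was submitted from, which holds the input file.
 +cd "$PBS_O_WORKDIR"
 +# Copy the query into the per-job scratch directory and work there.
 +cp input "/scratch/$PBS_JOBID"
 +cd "/scratch/$PBS_JOBID"
 +
 +# Run mpiBLAST with 12 processes and write the results to the home directory.
 +mpirun -np 12 mpiblast -p blastn -d mito.nt -i input -o "$HOME/testing/output"
 +</code>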
 +
 +In the PBS script, we first copy the input file to the directory we will be working in, ''/scratch/$PBS_JOBID''. Then we change into that directory, run the computation, and send the output to ''$HOME/testing/output''.
 +
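The copy-then-change-directory step can be made more defensive. The sketch below is an illustration rather than part of the original job script: ''stage_input'' is a hypothetical helper that stops early if the input file is missing or the copy fails, so the search does not start in the wrong place. It also creates the working directory if needed, which is an assumption; on this cluster ''/scratch/$PBS_JOBID'' appears to exist already when the job starts.

```shell
#!/bin/bash
# stage_input is a hypothetical helper, not part of PBS or mpiBLAST.
# $1: path to the input file
# $2: working directory to stage it into
# On success the current directory is the working directory.
stage_input() {
    src="$1"
    workdir="$2"
    [ -f "$src" ] || { echo "missing input: $src" >&2; return 1; }
    mkdir -p "$workdir" || return 1     # assumption: create it if absent
    cp "$src" "$workdir/" || return 1
    cd "$workdir" || return 1
}

# Usage inside the job script, replacing the cp and second cd lines:
#   cd "$PBS_O_WORKDIR"
#   stage_input input "/scratch/$PBS_JOBID" || exit 1
```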
 +Then submit the job.
 +
 +<code>
 +razor-l3:jokinsey:~/testing$ qsub mpiBlastTest.pbs
 +</code>
 +
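 +Note that the command above submits the job from ''$HOME/testing'', the directory that holds ''input'', because the script copies ''input'' from the submission directory (''$PBS_O_WORKDIR''). The schedule file, ''MPIBLAST.$PBS_JOBID'', is saved in whichever directory you submit from; to collect these files in ''$HOME/SCHED_PLACE'' instead, submit from that directory and adjust the path to ''input'' accordingly. The search output will be in ''$HOME/testing/output''.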
  
  
  
mpiblast.1505247503.txt.gz · Last modified: 2017/09/12 20:18 by jokinsey