mpiblast

mpiblast [2017/09/12 20:00]
jokinsey created
mpiblast [2017/09/12 22:40]
wlfarris
 +==== mpiBLAST ====
 +{{ :​mpiblast.png?​nolink&​200|}} 
 + 
 +mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. It takes advantage of shared parallel computing resources, i.e. a cluster, which gives it access to more resources than NCBI BLAST, which can only take advantage of shared-memory multiprocessors (SMPs).
 + 
 +More information is available [[http://www.mpiblast.org/|here]].
 + 
 +==== Environment Setup ==== 
 + 
 +Edit your ''$HOME/.bashrc'' file so that it loads these modules.
 + 
 +<code>
 +module load gcc/4.5.0
 +module load openmpi/1.5.1
 +module load mpiblast/1.6.0
 +</code>
 + 
 +You may have to log out and log back in for the modules to load. You can check with the command ''module list''; the loaded modules should also be displayed at login.
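The check described above can be scripted. The following is a minimal sketch, not part of the original page: the helper name `missing_modules` and the sample listing are hypothetical, and the sample text only approximates what ''module list'' prints.

```shell
#!/bin/bash
# Sketch: report which required modules are absent from a "module list" listing.
# missing_modules is a hypothetical helper; it takes the listing text as its
# first argument and the required module names as the remaining arguments.
missing_modules() {
    local loaded="$1"
    shift
    local m
    for m in "$@"; do
        case "$loaded" in
            *"$m"*) ;;            # module name appears in the listing
            *) echo "$m" ;;       # module name not found: report it
        esac
    done
}

# Hypothetical sample listing in which mpiblast has not been loaded yet.
sample="Currently Loaded Modulefiles:
  1) gcc/4.5.0   2) openmpi/1.5.1"

missing=$(missing_modules "$sample" gcc/4.5.0 openmpi/1.5.1 mpiblast/1.6.0)
echo "missing: ${missing:-none}"
```

On the cluster, the sample text would be replaced by the real listing, for example `"$(module list 2>&1)"`, on the assumption that the module system writes its listing to standard error.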
 + 
 +Make a directory to contain the FASTA database that will be fragmented. Download the database and decompress it. 
 + 
 +<code>
 +razor-l1:jokinsey:~$ mkdir db
 +razor-l1:jokinsey:~$ cd db
 +razor-l1:jokinsey:~/db$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/mito.nt.gz
 +razor-l1:jokinsey:~/db$ gunzip mito.nt.gz
 +</code>
 + 
 +Create a ''$HOME/.ncbirc'' file with these values. The ''Shared'' path tells mpiBLAST where to access the FASTA database.
 + 
 +<code>
 +[mpiBLAST]
 +Shared=/home/YourUserName/db
 +Local=/local_scratch/YourUserName
 +</code>
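To avoid typing the user name by hand, the values above can be generated from the shell. This is a sketch only: the output name `ncbirc.preview` is a hypothetical scratch file, chosen so that an existing configuration file in the home directory is not overwritten, and the paths simply follow the `/home/YourUserName` layout shown above.

```shell
#!/bin/bash
# Sketch: generate the [mpiBLAST] section for the current user.
# "ncbirc.preview" is a hypothetical scratch file in the current directory;
# review it, then copy it over the configuration file described above.
user_name="${USER:-$(id -un)}"
preview="ncbirc.preview"

cat > "$preview" <<EOF
[mpiBLAST]
Shared=/home/${user_name}/db
Local=/local_scratch/${user_name}
EOF

# Print the result so it can be checked before it is copied into place.
cat "$preview"
```

Writing to a preview file first is a deliberate choice here: the generated paths are assumptions about the cluster layout and should be checked before use.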
 + 
 +Format the database for parallel use by splitting it into fragments, one per processor. We will be using a node with 12 processors, so we will include the option ''%%--nfrags=12%%''.
 + 
 +<code>
 +razor-l1:jokinsey:~$ mpiformatdb -i ~/db/mito.nt --nfrags=12
 +Reading input file
 +Done, read 2605891 lines
 +Database type unspecified, assuming nucleotide
 +Breaking mito.nt into 12 fragments
 +Executing: formatdb -i /home/jokinsey/db/mito.nt -p F -N 12 -o T
 +Created 12 fragments.
 +<<< Please make sure the formatted database fragments are placed in /home/jokinsey/db/ before executing mpiblast. >>>
 +</code>
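The fragment count used above follows from the resources requested for the job. As a small worked example, assuming one fragment per processor as in the text, the value can be derived from the node and processor counts; the variable names are illustrative only.

```shell
#!/bin/bash
# Sketch: derive the fragment count from the requested resources,
# assuming one database fragment per processor as in the text above.
nodes=1     # number of nodes to be requested
ppn=12      # processors per node
nfrags=$((nodes * ppn))

# Print the formatting command rather than running it, so the value can be checked first.
echo "mpiformatdb -i ~/db/mito.nt --nfrags=${nfrags}"
```

With the values shown, this prints the same command as the session above.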
 + 
 +==== Example Job ==== 
 + 
 +Create a directory ''​$HOME/​SCHED_PLACE''​ to store the PBS schedule files. Create another directory to store the PBS input scripts, along with the FASTA input and output files. 
 + 
 +<​code>​ 
 +razor-l1:​jokinsey:​~$ mkdir SCHED_PLACE 
 +razor-l1:​jokinsey:​~$ mkdir testing 
 +razor-l1:​jokinsey:​~$ cd testing 
 +</​code>​ 
 + 
 +Create a FASTA query file containing the sequence to search for and name it ''input''.
 + 
 +<​code>​ 
 +>​gi|45238842|gb|AY563103.1| Homo sapiens interleukin 2 receptor, alpha (IL2RA) gene, complete cds 
 +GTCCATCTCAGAACCAAGAGTTGGGCCTCTTATTTACCAGAAAAATTGTGGGGGCTTTGTGATATGGCTT 
 +TAAAAAAATCTTGTAATTGCCAGGCGTGGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGGT 
 +GGGTGAATCGCCTAAGGTCAGGAGTTCGAGACCAGCCTGACCAACATGGTGAAACTCCGTCTCTACTAAA 
 +AATACAAAAACTAGCTGGATGTGGTGACGCGTGCCTGTAATCCTAGCTACTCAGGAGGCTGACGCAGGAG 
 +AATCACTTGAACCTGGGAGGCAGAGGTTGCAGTGAGCCAAGATTGTGCCATTGCGCTCCAAAAAAAAAAA 
 +AAAAAAGACATTAACATAAATTTAAATATTTTATAATGACAATCCACATTAACTACTTAAAGCATAAGCT 
 +ATTTTCCAGGAGAGGCAGCAAGTGCATTCTACTCCCATGCCCAAGAAGAAAGGAGCGTGACTTTGGTGGG 
 +AGTACTAGGAGTTTCTACTGGAGCACTTGCCCGCAGAGTGAGAAACGTTCCTAGAGAGGAAGTTATACCT 
 +GCTGTGGAATTTAAGAGAATCTTGTCATATTTTGACAAGTTTTTTGAGATGGAAGTCTCACTCTGTCGCC 
 +</​code>​ 
 + 
 +Create a PBS input script and name it ''mpiBlastTest.pbs''.
 + 
 +<​code>​ 
 +#!/bin/bash  
 +#PBS -N MPIBLAST 
 +#PBS -q tiny12core 
 +#PBS -j oe 
 +#PBS -m abe 
 +#PBS -M YourUserName@uark.edu 
 +#PBS -o MPIBLAST.$PBS_JOBID 
 +#PBS -l nodes=1:ppn=12 
 +#PBS -l walltime=02:00:00 
 + 
 +cd "$PBS_O_WORKDIR"
 +
 +mpirun -np 12 mpiblast -p blastn -d mito.nt -i /home/YourUserName/testing/input -o /home/YourUserName/testing/output
 +</​code>​ 
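The script above hard-codes the process count as 12, which must match ''nodes=1:ppn=12''. One possible variation, offered as a sketch rather than as the site's recommended practice, derives the count from `$PBS_NODEFILE`, which Torque/PBS sets inside a job to a file with one line per allocated processor slot. The stand-in file built below exists only so the example can be tried outside a job.

```shell
#!/bin/bash
# Sketch: derive the mpirun process count from the PBS node file.
# Outside a PBS job PBS_NODEFILE is unset, so a stand-in file with
# 12 entries is created purely for demonstration.
if [ -z "${PBS_NODEFILE:-}" ]; then
    PBS_NODEFILE="$(mktemp)"
    for i in $(seq 12); do
        echo "localhost"
    done > "$PBS_NODEFILE"
fi

# One line per allocated processor slot; tr strips any padding added by wc.
np=$(wc -l < "$PBS_NODEFILE" | tr -d ' ')

# Print the command instead of running it, so it can be inspected first.
echo "mpirun -np ${np} mpiblast -p blastn -d mito.nt -i /home/YourUserName/testing/input -o /home/YourUserName/testing/output"
```

Inside a real job, only the `np=` line and the `mpirun` command (without `echo`) would be needed.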
 + 
 +Notice the line ''cd "$PBS_O_WORKDIR"'': it changes to the directory from which the job was submitted, so the job starts running from there.
 + 
 +Then submit the job. 
 + 
 +<​code>​ 
 +razor-l1:jokinsey:~/SCHED_PLACE$ qsub /home/YourUserName/testing/mpiBlastTest.pbs
 +</​code>​ 
 + 
 +Notice that we submit the job from the directory ''$HOME/SCHED_PLACE''. The schedule file is saved in this directory because the script runs the job from the directory it was submitted from. The search output will be in ''$HOME/testing/output''.
 + 
 + 
mpiblast.txt · Last modified: 2017/10/10 18:07 by jokinsey