mpiblast

mpiblast [2017/09/12 20:00]
jokinsey created
mpiblast [2017/10/10 18:07] (current)
jokinsey Changed PBS script to run in /scratch
Line 1: Line 1:
-==== mpiBlast ====
+==== mpiBLAST ====
 +{{ :​mpiblast.png?​nolink&​200|}} 
 + 
 +mpiBLAST is a freely available, open-source, parallel implementation of NCBI BLAST. mpiBLAST takes advantage of shared parallel computing resources, i.e. a cluster. This gives it access to more resources than NCBI BLAST, which can only take advantage of shared-memory multiprocessors (SMPs).
 + 
 +More information is available [[http://​www.mpiblast.org/​|here]]. ​  
 + 
 +==== Environment Setup ==== 
 + 
 +Edit your ''$HOME/.bashrc'' file so that it loads these modules.
 + 
 +<code>
 +module load gcc/4.5.0
 +module load openmpi/1.5.1
 +module load mpiblast/1.6.0
 +</code>
 + 
 +You may have to log out and log back in for the modules to load. You can check with the command ''module list''; the loaded modules should also be displayed on login.
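
As a complement to ''module list'', the following is a minimal sketch that checks whether the executables used later on this page are on the ''PATH''. The function name ''check_tools'' is hypothetical and not part of mpiBLAST or the module system.

```shell
#!/bin/bash
# Report which of the given commands are missing from PATH.
# Returns 0 if all are found, 1 otherwise.
check_tools() {
    local missing=0
    local tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

For example, ''check_tools mpirun mpiblast mpiformatdb'' should print nothing and return 0 once the three modules are loaded.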
 + 
 +Make a directory to contain the FASTA database that will be fragmented. Download the database and decompress it. 
 + 
 +<code>
 +razor-l1:jokinsey:~$ mkdir db
 +razor-l1:jokinsey:~$ cd db
 +razor-l1:jokinsey:~/db$ wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/mito.nt.gz
 +razor-l1:jokinsey:~/db$ gunzip mito.nt.gz
 +</code>
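
Before formatting, it can help to confirm that the download decompressed into a usable FASTA file. The sketch below counts header lines (lines starting with ''>''). The function name ''count_fasta_records'' is hypothetical, and the check is only a sanity test, not a full format validation.

```shell
#!/bin/bash
# Print the number of FASTA records (header lines starting with ">") in a file.
# Returns non-zero if the file is missing, empty, or contains no headers.
count_fasta_records() {
    local fasta=$1
    if [ ! -s "$fasta" ]; then
        echo "error: $fasta is missing or empty" >&2
        return 1
    fi
    grep -c '^>' "$fasta"
}
```

For example, ''count_fasta_records ~/db/mito.nt'' prints the number of sequences in the database file.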
 + 
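 +Create a ''$HOME/.ncbirc'' file with these values. The ''Shared'' path tells mpiBLAST where to find the database on shared storage, and the ''Local'' path is node-local scratch space where mpiBLAST keeps its working copies of the database fragments.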
 + 
 +<code>
 +[mpiBLAST]
 +Shared=/home/YourUserName/db
 +Local=/local_scratch/YourUserName
 +</code>
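
The configuration file can also be generated from the shell instead of typed by hand. This is a minimal sketch: the function name ''write_ncbirc'' is hypothetical, and it writes only the three lines shown above.

```shell
#!/bin/bash
# Write an mpiBLAST configuration file.
#   $1 = file to create
#   $2 = shared database directory
#   $3 = node-local scratch directory
write_ncbirc() {
    local target=$1 shared=$2 local_dir=$3
    cat > "$target" <<EOF
[mpiBLAST]
Shared=$shared
Local=$local_dir
EOF
}
```

Call it with the path of the configuration file from the step above, followed by the two directories, for example ''"$HOME/db"'' and ''"/local_scratch/$USER"''. This assumes your home directory is ''/home/YourUserName''. Note that the function overwrites an existing file.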
 + 
 +Format the database for parallel use by splitting it into fragments, one per processor. We will be using a node with 12 processors, so we pass the option **%%--nfrags=12%%**.
 + 
 +<code>
 +razor-l1:jokinsey:~$ mpiformatdb -i ~/db/mito.nt --nfrags=12
 +Reading input file
 +Done, read 2605891 lines
 +Database type unspecified, assuming nucleotide
 +Breaking mito.nt into 12 fragments
 +Executing: formatdb -i /home/jokinsey/db/mito.nt -p F -N 12 -o T
 +Created 12 fragments.
 +<<< Please make sure the formatted database fragments are placed in /home/jokinsey/db/ before executing mpiblast. >>>
 +</code>
 + 
 +==== Example Job ==== 
 + 
 +Create a directory ''​$HOME/​SCHED_PLACE''​ to store the PBS schedule files. Create another directory to store the PBS input scripts, along with the FASTA input and output files. 
 + 
 +<code>
 +razor-l1:jokinsey:~$ mkdir SCHED_PLACE
 +razor-l1:jokinsey:~$ mkdir testing
 +razor-l1:jokinsey:~$ cd testing
 +</code>
 + 
 +Create a FASTA query file named ''input'' with the sequence to search for.
 + 
 +<code>
 +>gi|45238842|gb|AY563103.1| Homo sapiens interleukin 2 receptor, alpha (IL2RA) gene, complete cds
 +GTCCATCTCAGAACCAAGAGTTGGGCCTCTTATTTACCAGAAAAATTGTGGGGGCTTTGTGATATGGCTT 
 +TAAAAAAATCTTGTAATTGCCAGGCGTGGTGGCTCACACCTGTAATCCCAGCACTTTGGGAGGCCGAGGT 
 +GGGTGAATCGCCTAAGGTCAGGAGTTCGAGACCAGCCTGACCAACATGGTGAAACTCCGTCTCTACTAAA 
 +AATACAAAAACTAGCTGGATGTGGTGACGCGTGCCTGTAATCCTAGCTACTCAGGAGGCTGACGCAGGAG 
 +AATCACTTGAACCTGGGAGGCAGAGGTTGCAGTGAGCCAAGATTGTGCCATTGCGCTCCAAAAAAAAAAA 
 +AAAAAAGACATTAACATAAATTTAAATATTTTATAATGACAATCCACATTAACTACTTAAAGCATAAGCT 
 +ATTTTCCAGGAGAGGCAGCAAGTGCATTCTACTCCCATGCCCAAGAAGAAAGGAGCGTGACTTTGGTGGG 
 +AGTACTAGGAGTTTCTACTGGAGCACTTGCCCGCAGAGTGAGAAACGTTCCTAGAGAGGAAGTTATACCT 
 +GCTGTGGAATTTAAGAGAATCTTGTCATATTTTGACAAGTTTTTTGAGATGGAAGTCTCACTCTGTCGCC 
 +</​code>​ 
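
Because the query is pasted by hand, stray characters can end up in the file. The sketch below checks that a file looks like nucleotide FASTA: the first line is a header, and every sequence line contains only IUPAC nucleotide codes. The function name ''is_nucleotide_fasta'' is hypothetical. The check is deliberately strict and also rejects trailing spaces and invisible characters.

```shell
#!/bin/bash
# Succeed if the file looks like a nucleotide FASTA file: the first line is a
# header, and every non-header, non-blank line holds only IUPAC nucleotide codes.
is_nucleotide_fasta() {
    local fasta=$1
    [ -s "$fasta" ] || return 1
    # The first line must be a header line.
    head -n 1 "$fasta" | grep -q '^>' || return 1
    # Fail if any sequence line contains a character outside the IUPAC set.
    if grep -v '^>' "$fasta" | grep -v '^[[:space:]]*$' \
            | grep -qi '[^ACGTUNRYKMSWBDHV]'; then
        return 1
    fi
    return 0
}
```

For example, ''is_nucleotide_fasta input && echo "query looks fine"'' can be run in ''$HOME/testing'' before submitting the job.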
 + 
 +Create a PBS input script and name it ''mpiBlastTest.pbs''.
 + 
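 +<code>
 +#!/bin/bash
 +#PBS -N MPIBLAST
 +#PBS -q tiny12core
 +#PBS -j oe
 +#PBS -m abe
 +#PBS -M jokinsey@uark.edu
 +#PBS -o MPIBLAST.$PBS_JOBID
 +#PBS -l nodes=1:ppn=12
 +#PBS -l walltime=02:00:00
 +
 +# Start in the directory the job was submitted from.
 +cd "$PBS_O_WORKDIR"
 +# Stage the query file into the per-job scratch directory and work there.
 +cp input "/scratch/$PBS_JOBID"
 +cd "/scratch/$PBS_JOBID"
 +
 +# Run mpiBLAST with 12 MPI processes and write the results to $HOME/testing/output.
 +mpirun -np 12 mpiblast -p blastn -d mito.nt -i input -o "$HOME/testing/output"
 +</code>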
 + 
 +The PBS script first copies the input file to the working directory, ''/scratch/$PBS_JOBID''. It then changes into that directory to run the computation and sends the output to ''$HOME/testing/output''.
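
The staging step can be made more defensive, so that the job stops early if the query file is missing or the scratch directory is not usable. This is a minimal sketch. The function name ''stage_input'' is hypothetical, and it assumes, as the script above does, that the scratch directory already exists when the job starts.

```shell
#!/bin/bash
# Copy a query file into a job scratch directory and print that directory.
# Fails if the query file is missing or the scratch directory is not writable.
stage_input() {
    local query=$1 scratch_dir=$2
    [ -f "$query" ] || { echo "error: query file $query not found" >&2; return 1; }
    [ -d "$scratch_dir" ] && [ -w "$scratch_dir" ] || {
        echo "error: scratch directory $scratch_dir is not writable" >&2
        return 1
    }
    cp "$query" "$scratch_dir/" || return 1
    echo "$scratch_dir"
}
```

Inside a PBS script this could replace the ''cp'' and second ''cd'' lines with ''workdir=$(stage_input input "/scratch/$PBS_JOBID") || exit 1'' followed by ''cd "$workdir"''.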
 + 
 +Then submit the job. 
 + 
 +<code>
 +razor-l3:jokinsey:~/testing$ qsub mpiBlastTest.pbs
 +</code>
 + 
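 +Note which directory the job is submitted from. The schedule file ''MPIBLAST.$PBS_JOBID'' is saved there, because the ''-o'' path in the script is relative to the submission directory. The example above submits from ''$HOME/testing'', where ''input'' lives, since the script copies ''input'' from ''$PBS_O_WORKDIR''. To keep the schedule files in ''$HOME/SCHED_PLACE'' instead, submit from that directory and make sure ''input'' is present there. The search results will be in ''$HOME/testing/output''.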
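
Once the job has finished, a quick check confirms that the results file was actually written. The sketch below reports the line count of a non-empty results file. The function name ''report_results'' is hypothetical.

```shell
#!/bin/bash
# Succeed if the results file exists and is non-empty, and print its line count.
report_results() {
    local results=$1
    if [ ! -s "$results" ]; then
        echo "no results yet in $results" >&2
        return 1
    fi
    echo "$(wc -l < "$results" | tr -d ' ') lines in $results"
}
```

For example, ''report_results "$HOME/testing/output"'' returns non-zero while the job is still queued or if the search wrote nothing.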
 + 
 + 
mpiblast.1505246459.txt.gz · Last modified: 2017/09/12 20:00 by jokinsey