Computational Biology Research Group University of Oxford

Basic Local Alignment Search Tool.


Command line BLAST - blastall

Information that blastall requires:

  • the query sequence name
  • the database name
  • the "flavour" of the blast search you want to run
  • although optional, it is a very good idea to give blastall the name of a file to put all the output into so that you can find it easily later on.

As with any version of BLAST, the databases to be searched must be properly formatted. Many BLAST-searchable databases are available via your molbiol account.

To create your own BLAST-searchable databases for use with blastall, you need a file containing all of your sequences in FASTA format. This file can then be formatted for BLAST searching with formatdb, a program provided with blastall.

Command line options

Table 1. General options available with blastall.
Table 2. Information on the different formatting options available with blastall.
Table 3. Information on the numbers that can be used with the -Q flag to invoke alternative genetic codes when running blastx or tblastx searches.
Table 4. Further options for filtering

An example command line to run a blastp search of a peptide sequence stored in a file called mypep.tfa against the swissprot database, writing the report to mypep.blastp, would be:

blastall -p blastp -d swissprot -i mypep.tfa -o mypep.blastp

You can add further flags from the tables below to control both the type of search and the format of the output.
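
For scripted searches, the same command line can be assembled programmatically. The following is a minimal Python sketch, not part of the original page: the helper names build_blastall_cmd and run_search are hypothetical, and the extra flags (-e, -v, -b) are taken from Table 1 with illustrative values.

```python
import shutil
import subprocess


def build_blastall_cmd(program, database, query, output=None, extra=None):
    """Return the blastall argument list for one search (hypothetical helper)."""
    cmd = ["blastall", "-p", program, "-d", database, "-i", query]
    if output is not None:
        cmd += ["-o", output]
    # extra maps a flag from Table 1 (e.g. "-e") to its value.
    for flag, value in (extra or {}).items():
        cmd += [flag, str(value)]
    return cmd


def run_search(cmd):
    """Run the search only if blastall is installed; return True if it ran."""
    if shutil.which(cmd[0]) is None:
        return False
    subprocess.run(cmd, check=True)
    return True


# The example from the text, plus a stricter E-value and shorter hit lists.
cmd = build_blastall_cmd(
    "blastp", "swissprot", "mypep.tfa",
    output="mypep.blastp",
    extra={"-e": 0.001, "-v": 20, "-b": 20},
)
print(" ".join(cmd))
```

Passing the command as a list avoids shell quoting problems; calling run_search(cmd) would start the search on a machine where blastall and the database are available.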


Table 1. General options available with blastall.

The flags that blastall needs in order to run are -p, -d and -i, matching the required information listed at the top of this page.

| Flag | Information required | Type of information | Default | Other comments |
|---|---|---|---|---|
| -p | Program name | String (e.g. blastn, blastp, blastx, etc.) | | |
| -d | Database name | String (e.g. embl, swissprot, etc.) | nr | the default is not a viable choice on enterprise |
| -i | Query filename | Filename (and path if necessary) | stdin | |
| -o | BLAST report output filename | Filename (and path if necessary) | stdout (usually to screen) | if you want your results saved into a file, you must specify a filename |
| -e | Expectation value | Real number | 10.0 | |
| -m | Alignment view options | Integer between 0 and 9 | 0 | see Table 2 for more details |
| -F | Filter query sequence | T or F (for true or false) | T (i.e. filtering is on) | DUST is used with blastn, SEG with others |
| -G | Cost to open a gap | Integer | 0 | |
| -E | Cost to extend a gap | Integer | 0 | |
| -X | X dropoff value for gapped alignment (in bits) | Integer | 0 | |
| -I | Show GIs in deflines | T or F (for true or false) | F | GI = NI in EMBL. Tracks versions of an entry. |
| -q | Penalty for a nucleotide mismatch | Integer | -3 | for use with blastn only |
| -r | Reward for a nucleotide match | Integer | 1 | for use with blastn only |
| -v | Number of one-line descriptions | Integer | 500 | |
| -f | Threshold for extending hits | Integer | 0 | |
| -b | Number of alignments to show | Integer | 250 | |
| -g | Perform gapped alignment | T or F (for true or false) | T | not available with tblastx |
| -Q | Query genetic code to use | Integer | 1 (Universal) | for blastx and tblastx only |
| -D | DB genetic code | Integer | 1 (Universal) | tblastn and tblastx only |
| -a | Number of processors to use | Integer | 1 | |
| -O | Seq. Align file | Filename (and path if necessary) | optional, no default | |
| -J | Believe the query defline | T or F (for true or false) | F | |
| -M | Matrix | Matrix name (and path if necessary) | BLOSUM62 | only certain combinations of gap penalties and matrices are supported * |
| -W | Word size | Integer | 0 | default values are used if 0 is chosen. Defaults are 11 for nucleotides and 3 for proteins ** |
| -z | Effective length of the database | Integer | 0 | 0 means use the real size of the database (important for the statistics, so leave as default) |
| -K | Number of best hits from a region to keep | Integer | 100 | |
| -P | 0 for multiple hits 1-pass, 1 for single hit 1-pass, 2 for 2-pass | Integer | 0 | |
| -Y | Effective length of the search space | Real | 0.0 | |
| -L | Location on query sequence | String | | |
| -S | Query strands to search against database | 1 (top strand), 2 (bottom strand) or 3 (both) | 3 | for blast[nx] and tblastx |
| -T | Produce HTML output | T or F (for true or false) | F | see Table 2 for more details |
| -l | Restrict search of database to list of GIs | String | | |
| -U | Use lower case filtering of FASTA sequence | T or F (for true or false) | F | |
| -y | Dropoff (X) for blast extensions in bits | Real | 0.0 | |
| -Z | X dropoff value for final gapped alignment (in bits) | Real | 0.0 | |
| -R | PSI-TBLASTN checkpoint file | File In | | |
| -n | Megablast search | T or F (for true or false) | F | |
| -A | Multiple hits window size | Integer | 40 | zero for single hit algorithm |

* See NCBI notes for more information on matrices and gap penalty combinations in BLAST.

** blastn will not work with a word size of less than 7




Table 2. Information on the different formatting options available with blastall.

Note: These options are not available for tblastx

| Flag | Formatting of BLAST alignments |
|---|---|
| -m 0 | pairwise |
| -m 1 | master-slave showing identities |
| -m 2 | master-slave no identities |
| -m 3 | flat master-slave, show identities |
| -m 4 | flat master-slave, no identities |
| -m 5 | master-slave no identities and blunt ends |
| -m 6 | flat master-slave, no identities and blunt ends |
| -m 7 | XML BLAST output |
| -m 8 | tabular |
| -m 9 | tabular with comment lines |
| -T T | produce HTML BLAST output |
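
The tabular formats (-m 8 and -m 9) are the easiest to process in a script. The sketch below is a hedged Python example of reading such a report. It assumes the usual twelve tab-separated fields of blastall tabular output (query id, subject id, % identity, alignment length, mismatches, gap openings, q. start, q. end, s. start, s. end, e-value, bit score), which this page does not list. The Hit type, the parse_tabular function and the sample line are made up for illustration.

```python
import io
from typing import NamedTuple


class Hit(NamedTuple):
    query_id: str
    subject_id: str
    percent_identity: float
    alignment_length: int
    mismatches: int
    gap_openings: int
    q_start: int
    q_end: int
    s_start: int
    s_end: int
    evalue: float
    bit_score: float


def parse_tabular(handle):
    """Parse -m 8 / -m 9 output; '#' comment lines from -m 9 are skipped."""
    hits = []
    for line in handle:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        fields = line.split("\t")
        if len(fields) != 12:
            raise ValueError(f"expected 12 fields, got {len(fields)}: {line!r}")
        hits.append(Hit(
            fields[0], fields[1], float(fields[2]),
            *(int(x) for x in fields[3:10]),  # length through s_end
            float(fields[10]), float(fields[11]),
        ))
    return hits


# Invented sample report standing in for a real output file.
sample = io.StringIO(
    "# BLASTP example comment line\n"
    "mypep\tsp|P12345|EXAMPLE\t98.50\t200\t3\t0\t1\t200\t5\t204\t1e-110\t395\n"
)
hits = parse_tabular(sample)
for hit in hits:
    print(hit.subject_id, hit.percent_identity, hit.evalue)
```

With a real report, open the file named with -o and pass the file object to parse_tabular in place of the sample.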




Table 3. Information on the numbers that can be used with the -Q flag to invoke alternative genetic codes when running blastx or tblastx searches.

| Number | Genetic code |
|---|---|
| 1 | standard or universal |
| 2 | vertebrate mitochondrial |
| 3 | yeast mitochondrial |
| 4 | mold, protozoan, coelenterate mitochondrial and mycoplasma/spiroplasma |
| 5 | invertebrate mitochondrial |
| 6 | ciliate macronuclear |
| 9 | echinoderm mitochondrial |
| 10 | alternative ciliate macronuclear |
| 11 | eubacterial |
| 12 | alternative yeast |
| 13 | ascidian mitochondrial |
| 14 | flatworm mitochondrial |




Table 4. Further options for filtering.

Further information about the actions of SEG and DUST can be found here.


| Flag | Filtering effects | Example command | Default values |
|---|---|---|---|
| -F F | Turn off all filtering | | |
| -F "S <window> <locut> <hicut>" | Change the SEG options to the values given after the letter S | -F "S 10 1.0 1.5" | SEG filters with window=12, locut=2.2 and hicut=2.5 |
| -F "C" | Filter sequences using a coiled-coil filter | | |
| -F "C <window> <cutoff> <linker>" | Filter sequences using a coiled-coil filter and change the default settings for this filter | -F "C 28 40.0 32" | The default values are window=22, cutoff=40.0, linker=32. The example command would change the window length only. |
| -F "C;S" | Run both SEG and coiled-coil filtering together | | |
| -F "D" | Specify that you wish to filter using DUST | | |
| -F "m S" | Specify that masking with SEG should only be done during the building of initial words | | |
| -F "m D" | Specify that masking with DUST should only be done during the building of initial words | | |
| -F "m C" | Specify that masking with coiled-coil should only be done during the building of initial words | | |
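
The filter strings in Table 4 are easy to mistype, so a script can build them from named parameters. The following Python sketch is an illustration only: the helper names seg_filter and coil_filter are hypothetical, and their default values are those quoted in Table 4.

```python
def seg_filter(window=12, locut=2.2, hicut=2.5):
    """Return the -F value that runs SEG with the given settings."""
    return f"S {window} {locut} {hicut}"


def coil_filter(window=22, cutoff=40.0, linker=32):
    """Return the -F value that runs the coiled-coil filter."""
    return f"C {window} {cutoff} {linker}"


# Reproduce the two example commands from Table 4.
seg_args = ["-F", seg_filter(10, 1.0, 1.5)]
coil_args = ["-F", coil_filter(window=28)]

# Appended to an argument list, the filter string stays one argument,
# so no extra shell quoting is needed when the list is run without a shell.
cmd = ["blastall", "-p", "blastp", "-d", "swissprot", "-i", "mypep.tfa"] + seg_args
print(cmd)
```

When typing the command by hand in a shell, the quotes shown in Table 4 are still required so that the filter string reaches blastall as a single argument.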





This file last modified Tuesday May 03, 2005