ALI2GFF(1) General Commands Manual ALI2GFF(1)
NAME
ali2gff - Module to translate MUMmer output files into GFF-formatted output.
SYNOPSIS
ali2gff [-r] [-t <.|0|1|2>] [-x <name>] [-y <name>] [-H] [-f] [-h] <MUMmer_output_file>
OPTIONS
-h, --help
Show summary of options.
-r Interchange the order of sequences (sequence 1 on y-axis, sequence 2 on x-axis).
-t <.|0|1|2>
Put label 'frame' in gff output.
-x <name>
Specify the species name for species1 (default: "Seq1").
-y <name>
Specify the species name for species2 (default: "Seq2").
-i Ignore full sequence identities.
-f Write output to file.
SEE ALSO
blat2gff(1), gff2aplot(1), parseblast(1), sim2gff(1).
AUTHOR
ali2gff was written by Steffi Gebauer-Jung.
This manual page was written by Nelson A. de Oliveira <naoliv@gmail.com>, for the Debian project (but may be used by others).
Mon, 21 Mar 2005 21:44:15 -0300 ALI2GFF(1)
PARSEBLAST(1) General Commands Manual PARSEBLAST(1)
NAME
parseblast - Filtering High-scoring Segment Pairs (HSPs) from WU/NCBI BLAST.
SYNOPSIS
parseblast [options] <results.from.blast>
DESCRIPTION
This manual page documents briefly the parseblast command.
Several output options are available; the most important are those that write HSPs in GFF format (GFFv1, GFFv2 or APLOT).
Sequences can be included in the GFF records as a comment field. The script can also output the alignment for each HSP in
ALN, MSF or tabular format.
NOTE - If the first line of the BLAST output (the one naming the flavour that was run: BLASTN, BLASTP, BLASTX, TBLASTN or
TBLASTX) is missing, the program assumes that the file contains BLASTN HSP records, so make sure you feed parseblast a
well-formatted BLAST file. Some outputs, such as those from Web-Blast or Paracel-Blast, occasionally have no space between the HSP
coordinates and the sequence; such records are now processed correctly and the HSP is retrieved like a "standard" one.
WARNING - Frame fields in GFF records generated by parseblast contain the BLAST frame (".","1","2","3") instead of the GFF standard values
(".","0","1","2"). Because the frame for the reverse strand must be recalculated from the original sequence length, we suggest that users
post-process the GFF output of this script with a suitable filter that fixes the frames (if the program that will consume the GFF records
does not work with the original BLAST frames). The command-line option "--no-frame" sets all frames to "." (meaning that there is
no frame).
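The kind of post-processing filter suggested above might look like the following minimal Python sketch. It rests on several assumptions that the man page does not confirm: records are tab-separated with strand in column 7 and frame in column 8, forward-strand BLAST frames map to GFF frames by subtracting one, and reverse-strand frames are set to "." because the sequence length needed to recalculate them is not available here. The function names fix_frame and fix_gff_line are hypothetical.

```python
def fix_frame(frame: str, strand: str) -> str:
    """Map a BLAST-style frame to a GFF-style frame for one record.

    Assumption: on the forward strand, BLAST frames 1/2/3 correspond to
    GFF frames 0/1/2. Reverse-strand frames are returned as "." because
    recalculating them requires the original sequence length.
    """
    if frame == ".":
        return "."
    if strand == "+" and frame in ("1", "2", "3"):
        return str(int(frame) - 1)
    return "."


def fix_gff_line(line: str) -> str:
    """Rewrite the frame column of one GFF line; pass other lines through."""
    if line.startswith("#") or not line.strip():
        return line  # comment or blank line, leave untouched
    newline = "\n" if line.endswith("\n") else ""
    fields = line.rstrip("\n").split("\t")
    if len(fields) < 8:
        return line  # not a full GFF record, leave untouched
    # Column 7 (index 6) is the strand, column 8 (index 7) is the frame.
    fields[7] = fix_frame(fields[7], fields[6])
    return "\t".join(fields) + newline
```

Applied line by line to parseblast's GFF output, this leaves comments intact and only rewrites the frame column.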
OPTIONS
parseblast prints output in "HSP" format by default (see below). It takes input from <STDIN> or from one or more files and writes its
output to <STDOUT>, so the output can be redirected to a file or the program can be used as a filter within a pipe. The "-N", "-M", "-P",
"-G", "-F", "-A" and "-X" options (and their long-name versions) are mutually exclusive; their precedence follows the order just listed.
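As an illustration of how the precedence among these format options could be resolved, here is a small Python sketch. It assumes that "precedence" means the option listed earliest wins when several are given; the man page does not spell this out, and the helper name effective_format is hypothetical. The long option names are the ones documented in the sections below.

```python
from typing import Iterable, Optional

# Order as listed in the OPTIONS text. Treating "earlier wins" as the
# precedence rule is an assumption, not documented behaviour.
PRECEDENCE = ["-N", "-M", "-P", "-G", "-F", "-A", "-X"]

# Long-name versions of the same options, as documented in this man page.
LONG_TO_SHORT = {
    "--aln": "-N",
    "--msf": "-M",
    "--pairwise": "-P",
    "--gff": "-G",
    "--fullgff": "-F",
    "--aplot": "-A",
    "--expanded": "-X",
}


def effective_format(flags: Iterable[str]) -> Optional[str]:
    """Return the format option that would take effect, or None for the
    default "HSP" format when no format option is present."""
    seen = {LONG_TO_SHORT.get(flag, flag) for flag in flags}
    for option in PRECEDENCE:
        if option in seen:
            return option
    return None
```

Under that assumption, effective_format(["--gff", "-N"]) returns "-N", and a call with no format option returns None, which stands for the default HSP output.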
GFF OPTIONS:
-G, --gff
Prints output in GFFv1 format.
-F, --fullgff
Prints output in GFFv2 "alignment" format ("target").
-A, --aplot
Prints output in pseudo-GFF APLOT "alignment" format.
-S, --subject
Projecting GFF output by SUBJECT (default by QUERY).
-Q, --sequence
Append query and subject sequences to GFF record.
-b, --bit-score
Set <score> field to Bits (default Alignment Score).
-i, --identity-score
Set <score> field to Identities (default Alignment).
-s, --full-scores
Include all scores for each HSP in each GFF record.
-u, --no-frame
Set all frames to "." (GFF for not available frames).
-t, --compact-tags
Target coords+strand+frame in short form (NO GFFv2!).
ALIGNMENT OPTIONS:
-P, --pairwise
Prints pairwise alignment for each HSP in TBL format.
-M, --msf
Prints pairwise alignment for each HSP in MSF format.
-N, --aln
Prints pairwise alignment for each HSP in ALN format.
-W, --show-coords
Adds start/end positions to alignment output.
GENERAL OPTIONS:
-X, --expanded
Expanded output (producing multiline output records).
-c, --comments
Include parameters from blast program as comments.
-n, --no-comments
Do not print "#" lines (raw output without comments).
-v, --verbose
Warnings sent to <STDERR>.
--version
Prints program version and exits.
-h, --help
Shows this help and exits.
OUTPUT FORMATS
"S_" stands for "Subject_Sequence" and "Q_" for "Query_Sequence". The <Program> name is taken from the input BLAST file. <Strands> are
calculated from the <start> and <end> positions in the original BLAST file. <Frame> is taken from the BLAST file if present; otherwise it
is set to ".". <SCORE> is set to the Alignment Score by default; you can change it with "-b" and "-i".
If the "-S" or "--subject" option is given, QUERY fields refer to SUBJECT and SUBJECT fields refer to QUERY (this is only
available for GFF output records).
Dots ("...") mean that the record description continues on the following line; parseblast prints each such record as a single
line.
[HSP] <- (This is the DEFAULT OUTPUT FORMAT)
<Program> <DataBase> : ...
... <IdentityMatches> <Min_Length> <IdentityScore> ...
... <AlignmentScore> <BitScore> <E_Value> <P_Sum> : ...
... <Q_Name> <Q_Start> <Q_End> <Q_Strand> <Q_Frame> : ...
... <S_Name> <S_Start> <S_End> <S_Strand> <S_Frame> : <S_FullDescription>
[GFF]
<Q_Name> <Program> hsp <Q_Start> <Q_End> <SCORE> <Q_Strand> <Q_Frame> <S_Name>
[FULL GFF] <- (GFF showing alignment data)
<Q_Name> <Program> hsp <Q_Start> <Q_End> <SCORE> <Q_Strand> <Q_Frame> ...
... Target "<S_Name>" <S_Start> <S_End> ...
... E_value <E_Value> Strand <S_Strand> Frame <S_Frame>
[APLOT] <- (GFF format enhanced for APLOT program)
<Q_Name>:<S_Name> <Program> hsp <Q_Start>:<S_Start> <Q_End>:<S_End> <SCORE> ...
... <Q_Strand>:<S_Strand> <Q_Frame>:<S_Frame> <BitScore>:<HSP_Number> ...
... # E_value <E_Value>
[EXPANDED]
MATCH(<HSP_Number>): <Q_Name> x <S_Name>
SCORE(<HSP_Number>): <AlignmentScore>
BITSC(<HSP_Number>): <BitScore>
EXPEC(<HSP_Number>): <E_Value> Psum(<P_Sum>)
IDENT(<HSP_Number>): <IdentityMatches>/<Min_Length> : <IdentityScore> %
T_GAP(<HSP_Number>): <TotalGaps(BothSeqs)>
FRAME(<HSP_Number>): <Q_Frame>/<S_Frame>
STRND(<HSP_Number>): <Q_Strand>/<S_Strand>
MXLEN(<HSP_Number>): <Max_Length>
QUERY(<HSP_Number>): length <Q_Length> : gaps <Q_TotalGaps> : ...
... <Q_Start> <Q_End> : <Q_Strand> : <Q_Frame> : <Q_FullSequence>
SBJCT(<HSP_Number>): length <S_Length> : gaps <S_TotalGaps> : ...
... <S_Start> <S_End> : <S_Strand> : <S_Frame> : <S_FullSequence>
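For downstream scripting, the nine-field [GFF] layout shown above can be read with a few lines of Python. This is a sketch under stated assumptions: fields are separated by whitespace (tabs or spaces), sequence names contain no whitespace, and the score is numeric. The HspGff type and the parse_gff_record function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class HspGff(NamedTuple):
    """One record in the [GFF] layout: query-projected HSP coordinates."""
    q_name: str
    program: str
    feature: str   # the literal "hsp" in the layout above
    q_start: int
    q_end: int
    score: float
    q_strand: str
    q_frame: str   # kept as text because it may be "."
    s_name: str


def parse_gff_record(line: str) -> Optional[HspGff]:
    """Parse one [GFF] line; return None for blank lines and "#" comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    parts = line.split()  # assumption: names contain no whitespace
    if len(parts) != 9:
        raise ValueError(f"expected 9 fields, got {len(parts)}: {line!r}")
    return HspGff(
        q_name=parts[0],
        program=parts[1],
        feature=parts[2],
        q_start=int(parts[3]),
        q_end=int(parts[4]),
        score=float(parts[5]),
        q_strand=parts[6],
        q_frame=parts[7],
        s_name=parts[8],
    )
```

The frame is deliberately left as a string, since parseblast may emit "." or a BLAST-style frame value in that column.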
SEE ALSO
ali2gff(1), blat2gff(1), gff2aplot(1), sim2gff(1).
AUTHOR
parseblast was written by Josep F. Abril <abril@imim.es>.
This manual page was written by Nelson A. de Oliveira <naoliv@gmail.com>, for the Debian project (but may be used by others).
Mon, 21 Mar 2005 21:44:15 -0300 PARSEBLAST(1)