hhmake(1) [debian man page]

HHMAKE(1)							   User Commands							 HHMAKE(1)

NAME

       hhmake - build an HMM from an input alignment or convert between HMMER format and HHsearch format

SYNOPSIS

       hhmake -i file [options]

DESCRIPTION

       HHmake version 2.0.15 (June 2012)

       Build an HMM from an input alignment in A2M, A3M, or FASTA format, or convert between HMMER format (.hmm) and HHsearch format (.hhm).

       Remmert M, Biegert A, Hauser A, and Soding J.  HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment.
       Nat. Methods 9:173-175 (2011).

       (C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser

       -i <file>
	      query alignment (A2M, A3M, or FASTA), or query HMM

   Output options:
       -o <file>
	      HMM file to be written to  (default=<infile.hhm>)

       -a <file>
	      HMM file to be appended to

       -v <int>
	      verbose mode: 0: no screen output  1: only warnings  2: verbose

       -seq <int>
	      max. number of query/template sequences displayed (def=10). Beware of overflows: all these sequences are stored in memory.

       -cons  make the consensus sequence the master sequence of the query MSA

       -name <name>
	      use this name for HMM (default: use name of first sequence)

   Filter query multiple sequence alignment:

       -id    [0,100]  maximum pairwise sequence identity (%) (def=90)

       -diff [0,inf[
	      filter MSA by selecting most diverse set of sequences, keeping at least this many seqs in each MSA block of length 50 (def=100)

       -cov   [0,100]  minimum coverage with query (%) (def=0)

       -qid   [0,100]  minimum sequence identity with query (%) (def=0)

       -qsc   [0,100]  minimum score per column with query  (def=-20.0)

       -neff [1,inf]
	      target diversity of alignment (default=off)
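
       The quantities behind -cov and -qid can be illustrated with a small sketch. The definitions below are assumptions chosen for
       illustration (identity counted over columns where both rows have a residue, coverage counted over query residue columns); hhmake
       may compute them differently, and the function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent identical residues over columns where both aligned rows have a residue."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    pairs = [(x.upper(), y.upper()) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    return 100.0 * sum(x == y for x, y in pairs) / len(pairs)


def percent_coverage(query, seq):
    """Percent of query residue columns in which seq also has a residue."""
    if len(query) != len(seq):
        raise ValueError("aligned rows must have equal length")
    cols = [s for q, s in zip(query, seq) if q != "-"]
    if not cols:
        return 0.0
    return 100.0 * sum(s != "-" for s in cols) / len(cols)


def passes_query_filters(query, seq, cov=0.0, qid=0.0):
    """Keep seq only if it meets both minimum thresholds (illustrative -cov / -qid)."""
    return percent_coverage(query, seq) >= cov and percent_identity(query, seq) >= qid
```

       With the defaults (cov=0, qid=0) every sequence passes, which matches the documented defaults of the two options.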

   Input alignment format:
       -M a2m use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)

       -M first
	      use FASTA: columns with residue in 1st sequence are match states

       -M [0,100]
	      use FASTA: columns with fewer than X% gaps are match states
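
       As a rough illustration of the two FASTA rules above, the following sketch marks which alignment columns would become match
       states. It assumes '-' is the only gap character and that all rows have equal length; the function name and its string-valued
       mode argument are hypothetical and only mirror the -M values.

```python
def match_columns(rows, mode="first"):
    """Return one boolean per column: True if the column is treated as a match state.

    mode="first":  columns where the first sequence has a residue.
    mode="<X>":    columns with fewer than X percent gaps (X given as a number string).
    """
    if not rows or any(len(r) != len(rows[0]) for r in rows):
        raise ValueError("need a non-empty alignment with rows of equal length")
    ncols = len(rows[0])
    if mode == "first":
        return [rows[0][j] != "-" for j in range(ncols)]
    threshold = float(mode)
    flags = []
    for j in range(ncols):
        gaps = sum(r[j] == "-" for r in rows)
        flags.append(100.0 * gaps / len(rows) < threshold)
    return flags
```

       For example, match_columns(rows, "50") corresponds to the rule described for -M 50.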

   Pseudocount (pc) options:
       -pcm   0-2      position dependence of pc admixture 'tau' (pc mode, default=0)

	      0: no pseudocounts: tau = 0

	      1: constant: tau = a

	      2: diversity-dependent: tau = a/(1 + ((Neff[i]-1)/b)^c) (Neff[i]: number of effective seqs in local MSA around column i)

	      3: constant diversity pseudocounts

       -pca   [0,1]    overall pseudocount admixture (def=1.0)

       -pcb   [1,inf[  Neff threshold value for -pcm 2 (def=1.5)

       -pcc   [0,3]    extinction exponent c for -pcm 2 (def=1.0)

       -pre_pca [0,1]
	      PREFILTER pseudocount admixture (def=0.8)

       -pre_pcb [1,inf[ PREFILTER threshold for Neff (def=1.8)
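
       The admixture formula given for -pcm can be evaluated directly. The sketch below assumes that a, b and c correspond to -pca,
       -pcb and -pcc with their documented defaults; mode 3 is left out because the text above gives no formula for it. The function
       name is hypothetical.

```python
def pc_admixture(mode, neff, a=1.0, b=1.5, c=1.0):
    """Pseudocount admixture tau for pc modes 0, 1 and 2 as written in the option text."""
    if neff < 1.0:
        raise ValueError("Neff is expected to be at least 1")
    if mode == 0:
        return 0.0  # no pseudocounts
    if mode == 1:
        return a  # constant admixture
    if mode == 2:
        # diversity-dependent: tau shrinks as the local alignment gets more diverse
        return a / (1.0 + ((neff - 1.0) / b) ** c)
    raise ValueError("only modes 0, 1 and 2 are covered by this sketch")
```

       With the defaults, a single-sequence column (Neff = 1) gets tau = a, and tau falls to a/2 once Neff - 1 reaches b.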

   Context-specific pseudo-counts:
       -nocontxt
	      use substitution-matrix instead of context-specific pseudocounts

       -contxt <file> context file for computing context-specific pseudocounts (default=debian/tmp/usr/share/hhsuite/data/context_data.lib)

       -cslib <file>
	      column state file for fast database prefiltering (default=debian/tmp/usr/share/hhsuite/data/cs219.lib)

       Example: hhmake -i test.a3m

hhmake 2.0.15							     June 2012								 HHMAKE(1)


HHALIGN(1)							   User Commands							HHALIGN(1)

NAME

       hhalign - align a query alignment/HMM to a template alignment/HMM

SYNOPSIS

       hhalign -i query [-t template] [options]

DESCRIPTION

       HHalign version 2.0.15 (June 2012)

       Align a query alignment/HMM to a template alignment/HMM by HMM-HMM alignment.  If only one alignment/HMM is given, it is compared
       to itself and the best off-diagonal alignment plus all further non-overlapping alignments above the significance threshold are
       shown.

       Remmert M, Biegert A, Hauser A, and Soding J.  HHblits: Lightning-fast iterative protein sequence searching by HMM-HMM alignment.
       Nat. Methods 9:173-175 (2011).

       (C) Johannes Soeding, Michael Remmert, Andreas Biegert, Andreas Hauser

       -i <file>
	      input query alignment  (fasta/a2m/a3m) or HMM file (.hhm)

       -t <file>
	      input template alignment (fasta/a2m/a3m) or HMM file (.hhm)

       -png <file>
	      write dotplot into PNG-file (default=none)

   Output options:
       -o <file>
	      write output alignment to file

       -ofas <file>
	      write alignments in FASTA, A2M (-oa2m) or A3M (-oa3m) format

       -Oa3m <file>
	      write query alignment in a3m format to file (default=none)

       -Aa3m <file>
	      append query alignment in a3m format to file (default=none)

       -atab <file>
	      write alignment as a table (with posteriors) to file (default=none)

       -index <file> use given alignment to calculate Viterbi score (default=none)

       -v <int>
	      verbose mode: 0: no screen output  1: only warnings  2: verbose

       -seq   [1,inf[ max. number of query/template sequences displayed  (def=1)

       -nocons
	      don't show consensus sequence in alignments (default=show)

       -nopred
	      don't show predicted 2ndary structure in alignments (default=show)

       -nodssp
	      don't show DSSP 2ndary structure in alignments (default=show)

       -ssconf
	      show confidences for predicted 2ndary structure in alignments

       -aliw <int>
	      number of columns per line in alignment list (def=80)

       -P <float>
	      for self-comparison: max p-value of alignments (def=0.001)

       -p <float>
	      minimum probability in summary and alignment list (def=0)

       -E <float>
	      maximum E-value in summary and alignment list (def=1E+06)

       -Z <int>
	      maximum number of lines in summary hit list (def=100)

       -z <int>
	      minimum number of lines in summary hit list (def=1)

       -B <int>
	      maximum number of alignments in alignment list (def=100)

       -b <int>
	      minimum number of alignments in alignment list (def=1)

       -rank <int>
	      specify rank of alignment to write with -Oa3m or -Aa3m option (default=1)

   Dotplot options:
       -dthr <float> probability/score threshold for dotplot (default=0.50)

       -dsca <int>
	      if value <= 20: size of dot plot unit box in pixels; if value > 20: maximum dot plot size in pixels (default=600)

       -dwin <int>
	      average score over window [i-W..i+W] (for -norealign) (def=10)

       -dali <list>
	      show alignments with indices in <list> in dot plot <list> = <index1> ... <indexN>  or  <list> = all
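
       The window [i-W..i+W] mentioned for -dwin can be pictured as a simple moving average. The sketch below works on a one-dimensional
       list of scores and truncates the window at the ends; both choices are assumptions made for illustration, not a description of
       how hhalign averages dot plot scores. The function name is hypothetical.

```python
def window_average(scores, w=10):
    """Average each score over the positions i-w .. i+w, truncating at the list ends."""
    n = len(scores)
    averaged = []
    for i in range(n):
        lo = max(0, i - w)
        hi = min(n, i + w + 1)  # exclusive upper bound
        averaged.append(sum(scores[lo:hi]) / (hi - lo))
    return averaged
```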

   Filter input alignment (options can be combined):
       -id    [0,100] maximum pairwise sequence identity (%) (def=90)

       -diff [0,inf[
	      filter most diverse set of sequences, keeping at least this many sequences in each block of >50 columns (def=100)

       -cov   [0,100] minimum coverage with query (%) (def=0)

       -qid   [0,100] minimum sequence identity with query (%) (def=0)

       -qsc   [0,100] minimum score per column with query  (def=-20.0)

   Input alignment format:
       -M a2m use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)

       -M first
	      use FASTA: columns with residue in 1st sequence are match states

       -M [0,100]
	      use FASTA: columns with fewer than X% gaps are match states

   HMM-HMM alignment options:
       -glob/-loc
	      global or local alignment mode (def=local)

       -alt <int>
	      show up to this number of alternative alignments (def=1)

       -realign
	      realign displayed hits with max. accuracy (MAC) algorithm

       -norealign
	      do NOT realign displayed hits with MAC algorithm (def=realign)

       -mact [0,1[
	      posterior probability threshold for MAC alignment (def=0.350). A threshold value of 0.0 yields global alignments.

       -sto <int>
	      use global stochastic sampling algorithm to sample this many alignments

       -excl <range> exclude query positions from the alignment, e.g. '1-33,97-168'
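
       The range syntax shown for -excl ('1-33,97-168') is a comma-separated list of start-end pairs. A minimal parsing sketch follows;
       it assumes positions are 1-based and that both ends are inclusive, which the option text does not state explicitly. The function
       names are hypothetical.

```python
def parse_excl(spec):
    """Parse a string such as '1-33,97-168' into a list of (start, end) tuples."""
    ranges = []
    for part in spec.split(","):
        start_text, end_text = part.strip().split("-")
        start, end = int(start_text), int(end_text)
        if start < 1 or end < start:
            raise ValueError("invalid range: " + part)
        ranges.append((start, end))
    return ranges


def is_excluded(position, ranges):
    """True if the (assumed 1-based) position falls inside any inclusive range."""
    return any(start <= position <= end for start, end in ranges)
```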

       -shift [-1,1] score offset (def=-0.030)

       -corr [0,1]
	      weight of term for pair correlations (def=0.10)

       -ssm   0-4     secondary structure (ss) scoring mode [default=2]

	      0: no ss scoring

	      1: ss scoring after alignment

	      2: ss scoring during alignment

       -ssw   [0,1]   weight of ss score  (def=0.11)

       -def   read default options from ./.hhdefaults or <home>/.hhdefaults

       Example: hhalign -i T0187.a3m -t d1hz4a_.hhm -png T0187pdb.png

   Output options:
       -o <file>
	      write output alignment to file

       -ofas <file>
	      write alignments in FASTA, A2M (-oa2m) or A3M (-oa3m) format

       -Oa3m <file>
	      write query alignment in a3m format to file (default=none)

       -Aa3m <file>
	      append query alignment in a3m format to file (default=none)

       -atab <file>
	      write alignment as a table (with posteriors) to file (default=none)

       -v <int>
	      verbose mode: 0: no screen output  1: only warnings  2: verbose

       -seq   [1,inf[  max. number of query/template sequences displayed  (def=1)

       -nocons
	      don't show consensus sequence in alignments (default=show)

       -nopred
	      don't show predicted 2ndary structure in alignments (default=show)

       -nodssp
	      don't show DSSP 2ndary structure in alignments (default=show)

       -ssconf
	      show confidences for predicted 2ndary structure in alignments

       -aliw <int>
	      number of columns per line in alignment list (def=80)

       -P <float>
	      for self-comparison: max p-value of alignments (def=0.001)

       -p <float>
	      minimum probability in summary and alignment list (def=0)

       -E <float>
	      maximum E-value in summary and alignment list (def=1E+06)

       -Z <int>
	      maximum number of lines in summary hit list (def=100)

       -z <int>
	      minimum number of lines in summary hit list (def=1)

       -B <int>
	      maximum number of alignments in alignment list (def=100)

       -b <int>
	      minimum number of alignments in alignment list (def=1)

       -rank <int>
	      specify rank of alignment to write with -Oa3m or -Aa3m option (default=1)

       -tc <file>
	      write a TCoffee library file for the pairwise comparison

       -tct [0,100]
	      min. probability of residue pairs for TCoffee (def=5%)

   Dotplot options:
       -dwin int
	      average score in dotplot over window [i-W..i+W] (def=10)

       -dthr float
	      score threshold for dotplot (default=0.50)

       -dsca int
	      size of dot plot box in pixels  (default=600)

       -dali <list>
	      show alignments with indices in <list> in dot plot <list> = <index1> ... <indexN>  or  <list> = all

       -dmap <file>
	      print list of coordinates in png plot

   Options to filter input alignment (options can be combined):
       -id    [0,100]  maximum pairwise sequence identity (%) (def=90)

       -diff [0,inf[
	      filter most diverse set of sequences, keeping at least this many sequences in each block of >50 columns (def=100)

       -cov   [0,100]  minimum coverage with query (%) (def=0)

       -qid   [0,100]  minimum sequence identity with query (%) (def=0)

       -qsc   [0,100]  minimum score per column with query  (def=-20.0)

   HMM-building options:
       -M a2m use A2M/A3M (default): upper case = Match; lower case = Insert; '-' = Delete; '.' = gaps aligned to inserts (may be omitted)

       -M first
	      use FASTA: columns with residue in 1st sequence are match states

       -M [0,100]
	      use FASTA: columns with fewer than X% gaps are match states

       -tags  do NOT neutralize His-, C-myc-, FLAG-tags, and trypsin recognition sequence to background distribution

   Pseudocount (pc) options:
       -pcm   0-2      position dependence of pc admixture 'tau' (pc mode, default=2)

	      0: no pseudocounts: tau = 0

	      1: constant: tau = a

	      2: diversity-dependent: tau = a/(1 + ((Neff[i]-1)/b)^c) (Neff[i]: number of effective seqs in local MSA around column i)

	      3: constant diversity pseudocounts

       -pca   [0,1]    overall pseudocount admixture (def=1.0)

       -pcb   [1,inf[  Neff threshold value for -pcm 2 (def=1.5)

       -pcc   [0,3]    extinction exponent c for -pcm 2 (def=1.0)

       -pre_pca [0,1]
	      PREFILTER pseudocount admixture (def=0.8)

       -pre_pcb [1,inf[ PREFILTER threshold for Neff (def=1.8)

   Context-specific pseudo-counts:
       -nocontxt
	      use substitution-matrix instead of context-specific pseudocounts

       -contxt <file> context file for computing context-specific pseudocounts (default=debian/tmp/usr/share/hhsuite/data/context_data.lib)

       -cslib <file>
	      column state file for fast database prefiltering (default=debian/tmp/usr/share/hhsuite/data/cs219.lib)

   Gap cost options:
       -gapb [0,inf[
	      Transition pseudocount admixture (def=1.00)

       -gapd [0,inf[
	      Transition pseudocount admixture for open gap (default=0.15)

       -gape [0,1.5]
	      Transition pseudocount admixture for extend gap (def=1.00)

       -gapf ]0,inf]
	      factor to increase/reduce the gap open penalty for deletes (def=0.60)

       -gapg ]0,inf]
	      factor to increase/reduce the gap open penalty for inserts (def=0.60)

       -gaph ]0,inf]
	      factor to increase/reduce the gap extend penalty for deletes (def=0.60)

       -gapi ]0,inf]
	      factor to increase/reduce the gap extend penalty for inserts (def=0.60)

       -egq   [0,inf[  penalty (bits) for end gaps aligned to query residues (def=0.00)

       -egt   [0,inf[  penalty (bits) for end gaps aligned to template residues (def=0.00)

   Alignment options:
       -glob/-loc
	      global or local alignment mode (def=global)

       -mac   use Maximum Accuracy (MAC) alignment instead of Viterbi

       -mact [0,1]
	      posterior prob threshold for MAC alignment (def=0.350)

       -sto <int>
	      use global stochastic sampling algorithm to sample this many alignments

       -sc    <int>    amino acid score (tja: template HMM at column j) (def=1)

	      0 = log2 Sum(tja*qia/pa)   (pa: aa background frequencies)

	      1 = log2 Sum(tja*qia/pqa)  (pqa = 1/2*(pa+ta))

	      2 = log2 Sum(tja*qia/ta)   (ta: av. aa freqs in template)

	      3 = log2 Sum(tja*qia/qa)   (qa: av. aa freqs in query)
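
       The four -sc variants differ only in the frequencies used as the denominator. The sketch below evaluates them for a single pair
       of columns, taking qia to be the query HMM at column i by analogy with tja (an assumption, since the text only defines tja). All
       inputs are plain lists of per-amino-acid probabilities, and the function name is hypothetical.

```python
import math


def column_score(mode, t_col, q_col, background, t_avg=None, q_avg=None):
    """log2 of Sum_a t[a]*q[a]/d[a], with the denominator d chosen by the -sc mode."""
    if mode == 0:
        denom = list(background)  # pa
    elif mode == 1:
        denom = [0.5 * (p + ta) for p, ta in zip(background, t_avg)]  # pqa
    elif mode == 2:
        denom = list(t_avg)  # ta
    elif mode == 3:
        denom = list(q_avg)  # qa
    else:
        raise ValueError("mode must be 0, 1, 2 or 3")
    total = sum(t * q / d for t, q, d in zip(t_col, q_col, denom))
    return math.log2(total)
```

       Two columns that both equal the background give a score of about 0 bits in mode 0, while two columns concentrated on the same
       amino acid score log2(1/pa) for that amino acid.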

       -corr [0,1]
	      weight of term for pair correlations (def=0.10)

       -shift [-1,1]
	      score offset (def=-0.030)

       -r     repeat identification: multiple hits not treated as independent

       -ssm   0-2      0:no ss scoring [default=2]

	      1:ss scoring after alignment 2:ss scoring during alignment

       -ssw   [0,1]    weight of ss score compared to column score (def=0.11)

       -ssa   [0,1]    ss confusion matrix = (1-ssa)*I + ssa*psipred-confusion-matrix (def=1.00)
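
       The -ssa expression is a linear blend between the identity matrix and a confusion matrix. The sketch below applies that blend
       to an arbitrary square matrix; the 3x3 matrix in the usage note is made-up illustration data, not the PSIPRED confusion matrix
       used by hhalign, and the function name is hypothetical.

```python
def blend_confusion(ssa, confusion):
    """Return (1-ssa)*I + ssa*confusion for a square matrix given as a list of lists."""
    if not 0.0 <= ssa <= 1.0:
        raise ValueError("ssa must lie in [0,1]")
    n = len(confusion)
    return [
        [(1.0 - ssa) * (1.0 if i == j else 0.0) + ssa * confusion[i][j] for j in range(n)]
        for i in range(n)
    ]
```

       For instance, with toy = [[0.8, 0.1, 0.1], [0.2, 0.6, 0.2], [0.1, 0.1, 0.8]], blend_confusion(0.0, toy) is the identity matrix
       and blend_confusion(1.0, toy) returns the toy matrix unchanged, consistent with the documented default of 1.00.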

       -calm 0-3
	      empirical score calibration of 0:query 1:template 2:both (def=off)

       Default options can be specified in './.hhdefaults' or '~/.hhdefaults'

hhalign 2.0.15							     June 2012								HHALIGN(1)
