Convert a DNA sequence into Amino Acid Post: 302957292


LEARN ABOUT DEBIAN

phyml

PhyML(1)							   User Commands							  PhyML(1)

NAME

       phyml - Phylogenetic estimation using Maximum Likelihood

SYNOPSIS
       phyml [command args]

	      All the options below are optional (except '-i' if you want to use the command-line interface).

       Command options:

       -i (or --input) seq_file_name

	      seq_file_name is the name of the nucleotide or amino-acid sequence file in PHYLIP format.
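For orientation, here is a minimal sketch of what a small input file might look like. It assumes the common PHYLIP layout (a header line giving the number of sequences and the number of sites, then one name and one sequence per row). The file name, taxon names and sequences are invented for illustration.

```shell
# Toy 3-taxon, 12-site nucleotide alignment; names and sequences are invented.
aln="${TMPDIR:-/tmp}/toy_alignment.phy"
cat > "$aln" <<'EOF'
3 12
taxon_A   ATGGCCATTGTA
taxon_B   ATGGCCATCGTA
taxon_C   ATGACCATTGTA
EOF

# Sanity check: the header should agree with the rows that follow it.
read -r ntax nsites < "$aln"
nrows=$(( $(wc -l < "$aln") - 1 ))
echo "header says $ntax taxa and $nsites sites; found $nrows sequence rows"
```

A file like this could then be passed to PhyML with '-i'.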

       -d (or --datatype) data_type

	      data_type is 'nt' for nucleotide (default), 'aa' for amino-acid sequences, or 'generic' (use the NEXUS file format and the
	      'symbols' parameter here).

       -q (or --sequential)

	      Changes interleaved format (default) to sequential format.

       -n (or --multiple) nb_data_sets

	      nb_data_sets is an integer corresponding to the number of data sets to analyse.

       -p (or --pars)

	      Use a minimum parsimony starting tree. This option is taken into account when the '-u' option is absent and when tree
	      topology modifications are to be done.

       -b (or --bootstrap) int

	      int > 0: int is the number of bootstrap replicates.

	      int = 0: neither approximate likelihood ratio test nor bootstrap values are computed.

	      int = -1: approximate likelihood ratio test returning aLRT statistics.

	      int = -2: approximate likelihood ratio test returning Chi2-based parametric branch supports.

	      int = -4: (default) SH-like branch supports alone.
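The values of '-b' listed above can be summarised in a small lookup. The following is only a sketch: describe_bootstrap is a hypothetical helper, not part of PhyML, and it treats any value not listed above as unsupported.

```shell
# Hypothetical helper: describe what a given '-b' value asks PhyML to do,
# following the list above. Values not listed there are rejected.
describe_bootstrap() {
  case "$1" in
    -1) echo "aLRT statistics" ;;
    -2) echo "Chi2-based parametric branch supports" ;;
    -4) echo "SH-like branch supports" ;;
    0)  echo "no branch support computed" ;;
    ''|*[!0-9]*) echo "unsupported value"; return 1 ;;
    *)  echo "$1 bootstrap replicates" ;;
  esac
}

describe_bootstrap 100
describe_bootstrap -4
```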

       -m (or --model) model

	      model : substitution model name.

	      - Nucleotide-based models : HKY85 (default) | JC69 | K80 | F81 | F84 | TN93 | GTR | custom. For the custom option, a string of
	      six digits identifies the model. For instance, 000000 corresponds to F81 (or JC69, provided the distribution of nucleotide
	      frequencies is uniform) and 012345 corresponds to GTR. This option can be used to encode any model that is nested within GTR.
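One way to read a six-digit custom string is that each digit labels the rate class of one of the six nucleotide pairs, and pairs sharing a digit share a rate. The sketch below illustrates that reading. Both function names are hypothetical, and the pair order AC, AG, AT, CG, CT, GT is an assumption that should be checked against the PhyML documentation.

```shell
# Hypothetical helper: count how many distinct rate classes a code defines.
count_rate_classes() {
  printf '%s\n' "$1" | fold -w 1 | sort -u | wc -l | tr -d ' '
}

# Hypothetical helper: list the rate class assigned to each nucleotide pair.
# The pair order used here is an assumption, not taken from this manual page.
explain_custom_model() {
  code=$1
  case "$code" in
    [0-9][0-9][0-9][0-9][0-9][0-9]) ;;
    *) echo "expected exactly six digits" >&2; return 1 ;;
  esac
  for pair in AC AG AT CG CT GT; do
    digit=${code%"${code#?}"}   # first remaining digit
    code=${code#?}              # drop it for the next pair
    echo "$pair -> rate class $digit"
  done
}

count_rate_classes 000000   # single shared rate (F81 / JC69 above)
count_rate_classes 012345   # six separate rates (GTR above)
explain_custom_model 012345
```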

	      - Amino-acid based models : LG (default) | WAG | JTT | MtREV | Dayhoff | DCMut | RtREV | CpREV | VT | Blosum62 | MtMam | MtArt |
	      HIVw | HIVb | custom

       --aa_rate_file filename

	      filename is the name of the file that provides the amino acid substitution rate matrix in PAML format.  It is compulsory to use this
	      option when analysing amino acid sequences with the `custom' model.

       -f e, m, or fA,fC,fG,fT

	      e : the character frequencies are determined as follows :

	      - Nucleotide sequences: (Empirical) the equilibrium base frequencies are estimated by counting the occurrence of the different
	      bases in the alignment.

	      - Amino-acid sequences: (Empirical) the equilibrium amino-acid frequencies are estimated by counting the occurrence of the
	      different amino-acids in the alignment.

	      m : the character frequencies are determined as follows :

	      - Nucleotide sequences: (ML) the equilibrium base frequencies are estimated using maximum likelihood.

	      - Amino-acid sequences: (Model) the equilibrium amino-acid frequencies are estimated using the frequencies defined by the
	      substitution model.

	      "fA,fC,fG,fT" : only valid for nucleotide-based models. fA, fC, fG and fT are floating numbers that correspond to the frequencies of
	      A, C, G and T respectively (WARNING: do not use any blank space between your values of nucleotide frequencies, only commas!)
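Because a stray blank in the frequency list is easy to miss, a quick pre-check before calling PhyML may help. This is a sketch only: check_freqs is a hypothetical helper, and the requirement that the four values sum to 1 is an assumption made here for sanity checking, not something this page states.

```shell
# Hypothetical helper: succeed only for four comma-separated plain decimals
# with no blank space. The sum-to-1 test is an assumption of this sketch.
check_freqs() {
  # Reject blank spaces, as the WARNING above requires.
  case "$1" in
    *" "*) return 1 ;;
  esac
  printf '%s\n' "$1" | awk -F, '
    NF != 4 { exit 1 }
    {
      sum = 0
      for (i = 1; i <= 4; i++) {
        if ($i !~ /^[0-9]*[.]?[0-9]+$/) exit 1
        sum += $i
      }
      diff = sum - 1
      if (diff < 0) diff = -diff
      exit (diff > 0.000001)
    }'
}

if check_freqs 0.25,0.25,0.25,0.25; then
  echo "frequencies look usable with -f"
fi
```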

       -t (or --ts/tv) ts/tv_ratio

	      ts/tv_ratio : transition/transversion ratio. DNA sequences only. Can be a fixed positive value (e.g. 4.0) or e to get the
	      maximum likelihood estimate.

       -v (or --pinv) prop_invar

	      prop_invar: proportion of invariable sites.  Can be a fixed value in the [0,1] range or e to get the maximum likelihood estimate.

       -c (or --nclasses) nb_subst_cat

	      nb_subst_cat : number of relative substitution rate categories. Default: nb_subst_cat=4.	Must be a positive integer.

       -a (or --alpha) gamma

	      gamma : value of the gamma distribution shape parameter. Can be a fixed positive value or e to get the maximum likelihood
	      estimate.

       -s (or --search) move

	      Tree topology search operation option. Can be either NNI (default, fast) or SPR (a bit slower than NNI) or BEST (best of NNI and
	      SPR search).

       -u (or --inputtree) user_tree_file

	      user_tree_file : starting tree filename. The tree must be in Newick format.

       -o params

	      This option focuses on specific parameter optimisation.

	      params=tlr : tree topology (t), branch length (l) and rate parameters (r) are optimised.

	      params=tl  : tree topology and branch length are optimised.

	      params=lr  : branch length and rate parameters are optimised.

	      params=l	 : branch lengths are optimised.

	      params=r	 : rate parameters are optimised.

	      params=n	 : no parameter is optimised.

       --rand_start

	      This option sets the initial tree to random. It is only valid if SPR searches are to be performed.

       --n_rand_starts num

	      num is the number of initial random trees to be used.  It is only valid if SPR searches are to be performed.

       --r_seed num

	      num is the seed used to initiate the random number generator.  Must be an integer.

       --print_site_lnl

	      Print the likelihood for each site in file *_phyml_lk.txt.

       --print_trace

	      Print each phylogeny explored during the tree search process in file *_phyml_trace.txt.

       --run_id ID_string

	      Append the string ID_string at the end of each PhyML output file.  This option may be  useful  when  running  simulations  involving
	      PhyML.

       --quiet

	      No interactive question (for running in batch mode) and quiet output.

       --no_memory_check

	      No interactive question for memory usage (for running in batch mode). Normal output otherwise.
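For batch work, '--quiet' and '--run_id' can be combined so that each run is non-interactive and its output files carry a distinct suffix. The sketch below shows one possible arrangement. run_batch is a hypothetical wrapper, the file names are invented, and by default it only prints the commands rather than executing them.

```shell
# Hypothetical wrapper: one PhyML call per alignment file.
# mode "dry" prints each command; mode "run" executes it (needs phyml on PATH).
run_batch() {
  mode=$1
  shift
  for aln in "$@"; do
    run_id=$(basename "$aln" .phy)   # e.g. data/a.phy -> a
    if [ "$mode" = "run" ]; then
      phyml -i "$aln" -d nt --quiet --run_id "$run_id"
    else
      echo "phyml -i $aln -d nt --quiet --run_id $run_id"
    fi
  done
}

run_batch dry data/a.phy data/b.phy
```

Switching the first argument to "run" would execute the same commands.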

       --alias_subpatt

	      Site aliasing is generalized at the subtree level. This sometimes leads to faster calculations. See Kosakovsky Pond SL, Muse SV,
	      Systematic Biology (2004) for an example.

       --boot_progress_display num (default=20)

	      num is the frequency at which the bootstrap progress bar will be updated.  Must be an integer.

PHYLIP-LIKE INTERFACE
       You can also run PhyML with no arguments. In this case, change the value of a parameter by typing its corresponding character, as
       shown on screen.

EXAMPLES

       DNA interleaved sequence file, default parameters :

	      phyml -i seqs1

       AA interleaved sequence file, default parameters :

	      phyml -i seqs2 -d aa

       AA sequential sequence file, with customization :

	      phyml -i seqs3 -q -d aa -m JTT -c 4 -a e

SEE ALSO

       A simple, fast, and accurate algorithm to estimate large phylogenies by maximum likelihood

       Stephane Guindon and Olivier Gascuel, Systematic Biology 52(5):696-704, 2003.

       Please cite this paper if you use this software in your publications.

AUTHOR

       PhyML was written by Stephane Guindon, Olivier Gascuel and others.

       This manual page was written by Andreas Tille <tille@debian.org>, for the Debian project (but may be used by others).

phyml									3.0								  PhyML(1)