I am trying to write a bash script that would be able to read DNA sequences (each line in the file is a sequence) from a file, where sequences are separated by an empty line. I am then to find the amino acid that these DNA sequences encode per codon (each group of three literals.) For example, if I have a file with the sequence:
then starting from GCA (first three literals,) I want to decode the DNA into amino acids based on the following table:
Code:
Codon(s) Amino-acid
TTT,TTC Phe
TTA,TTG,CTT,CTC,CTA,CTG Leu
ATT,ATC,ATA Ile
ATG Met
GTT,GTC,GTA,GTG Val
TCT,TCC,TCA,TCG Ser
CCT,CCC,CCA,CCG Pro
ACT,ACC,ACA,ACG Thr
GCT,GCC,GCA,GCG Ala
TAT,TAC Tyr
TAA,TAG Stop
CAT,CAC His
CAA,CAG Gln
AAT,AAC Asn
AAA,AAG Lys
GAT,GAC Asp
GAA,GAG Glu
TGT,TGC Cys
TGA Stop
TGG Trp
CGT,CGC,CGA,CGG Arg
AGT,AGC Ser
AGA,AGG Arg
GGT,GGC,GGA,GGG Gly
Then I need to print the name of each Amino Acid and how many times it was used. For example:
Code:
Ala: 4
Cys: 4
and so on. I have 100s of files with DNA sequences in them, but I am not that good at bash. I tried awk and tr but I did not know how to code the table into a bash script.
Last edited by faizlo; 10-08-2015 at 09:08 PM..
Reason: Add CODE tags.
Hi all,
I have a requirement where the variable name starts with $, like
$Amd=/home/student/test/
How to work wit it? can some one help me, am in gr8 confusion:confused: (5 Replies)
Hi all,
Using Perl, I need to extract DNA bases from a GenBank file for a given plant species. A sample GenBank file is here...
Nucleotide
This is saved on my computer as NC_001666.gb. I also have a file that is saved on my computer as NC_001666.txt. This text file has a list of all... (5 Replies)
I am trying to reverse and complement my DNA sequences. The file format is FASTA, something like this:
Now, to reverse the sequence, I should start reading from right to left. At the same should be complemented. Thus, "A" should be read as "T"; "C" should be read as "G"; "T" should be converted... (8 Replies)
Hi all,
I have a file like this
ID 3BP5L_HUMAN Reviewed; 393 AA.
AC Q7L8J4; Q96FI5; Q9BQH8; Q9C0E3;
DT 05-FEB-2008, integrated into UniProtKB/Swiss-Prot.
DT 05-JUL-2004, sequence version 1.
DT 05-SEP-2012, entry version 71.
FT COILED 59 140 ... (1 Reply)
Looking for a simple way to convert ranges to a numerical sequence that would assign the original value of the range to the individual numbers that are on the range.
Thank you
given data
13196-13199 0
13200 4
13201 10
13202-13207 3
13208-13210 7
desired... (3 Replies)
Hi,
I am having a file of dna sequences in fasta format which look like this:
>admin_1_45
atatagcaga
>admin_1_46
atatagcagaatatatat
with many such thousands of sequences in a single file. I want to the replace the accession Id "admin_1_45" similarly in following sequences to... (5 Replies)
If I run rm -rf * command under one parent directory.
/data > rm -rf *
Is there anyway to know which files will be deleted first ?
Start using code tags please, ty. (2 Replies)
Discussion started by: sameermohite
2 Replies
LEARN ABOUT CENTOS
locale::codes::langext
Locale::Codes::LangExt(3) User Contributed Perl Documentation Locale::Codes::LangExt(3)NAME
Locale::Codes::LangExt - standard codes for language extension identification
SYNOPSIS
use Locale::Codes::LangExt;
$lext = code2langext('acm'); # $lext gets 'Mesopotamian Arabic'
$code = langext2code('Mesopotamian Arabic'); # $code gets 'acm'
@codes = all_langext_codes();
@names = all_langext_names();
DESCRIPTION
The "Locale::Codes::LangExt" module provides access to standard codes used for identifying language extensions, such as those as defined in
the IANA language registry.
Most of the routines take an optional additional argument which specifies the code set to use. If not specified, the default IANA language
registry codes will be used.
SUPPORTED CODE SETS
There are several different code sets you can use for identifying language extensions. A code set may be specified using either a name, or
a constant that is automatically exported by this module.
For example, the two are equivalent:
$lext = code2langext('acm','alpha');
$lext = code2langext('acm',LOCALE_LANGEXT_ALPHA);
The codesets currently supported are:
alpha
This is the set of three-letter (lowercase) codes from the IANA language registry, such as 'acm' for Mesopotamian Arabic.
This is the default code set.
ROUTINES
code2langext ( CODE [,CODESET] )
langext2code ( NAME [,CODESET] )
langext_code2code ( CODE ,CODESET ,CODESET2 )
all_langext_codes ( [CODESET] )
all_langext_names ( [CODESET] )
Locale::Codes::LangExt::rename_langext ( CODE ,NEW_NAME [,CODESET] )
Locale::Codes::LangExt::add_langext ( CODE ,NAME [,CODESET] )
Locale::Codes::LangExt::delete_langext ( CODE [,CODESET] )
Locale::Codes::LangExt::add_langext_alias ( NAME ,NEW_NAME )
Locale::Codes::LangExt::delete_langext_alias ( NAME )
Locale::Codes::LangExt::rename_langext_code ( CODE ,NEW_CODE [,CODESET] )
Locale::Codes::LangExt::add_langext_code_alias ( CODE ,NEW_CODE [,CODESET] )
Locale::Codes::LangExt::delete_langext_code_alias ( CODE [,CODESET] )
These routines are all documented in the Locale::Codes::API man page.
SEE ALSO
Locale::Codes
The Locale-Codes distribution.
Locale::Codes::API
The list of functions supported by this module.
http://www.iana.org/assignments/language-subtag-registry
The IANA language subtag registry.
AUTHOR
See Locale::Codes for full author history.
Currently maintained by Sullivan Beck (sbeck@cpan.org).
COPYRIGHT
Copyright (c) 2011-2013 Sullivan Beck
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.16.3 2013-02-27 Locale::Codes::LangExt(3)