06-24-2009
Thanks, but apparently the OP did not like my suggestions: he has recently posted the same question on several other Perl forums without replying to this thread. Oh well, some people just want the code.
ASN2GB(1) NCBI Tools User's Manual ASN2GB(1)
NAME
asn2gb - convert ASN.1 biological data to a GenBank-style flat format
SYNOPSIS
asn2gb [-] [-A accession] [-F] [-a asn-type] [-b] [-c] [-d] [-f format] [-g N] [-h N] [-i filename] [-j N] [-k N] [-l filename] [-m mode]
[-n filename] [-o filename] [-p] [-q filename] [-r] [-s style] [-t N] [-u N] [-y N]
DESCRIPTION
asn2gb converts descriptions of biological sequences from NCBI's ASN.1 format to one of several flat-file formats, and is the successor to
asn2ff(1).
OPTIONS
A summary of options is included below.
-       Print usage message
-A accession
        Accession to fetch; may take the form accession,complexity,flags where complexity should normally be 0 and a flags value of -1
        enables fetching of external features (as with the legacy -F option)
-F      Fetch remote annotations (equivalent to specifying -A accession,0,-1)
-a asn-type
        ASN.1 Type:
        [Single record]
            a   Any (autodetected; default)
            e   seq-Entry
            b   Bioseq
            s   bioseq-Set
            m   seq-subMit
            q   Catenated
        [Release file; components individually processed and freed]
            t   baTch bioseq-set
            u   batch seq-sUbmit
-b      Input file is binary
-c      Batch file is compressed
-d      Seq-loc minus strand
-f format
        Format:
            b   GenBank (default)
            bp or pb
                GenBank and GenPept
            e   EMBL
            p   GenPept
            q   nucleotide GBSet (XML)
            r   protein GBSet (XML)
            t   Feature table only
            x   nucleotide INSDSet (XML)
            y   tiny seq (XML)
            Y   FASTA
            z   protein INSDSet (XML)
-g N    Bit flags (all default to off):
            1   HTML
            2   XML
            4   ContigFeats
            8   ContigSrcs
            16  FarTransl
-h N    Lock/Lookup Flags (all default to off):
            8   LockProd
            16  LookupComp
            64  LookupProd
-i filename
        Input file name (default = stdin)
-j N    Start location (default is 0, beginning of sequence)
-k N    End location (default is 0, end of sequence)
-l filename
        Log file
-m mode
        Mode:
            r   Release
            e   Entrez
            s   Sequin (default)
            d   Dump
-n filename
        Asn2Flat Executable (default = asn2flat)
-o filename
        Output file name (default = stdout)
-p      Propagate top descriptors
-q filename
        Ffdiff Executable (default = /netopt/genbank/subtool/bin/ffdiff)
-r      Enable remote fetching
-s style
        Style:
            n   Normal (default)
            s   Segment
            m   Master
            c   Contig
-t N    Batch:
            1   Report
            2   Sequin/Release
            3   asn2gb SSEC/nocleanup
            4   asn2flat BSEC/nocleanup
            5   asn2gb/asn2flat
            6   asn2gb NEW dbxref/OLD dbxref
            7   oldasn2gb/newasn2gb
-u N    Custom flags (all default to off):
            4       Hide features
            1792    Hide references
            8192    Hide sources
            262144  Hide translations
-y N    Feature itemID
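
The numeric options above (-g, -h, -u) are described as flags that all default to off, which suggests that several values can be combined into a single N. The Python sketch below illustrates that reading: it assumes the values are combined with a bitwise OR (the page does not say so explicitly), and it only assembles an argument list without running asn2gb. The helper names and the file names record.asn and record.gbk are hypothetical; the flag values and option letters are taken from the list above.

```python
import shlex

# Values copied from the "-u N  Custom flags" list above.
CUSTOM_FLAGS = {
    "hide_features": 4,
    "hide_references": 1792,
    "hide_sources": 8192,
    "hide_translations": 262144,
}


def combine_flags(names, table=CUSTOM_FLAGS):
    """OR the named flag values together (assumption: flags combine bitwise)."""
    value = 0
    for name in names:
        value |= table[name]
    return value


def accession_spec(accession, complexity=0, flags=0):
    """Format the accession,complexity,flags form accepted by -A."""
    return f"{accession},{complexity},{flags}"


def build_command(input_file, output_file, fmt="b", custom=0):
    """Assemble an asn2gb argument list; nothing is executed here."""
    args = ["asn2gb", "-i", input_file, "-o", output_file, "-f", fmt]
    if custom:
        args += ["-u", str(custom)]
    return args


if __name__ == "__main__":
    # Hypothetical file names, used only for illustration.
    custom = combine_flags(["hide_features", "hide_references"])
    print(shlex.join(build_command("record.asn", "record.gbk", custom=custom)))
    # ACCESSION is a placeholder; 0 and -1 mirror the -F equivalence noted above.
    print(accession_spec("ACCESSION", 0, -1))
```

Keeping the flag values in a lookup table rather than hard-coding a sum makes it easier to see which behaviours a given N switches on. Whether a particular combination is meaningful should be checked against the asn2gb documentation listed under SEE ALSO.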
AUTHOR
The National Center for Biotechnology Information.
SEE ALSO
asn2all(1), asn2asn(1), asn2ff(1), asn2fsa(1), asn2xml(1), asndhuff(1), insdseqget(1), /usr/share/doc/libncbi6-dev/asn2gb.txt.gz.
NCBI
2011-09-02 ASN2GB(1)