06-24-2009
Thanks, but apparently the OP did not like my suggestions: he has recently posted the same question on other Perl forums without replying to this thread. Oh well, some people just want the code.
asn2all
ASN2ALL(1) NCBI Tools User's Manual ASN2ALL(1)
NAME
asn2all - generate reports from ASN.1 biological data
SYNOPSIS
asn2all [-] [-A acc] [-F filename] [-G] [-J n] [-K n] [-M] [-T] [-X] [-a type] [-b] [-c] [-d path] [-f format] [-h] [-i filename] [-k] [-l]
[-n policy] [-o filename] [-p path] [-r] [-v filename] [-x ext]
DESCRIPTION
asn2all is primarily intended for generating reports from the binary ASN.1 Bioseq-set release files downloaded from the NCBI ftp site
(ncbi-asn1 directory). It can produce GenBank and GenPept flatfiles, FASTA sequence files, INSDSet structured XML, TinySeq XML, and
Sequin-style 5-column feature tables.
The release files (which have the extension .aso.gz) should be uncompressed with gunzip(1), resulting in files with the extension .aso.
For example, gbpri1.aso is the first file in the primate division, and the command
gunzip gbpri1.aso.gz
will result in gbpri1.aso being created. The original gbpri1.aso.gz file is removed after successful decompression.
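The same step can be scripted. The sketch below is a Python stand-in for gunzip(1), not part of the NCBI tools, and the function name gunzip_release is hypothetical. It writes the .aso file next to the archive and removes the .aso.gz original only once decompression has finished without error.

```python
import gzip
import os
import shutil


def gunzip_release(gz_path):
    """Decompress NAME.aso.gz to NAME.aso and remove the original, like gunzip(1)."""
    if not gz_path.endswith(".gz"):
        raise ValueError("expected a .gz file: " + gz_path)
    out_path = gz_path[:-len(".gz")]
    # Stream the data so large release files are not loaded into memory at once.
    with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
        shutil.copyfileobj(src, dst)
    # Reached only if the copy above raised no exception.
    os.remove(gz_path)
    return out_path
```

Calling gunzip_release("gbpri1.aso.gz") would leave gbpri1.aso in the same directory, ready to be passed to asn2all.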
In asn2all, the name of the file to be processed is specified by the -i command line argument. Use -a t to indicate that it is a release
file and -b to indicate that it is binary ASN.1. A text ASN.1 file obtained from Entrez can be processed by using -a a instead of -a t -b.
Nucleotide and protein records can be processed simultaneously. Use the -o argument to indicate the nucleotide output file, and the -v
argument for the protein output file.
The -f argument determines the format to be generated, and is documented in more detail (along with other options) in the following section.
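The choice between a release file and a text file from Entrez comes down to which flags accompany -i. The following Python sketch assembles the argument list from the options described above; the helper name asn2all_args and its parameters are hypothetical, and it only builds the command without running it.

```python
def asn2all_args(input_file, nuc_out=None, prot_out=None, fmt="g", release=True):
    """Build an asn2all argument list from the options described above."""
    args = ["asn2all", "-i", input_file]
    if release:
        # Binary release file: -a t (batch processing) plus -b (binary ASN.1).
        args += ["-a", "t", "-b"]
    else:
        # Text ASN.1 file obtained from Entrez.
        args += ["-a", "a"]
    args += ["-f", fmt]
    if nuc_out is not None:
        args += ["-o", nuc_out]  # nucleotide output file
    if prot_out is not None:
        args += ["-v", prot_out]  # protein output file
    return args


print(" ".join(asn2all_args("gbpri1.aso", "gbpri1.nuc", "gbpri1.prt")))
```

For the release file gbpri1.aso this prints the same command line that appears under EXAMPLES. Passing release=False swaps -a t -b for -a a, as the text describes for Entrez downloads.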
OPTIONS
A summary of options is included below.
- Print usage message
-A accession
Accession to fetch; may take the form accession,complexity,flags where complexity should normally be 0 and a flags value of -1
enables fetching of external features
-F filename
Accession Filter file
-G Relaxed Genome Mapping
-J n Seq-loc from
-K n Seq-loc to
-M Seq-loc Minus strand
-T Use Threads
-X EXtended qualifier output
-a type
Input ASN.1 type:
a Automatic (default)
c Catenated
z Any
e Seq-entry
b Bioseq
s Bioseq-set
m Seq-submit
t batch processing (suitable for official releases; autodetects specific type)
-b Bioseq-set is Binary
-c Bioseq-set is Compressed
-d path
Path to indexed binary ASN.1 Data
-f format
Output Format:
g GenBank/GenPept (default)
m GenBank Master Style
f FASTA
d CDS FASTA
e Gene FASTA
t Sequin-style 5-column feature table
y TinySeq XML (akin to FASTA)
s INSDSet XML (akin to GenBank/GenPept)
a structurally equivalent text ASN.1
x structurally equivalent XML
c cache components
-h Display extra Help message
-i filename
Input file name (standard input by default)
-k Enable local fetching
-l Lock components in advance
-n policy
Near FASTA policy:
a All
n Near only (default)
f Far only
-o filename
Nucleotide Output file name
-p path
Path to files
-r Enable Remote fetching
-v filename
Protein output file name
-x ext File selection suffix when working with entire directories. (default is .aso)
EXAMPLES
The command
asn2all -i gbpri1.aso -a t -b -f g -o gbpri1.nuc -v gbpri1.prt
will generate GenBank and GenPept reports from gbpri1.aso.
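To read that command flag by flag, the Python sketch below splits it and pairs each option with its meaning from the OPTIONS summary. The explain function and its two lookup tables are hypothetical helpers that cover only the options used in this example; they are not part of asn2all.

```python
import shlex

# Option meanings taken from the OPTIONS summary; only those used here.
TAKES_VALUE = {
    "-i": "input file name",
    "-a": "input ASN.1 type",
    "-f": "output format",
    "-o": "nucleotide output file name",
    "-v": "protein output file name",
}
SWITCHES = {"-b": "Bioseq-set is binary"}


def explain(command):
    """Return (flag, value, meaning) tuples for an asn2all command line."""
    tokens = shlex.split(command)[1:]  # drop the program name
    result = []
    i = 0
    while i < len(tokens):
        flag = tokens[i]
        if flag in TAKES_VALUE:
            result.append((flag, tokens[i + 1], TAKES_VALUE[flag]))
            i += 2
        elif flag in SWITCHES:
            result.append((flag, None, SWITCHES[flag]))
            i += 1
        else:
            raise ValueError("option not covered by this sketch: " + flag)
    return result


for flag, value, meaning in explain(
    "asn2all -i gbpri1.aso -a t -b -f g -o gbpri1.nuc -v gbpri1.prt"
):
    print(flag, value or "", "#", meaning)
```

Here -a t together with -b marks gbpri1.aso as a binary release file, -f g selects the GenBank/GenPept format, and -o and -v name the nucleotide and protein outputs.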
AUTHOR
The National Center for Biotechnology Information.
SEE ALSO
asn2asn(1), asn2ff(1), asn2fsa(1), asn2gb(1), asn2idx(1), asn2xml(1), asndhuff(1), gene2xml(1), gunzip(1).
NCBI
2012-06-24 ASN2ALL(1)