06-23-2009
Quote:
Originally Posted by KevinADC
What you want is "Beginning Perl for Bioinformatics", which you can purchase on amazon.com. Also look into BioPerl.
Excellent, thanks!
I thought this article on the BioPerl wiki was great: How Perl Saved the Human Genome Project.
bp_biofetch_genbank_proxy
BP_BIOFETCH_GENBANK_PROXY(1p) User Contributed Perl Documentation BP_BIOFETCH_GENBANK_PROXY(1p)
NAME
biofetch_genbank_proxy.pl - Caching BioFetch-compatible web proxy for GenBank
SYNOPSIS
Install in cgi-bin directory of a Web server. Stand back.
DESCRIPTION
This CGI script acts as the server side of the BioFetch protocol as described in http://obda.open-bio.org/Specs/. It provides two database
access services, one for data source "genbank" (nucleotide entries) and the other for data source "genpep" (protein entries).
This script works by forwarding its requests to NCBI's eutils script, which lives at http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi.
It then reformats the output according to the BioFetch format so the sequences can be processed and returned by the Bio::DB::BioFetch
module. Returned entries are temporarily cached on the Web server's file system, allowing frequently-accessed entries to be retrieved
without another round trip to NCBI.
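The forwarding and caching behaviour described above can be sketched roughly as follows. This is a minimal illustration using only core Perl, not the script's actual code: the mapping of the "genbank" and "genpep" data sources to efetch `db` and `rettype` values is an assumption, the helper names `efetch_url` and `cached_fetch` are hypothetical, and an in-memory hash stands in for the on-disk cache. The fetcher is passed in as a code reference so the example runs without network access.

```perl
use strict;
use warnings;

# Base URL quoted in the DESCRIPTION section.
my $EFETCH = 'http://www.ncbi.nlm.nih.gov/entrez/eutils/efetch.fcgi';

# ASSUMPTION: how each BioFetch data source maps onto efetch parameters.
my %SOURCE = (
    genbank => { db => 'nucleotide', rettype => 'gb' },
    genpep  => { db => 'protein',    rettype => 'gp' },
);

# Hypothetical helper: build the efetch URL for one accession.
# Dies if the data source is not one of the two known services.
sub efetch_url {
    my ($source, $id) = @_;
    my $map = $SOURCE{$source} or die "unknown data source '$source'\n";
    # Percent-encode every character outside the URI unreserved set.
    (my $safe_id = $id) =~ s/([^A-Za-z0-9\-._~])/sprintf '%%%02X', ord $1/ge;
    return "$EFETCH?db=$map->{db}&id=$safe_id&rettype=$map->{rettype}&retmode=text";
}

# Hypothetical read-through cache: call the fetcher only on a cache miss.
# A plain hash stands in for the file-system cache used by the real script.
my %cache;
sub cached_fetch {
    my ($source, $id, $fetcher) = @_;
    my $key = "${source}:${id}";
    $cache{$key} = $fetcher->(efetch_url($source, $id))
        unless exists $cache{$key};
    return $cache{$key};
}

# Stub fetcher that counts upstream requests instead of contacting NCBI.
my $calls = 0;
my $stub  = sub { my ($url) = @_; $calls++; return "entry from $url" };

cached_fetch('genbank', 'DDU63596', $stub) for 1 .. 3;
print "upstream requests: $calls\n";
```

Three lookups of the same accession trigger a single upstream request, which is the round trip the cache is meant to save. In a real deployment the stub would be replaced by an HTTP client such as LWP, which the INSTALLATION section lists as a requirement.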
INSTALLATION
You must have the following installed in order to run this script:
1) perl
2) the perl modules LWP and Cache::FileCache
3) a web server (Apache recommended)
To install this script, copy it into the web server's cgi-bin directory. You might want to shorten its name; "dbfetch" is recommended.
There are several constants located at the top of the script that you may want to adjust. These are:
CACHE_LOCATION
This is the location on the filesystem where the cached files will be located. The default is /usr/tmp/dbfetch_cache.
MAX_SIZE
This is the maximum size that the cache can grow to. When the cache exceeds this size, older entries are deleted automatically. The
default setting is 100,000,000 bytes (100 MB).
EXPIRATION
Entries that haven't been accessed in this length of time will be removed from the cache. The default is 1 week.
PURGE
This constant specifies how often the cache is purged of older entries. The default is 1 hour.
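As an illustration, the four settings with their documented defaults might be declared like this. The constant names and default values come from the list above; expressing the two durations in seconds is an assumption made for this sketch, and the script itself may store them in a different form.

```perl
use strict;
use warnings;

# Defaults as documented above. The seconds representation of the two
# durations is an ASSUMPTION; the real script may use another format.
use constant CACHE_LOCATION => '/usr/tmp/dbfetch_cache';
use constant MAX_SIZE       => 100_000_000;        # bytes (100 MB)
use constant EXPIRATION     => 7 * 24 * 60 * 60;   # 1 week, in seconds
use constant PURGE          => 60 * 60;            # 1 hour, in seconds

# Print the effective configuration.
printf "cache dir: %s\nmax size: %d bytes\nexpire after: %d s\npurge every: %d s\n",
    CACHE_LOCATION, MAX_SIZE, EXPIRATION, PURGE;
```

On a busy server it may make sense to point CACHE_LOCATION at a directory that the web server user can write to and that is not cleared on reboot.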
TESTING
To see if this script is performing as expected, you may test it with this script:
use Bio::DB::BioFetch;
my $db = Bio::DB::BioFetch->new(-baseaddress => 'http://localhost/cgi-bin/dbfetch',
                                -format      => 'genbank',
                                -db          => 'genbank');
my $seq = $db->get_Seq_by_id('DDU63596');
print $seq->seq, "\n";
This should print out a DNA sequence.
SEE ALSO
Bio::DB::BioFetch, Bio::DB::Registry
AUTHOR
Lincoln Stein, <lstein-at-cshl.org>
Copyright (c) 2003 Cold Spring Harbor Laboratory
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. See DISCLAIMER.txt for
disclaimers of warranty.
perl v5.14.2 2012-03-02 BP_BIOFETCH_GENBANK_PROXY(1p)