Shell Programming and Scripting, Post 85321 by mskcc on Tuesday 4th of October 2005 02:58:36 PM
Old 10-04-2005
Extract particular lines and make them into a spreadsheet

Hi Masters,

I know this isn't a new issue, but I couldn't find any similar threads, so I have to bother you. Here is my input file (genomic data). The file has many sections, each separated by //. Within each section there is exactly one ID line and one GN line.

ID 3HAO_HUMAN STANDARD; PRT; 286 AA.
AC P46952; Q8N6N9;
DT 01-NOV-1995 (Rel. 32, Created)
DT 01-NOV-1995 (Rel. 32, Last sequence update)
DT 10-MAY-2005 (Rel. 47, Last annotation update)
DE 3-hydroxyanthranilate 3,4-dioxygenase (EC 1.13.11.6) (3-HAO) (3-
DE hydroxyanthranilic acid dioxygenase) (3-hydroxyanthranilate
DE oxygenase).
GN Name=HAAO;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;
OC Homo.
OX NCBI_TaxID=9606;
//
ID A4GCT_HUMAN STANDARD; PRT; 340 AA.
AC Q9UNA3;
DT 28-FEB-2003 (Rel. 41, Created)
DT 28-FEB-2003 (Rel. 41, Last sequence update)
DT 13-SEP-2005 (Rel. 48, Last annotation update)
DE Alpha-1,4-N-acetylglucosaminyltransferase (EC 2.4.1.-) (Alpha4GnT).
GN Name=A4GNT;
OS Homo sapiens (Human).
OC Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi;
OC Mammalia; Eutheria; Euarchontoglires; Primates; Catarrhini; Hominidae;
OC Homo.
OX NCBI_TaxID=9606;
//
................

What I need to do is extract part of the GN line and part of the ID line from each section and put them into the format below. Thanks in advance.

GN ID
HAAO 3HAO_HUMAN
A4GNT A4GCT_HUMAN
.... ....
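
One possible approach, as a minimal sketch rather than a definitive answer: let awk remember the entry name from each ID line and the gene name from each GN line, then print the pair when it reaches the // separator. The function name extract_gn_id is only illustrative. The sketch assumes the gene name always appears as Name=...; on the GN line, that the entry name is the second whitespace-separated field of the ID line, and that tab-separated output is acceptable for loading into a spreadsheet.

```shell
#!/bin/sh
# extract_gn_id (hypothetical name): print a tab-separated "GN  ID" table
# from a flat file whose records are terminated by a line containing "//".
# Reads the files given as arguments, or standard input if none are given.
extract_gn_id() {
    awk '
        BEGIN { OFS = "\t"; print "GN", "ID" }

        # ID line: the entry name is the second field, e.g. 3HAO_HUMAN.
        $1 == "ID" { id = $2 }

        # GN line: keep the text between "Name=" and the next ";".
        # Assumption: the gene name is written as Name=...; on this line.
        $1 == "GN" && gene == "" {
            if (match($0, /Name=[^;]+/))
                gene = substr($0, RSTART + 5, RLENGTH - 5)
        }

        # End of record: emit the pair and reset for the next record.
        $1 == "//" {
            if (id != "") print gene, id
            id = gene = ""
        }

        # Cover a final record that is not followed by "//".
        END { if (id != "") print gene, id }
    ' "$@"
}
```

It could be used as `extract_gn_id input.txt > genes.tsv`, where both file names are placeholders. A record without a usable GN line produces an empty first column rather than being dropped, so missing gene names stay visible in the spreadsheet.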
 
