Post 302774939 by evelibertine on Sunday 3rd of March 2013 10:55:14 PM
Extract string between parentheses

Hi,

I have a file of FASTA headers that looks like the following:

Code:
>gi|28476830|ref|NR_001281.1| Homo sapiens protocadherin beta 18 pseudogene (PCDHB18), non-coding RNA
>gi|187937204|ref|NR_023342.1| Homo sapiens keratin associated protein 20-4 (KRTAP20-4), non-coding RNA
>gi|221139737|ref|NR_024072.2| Homo sapiens MRS2 magnesium homeostasis factor homolog (S. cerevisiae) pseudogene 2 (MRS2P2), non-coding RNA
>gi|219881533|ref|NR_003932.2| Homo sapiens ribosomal protein L13a pseudogene 20 (RPL13AP20), non-coding RNA
>gi|93204855|ref|NR_003024.1| Homo sapiens eukaryotic translation initiation factor 3, subunit I pseudogene 1 (EIF3IP1), non-coding RNA
>gi|222831626|ref|NR_026740.1| Homo sapiens placenta-specific 9 pseudogene (LOC389033), non-coding RNA

I want to write code to extract the string inside parentheses in each line. The difficulty is that some of the lines have more than one string inside parentheses (e.g. line 3). In such cases, I only want to extract the string inside the second pair of parentheses. My output should look like:

Code:
PCDHB18
KRTAP20-4
MRS2P2
RPL13AP20
EIF3IP1
LOC389033

How do I go about doing this? Thanks!
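
One possible approach, sketched in Python rather than a shell one-liner: collect every parenthesised group on a line and keep the last one. This assumes the wanted symbol is always the final group and that parentheses are never nested, which holds for the sample headers above but is not guaranteed for every FASTA file. The names `PAREN_GROUP` and `last_parenthesised` are made up for this illustration.

```python
import re

# One parenthesised group with no nested parentheses inside it.
PAREN_GROUP = re.compile(r"\(([^()]*)\)")


def last_parenthesised(header):
    """Return the text inside the last (...) on the line, or None if there is none."""
    groups = PAREN_GROUP.findall(header)
    return groups[-1] if groups else None


# Two of the sample headers from the post: one with a single group, one with two.
headers = [
    ">gi|28476830|ref|NR_001281.1| Homo sapiens protocadherin beta 18 "
    "pseudogene (PCDHB18), non-coding RNA",
    ">gi|221139737|ref|NR_024072.2| Homo sapiens MRS2 magnesium homeostasis "
    "factor homolog (S. cerevisiae) pseudogene 2 (MRS2P2), non-coding RNA",
]

for line in headers:
    print(last_parenthesised(line))
```

To run it over a whole file, the same function could be applied to each line read from the file instead of the hard-coded list.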
 
