How to remove duplicate sentence/string in perl? Post 302261238 by radoulov on Monday 24th of November 2008 05:19:48 AM
Did you read my post?

Code:
$ cat p
#! /usr/bin/env perl
use strict;
use warnings;

# Sample data: five sentences, each listed twice.
my @arr = (
'TDP-43 is a highly conserved, 43-kDa RNA-binding protein implicated to play a role in transcription repression, nuclear organization, and alternative splicing. More recently, this factor has been identified as the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions and amyotrophic lateral sclerosis.',
'For the splicing activity, the factor has been shown to be mainly an exon-skipping promoter.',
'In this study using the survival of motor neuron (SMN) minigenes as the reporters in transfection assay, we show for the first time that TDP-43 could also act as an exon-inclusion factor. Furthermore, both RNA-recognition motif domains are required for its ability to enhance the SMN2 exon 7 inclusion.',
'Combined protein-immunoprecipitation and RNA-immunoprecipitation experiments also suggested that this exon inclusion activity might be mediated by multimeric complex(es) consisting of this protein interacting with other splicing factors, including Htra2-beta1.',
'Our data further evidence TDP-43 as a multifunctional RNA-binding protein for a diverse set of cellular activities.',
'TDP-43 is a highly conserved, 43-kDa RNA-binding protein implicated to play a role in transcription repression, nuclear organization, and alternative splicing. More recently, this factor has been identified as the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions and amyotrophic lateral sclerosis.',
'For the splicing activity, the factor has been shown to be mainly an exon-skipping promoter.',
'In this study using the survival of motor neuron (SMN) minigenes as the reporters in transfection assay, we show for the first time that TDP-43 could also act as an exon-inclusion factor. Furthermore, both RNA-recognition motif domains are required for its ability to enhance the SMN2 exon 7 inclusion.',
'Combined protein-immunoprecipitation and RNA-immunoprecipitation experiments also suggested that this exon inclusion activity might be mediated by multimeric complex(es) consisting of this protein interacting with other splicing factors, including Htra2-beta1.',
'Our data further evidence TDP-43 as a multifunctional RNA-binding protein for a diverse set of cellular activities.'
);

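# Print a blank line between list elements and a newline after the last one.
$, = "\n\n";
$\ = "\n";

# Keep only the first occurrence of each string; %seen counts occurrences.
my %seen;
print grep !$seen{$_}++, @arr;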

$ ./p
TDP-43 is a highly conserved, 43-kDa RNA-binding protein implicated to play a role in transcription repression, nuclear organization, and alternative splicing. More recently, this factor has been identified as the major disease protein of several neurodegenerative diseases, including frontotemporal lobar degeneration with ubiquitin-positive inclusions and amyotrophic lateral sclerosis.

For the splicing activity, the factor has been shown to be mainly an exon-skipping promoter.

In this study using the survival of motor neuron (SMN) minigenes as the reporters in transfection assay, we show for the first time that TDP-43 could also act as an exon-inclusion factor. Furthermore, both RNA-recognition motif domains are required for its ability to enhance the SMN2 exon 7 inclusion.

Combined protein-immunoprecipitation and RNA-immunoprecipitation experiments also suggested that this exon inclusion activity might be mediated by multimeric complex(es) consisting of this protein interacting with other splicing factors, including Htra2-beta1.

Our data further evidence TDP-43 as a multifunctional RNA-binding protein for a diverse set of cellular activities.
$
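The grep line above works because a hash entry is undefined (false) the first time a string is seen, so the negated post-increment is true only for the first occurrence. Here is a minimal sketch of the same idea wrapped in a subroutine. The name dedupe and the short sample strings are made up for illustration and are not from the original thread.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not in the original post): return the arguments
# with later duplicates dropped, keeping first-seen order.
sub dedupe {
    my %seen;    # string => number of times it has been seen so far
    # The post-increment yields the old count, so the test is true
    # only the first time a given string appears.
    return grep { !$seen{$_}++ } @_;
}

# Made-up sample data standing in for the sentences in the post.
my @sentences = ('alpha.', 'beta.', 'alpha.', 'gamma.', 'beta.');

# Print the unique sentences separated by blank lines.
print join("\n\n", dedupe(@sentences)), "\n";
```

Call it in list context, because grep returns a count in scalar context. Note that strings which differ only in whitespace or case are still treated as distinct.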

 
