Bio::Tools::CodonTable(3pm)				User Contributed Perl Documentation			       Bio::Tools::CodonTable(3pm)

NAME

       Bio::Tools::CodonTable - Codon table object

SYNOPSIS

	 # This is a read-only class for all known codon tables.  The IDs are
	 # the ones used by nucleotide sequence databases.  All common IUPAC
	 # ambiguity codes for DNA, RNA and amino acids are recognized.

	 use Bio::Tools::CodonTable;

	 # defaults to ID 1 "Standard"
	 $myCodonTable	 = Bio::Tools::CodonTable->new();
	 $myCodonTable2  = Bio::Tools::CodonTable->new( -id => 3 );

	 # change codon table
	 $myCodonTable->id(5);

	 # examine codon table
	 print	join (' ', "The name of the codon table no.", $myCodonTable->id(4),
		  "is:", $myCodonTable->name(), "\n");

	 # print possible codon tables
	 $tables = Bio::Tools::CodonTable->tables;
	 while ( ($id,$name) = each %{$tables} ) {
	   print "$id = $name\n";
	 }

	 # translate a codon
	 $aa = $myCodonTable->translate('ACU');
	 $aa = $myCodonTable->translate('act');
	 $aa = $myCodonTable->translate('ytr');

	 # reverse translate an amino acid
	 @codons = $myCodonTable->revtranslate('A');
	 @codons = $myCodonTable->revtranslate('Ser');
	 @codons = $myCodonTable->revtranslate('Glx');
	 @codons = $myCodonTable->revtranslate('cYS', 'rna');

	 # reverse translate an entire amino acid sequence into an IUPAC
	 # nucleotide string

	 use Bio::PrimarySeq;

	 my $seqobj    = Bio::PrimarySeq->new(-seq => 'FHGERHEL');
	 my $iupac_str = $myCodonTable->reverse_translate_all($seqobj);

	 # boolean tests
	 print "Is a start\n"	    if $myCodonTable->is_start_codon('ATG');
	 print "Is a terminator\n" if $myCodonTable->is_ter_codon('tar');
	 print "Is an unknown\n"    if $myCodonTable->is_unknown_codon('JTG');

DESCRIPTION

       Codon tables are also called translation tables or genetic codes, since that is what they represent. A more complete picture of the full
       complexity of codon usage in various taxonomic groups is presented at the NCBI Genetic Codes home page.

       CodonTable is a BioPerl class that knows all current translation tables that are used by primary nucleotide sequence databases (GenBank,
       EMBL and DDBJ). It provides methods to output information about tables and relationships between codons and amino acids.

       This class and its methods recognize all common IUPAC ambiguity codes for DNA, RNA and amino acids. The translation method follows the
       conventions in EMBL and TREMBL databases.

       It is a nuisance to separate RNA and cDNA representations of nucleic acid transcripts. The CodonTable object accepts codons of both types as
       input and allows the user to set the mode for output when reverse translating. Its default for output is DNA.

       Note:

       This class deals primarily with individual codons and amino acids. However, in the interest of speed, you can translate longer sequences, too.
       The full complexity of protein translation is tackled by Bio::PrimarySeqI::translate.

       The amino acid codes are IUPAC recommendations for common amino acids:

		 A	     Ala	    Alanine
		 R	     Arg	    Arginine
		 N	     Asn	    Asparagine
		 D	     Asp	    Aspartic acid
		 C	     Cys	    Cysteine
		 Q	     Gln	    Glutamine
		 E	     Glu	    Glutamic acid
		 G	     Gly	    Glycine
		 H	     His	    Histidine
		 I	     Ile	    Isoleucine
		 L	     Leu	    Leucine
		 K	     Lys	    Lysine
		 M	     Met	    Methionine
		 F	     Phe	    Phenylalanine
		 P	     Pro	    Proline
		 O	     Pyl	    Pyrrolysine (22nd amino acid)
		 U	     Sec	    Selenocysteine (21st amino acid)
		 S	     Ser	    Serine
		 T	     Thr	    Threonine
		 W	     Trp	    Tryptophan
		 Y	     Tyr	    Tyrosine
		 V	     Val	    Valine
		 B	     Asx	    Aspartic acid or Asparagine
		 Z	     Glx	    Glutamine or Glutamic acid
		 J	     Xle	    Isoleucine or Leucine (mass spec ambiguity)
		 X	     Xaa	    Any or unknown amino acid

       It is worth noting that "Bacterial" codon table no. 11 produces a polypeptide that is, confusingly, identical to the standard one. The
       only differences are in available initiator codons.
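
       The point about tables 1 and 11 can be checked with a short script. This is a minimal sketch, assuming BioPerl is installed; it uses
       only methods documented on this page. The choice of test codons is illustrative: GTG is listed as an initiator in NCBI table 11 but
       not in table 1, so it is a convenient codon on which the two tables should disagree.

```perl
use strict;
use warnings;
use Bio::Tools::CodonTable;

# Table 1 is "Standard"; table 11 is the "Bacterial" table discussed above.
my $standard  = Bio::Tools::CodonTable->new( -id => 1 );
my $bacterial = Bio::Tools::CodonTable->new( -id => 11 );

# Compare the translation and the start-codon status of a few codons.
for my $codon (qw(ATG GTG TTG AAA)) {
    printf "%s  aa: %s / %s  start: %s / %s\n",
        $codon,
        $standard->translate($codon),
        $bacterial->translate($codon),
        ( $standard->is_start_codon($codon)  ? 'yes' : 'no' ),
        ( $bacterial->is_start_codon($codon) ? 'yes' : 'no' );
}
```

       The amino acid columns should match on every line, while the start columns may differ.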

       NCBI Genetic Codes home page:
	    http://www.ncbi.nlm.nih.gov/Taxonomy/Utils/wprintgc.cgi?mode=c

       EBI Translation Table Viewer:
	    http://www.ebi.ac.uk/cgi-bin/mutations/trtables.cgi

       Amended ASN.1 version with ids 16 and 21 is at:
	    ftp://ftp.ebi.ac.uk/pub/databases/geneticcode/

       Thanks to Matteo diTomasso for the original Perl implementation of these tables.

FEEDBACK

   Mailing Lists
       User feedback is an integral part of the evolution of this and other Bioperl modules. Send your comments and suggestions preferably to the
       Bioperl mailing lists. Your participation is much appreciated.

	 bioperl-l@bioperl.org			- General discussion
	 http://bioperl.org/wiki/Mailing_lists	- About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and responsive experts will be able to look at the problem and quickly address
       it. Please include a thorough description of the problem with code and data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track of the bugs and their resolution.  Bug reports can be submitted via the
       web:

	 https://redmine.open-bio.org/projects/bioperl/

AUTHOR - Heikki Lehvaslaiho
       Email:  heikki-at-bioperl-dot-org

APPENDIX

       The rest of the documentation details each of the object methods. Internal methods are usually preceded with a _

   id
	Title	: id
	Usage	: $obj->id(3); $id_integer = $obj->id();
	Function: Sets or returns the id of the translation table.  IDs are
		  integers from 1 to 15, excluding 7 and 8 which have been
		  removed as redundant. If an invalid ID is given the method
		  returns 0, false.
	Example :
	Returns : value of id, a scalar, 0 if the ID is not valid
	Args	: newvalue (optional)

   name
	Title	: name
	Usage	: $obj->name()
	Function: returns the descriptive name of the translation table
	Example :
	Returns : A string
	Args	: None

   tables
	Title	: tables
	Usage	: $obj->tables()  or  Bio::Tools::CodonTable->tables()
	Function: returns a hash reference where each key is a valid codon
		  table id() number, and each value is the corresponding
		  codon table name() string
	Example :
	Returns : A hashref
	Args	: None

   translate
	Title	: translate
	Usage	: $obj->translate('YTR')
	Function: Returns a string of one letter amino acid codes from
		  nucleotide sequence input. The input can be of any length.

		  Returns 'X' for unknown codons and codons that code for
		  more than one amino acid. Returns an empty string if input
		  is not three characters long. Exceptions for these are:

		    - IUPAC amino acid code B is used for a codon that can
		      code for Aspartic Acid or Asparagine.
		    - IUPAC amino acid code Z is used for a codon that can
		      code for Glutamic Acid or Glutamine.
		    - if the codon is two nucleotides long and if, by adding
		      a third character 'N', it codes for a single amino
		      acid (with exceptions above), return that, otherwise
		      return empty string.

		  Returns empty string for other input strings that are not
		  three characters long.

	Example :
	Returns : a string of one letter ambiguous IUPAC amino acid codes
	Args	: ambiguous IUPAC nucleotide string
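
	The ambiguity rules above can be tried out as follows. This is a
	minimal sketch, assuming BioPerl is installed and the default
	"Standard" table (ID 1). The input codons are illustrative choices,
	and the comments describe what the rules above imply rather than
	verified output.

```perl
use strict;
use warnings;
use Bio::Tools::CodonTable;

my $ct = Bio::Tools::CodonTable->new();    # defaults to ID 1 "Standard"

# Each entry pairs an input with the rule from the text that applies to it.
my @cases = (
    [ 'ATG',       'unambiguous codon' ],
    [ 'YTR',       'ambiguous codon, all expansions give one amino acid' ],
    [ 'RAY',       'Asp or Asn, so B is expected' ],
    [ 'SAR',       'Glu or Gln, so Z is expected' ],
    [ 'ATN',       'more than one amino acid, so X is expected' ],
    [ 'ATGGCNTAA', 'longer input, translated codon by codon' ],
);

for my $case (@cases) {
    my ( $nt, $note ) = @$case;
    printf "%-10s => %-4s (%s)\n", $nt, $ct->translate($nt), $note;
}
```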

   translate_strict
	Title	: translate_strict
	Usage	: $obj->translate_strict('ACT')
	Function: returns one letter amino acid code for a codon input

		  Fast and simple translation. The user is responsible for
		  resolving ambiguous nucleotide codes before calling this
		  method. Returns 'X' for unknown codons and an empty string
		  for input strings that are not three characters long.

		  It is not recommended to use this method in a production
		  environment. Use method translate, instead.

	Example :
	Returns : A string
	Args	: a codon = a three nucleotide character string

   revtranslate
	Title	: revtranslate
	Usage	: $obj->revtranslate('G')
	Function: returns codons for an amino acid

		  Returns an empty string for unknown amino acid
		  codes. Ambiguous IUPAC codes Asx,B, (Asp,D; Asn,N) and
		  Glx,Z (Glu,E; Gln,Q) are resolved. Both single and three
		  letter amino acid codes are accepted. '*' and 'Ter' are
		  used for terminator.

		  By default, the output codons are shown in DNA.  If the
		  output is needed in RNA (tr/t/u/), add a second argument
		  'RNA'.

	Example : $obj->revtranslate('Gly', 'RNA')
	Returns : An array of three lower case letter strings i.e. codons
	Args	: amino acid, 'RNA'

   reverse_translate_all
	Title	: reverse_translate_all
	Usage	: my $iup_str = $cttable->reverse_translate_all($seq_object)
		  my $iup_str = $cttable->reverse_translate_all($seq_object,
								$cutable,
								15);
	Function: reverse translates a protein sequence into IUPAC nucleotide
		  sequence. An 'X' in the protein sequence is converted to 'NNN'
		  in the nucleotide sequence.
	Returns : a string
	Args	: a Bio::PrimarySeqI compatible object (mandatory)
		  a Bio::CodonUsage::Table object and a threshold if only
		    codons with a relative frequency above the threshold are
		    to be considered.

   reverse_translate_best
	Title	: reverse_translate_best
	Usage	: my $str = $cttable->reverse_translate_best($seq_object,$cutable);
	Function: Reverse translates a protein sequence into plain nucleotide
		  sequence (GATC), uses the most common codon for each amino acid
	Returns : A string
	Args	: A Bio::PrimarySeqI compatible object and a Bio::CodonUsage::Table object

   is_start_codon
	Title	: is_start_codon
	Usage	: $obj->is_start_codon('ATG')
	Function: returns true(1) for all codons that can be used as a
		  translation start, false(0) for others.
	Example : $myCodonTable->is_start_codon('ATG')
	Returns : boolean
	Args	: codon

   is_ter_codon
	Title	: is_ter_codon
	Usage	: $obj->is_ter_codon('GAA')
	Function: returns true(1) for all codons that can be used as a
		  translation terminator, false(0) for others.
	Example : $myCodonTable->is_ter_codon('ATG')
	Returns : boolean
	Args	: codon

   is_unknown_codon
	Title	: is_unknown_codon
	Usage	: $obj->is_unknown_codon('GAJ')
	Function: returns false(0) for all codons that are valid,
	       true(1) for others.
	Example : $myCodonTable->is_unknown_codon('NTG')
	Returns : boolean
	Args	: codon

   unambiguous_codons
	Title	: unambiguous_codons
	Usage	: @codons = $self->unambiguous_codons('ACN')
	Returns : array of strings (unambiguous three-nucleotide codons)
	Args	: a codon = a three IUPAC nucleotide character string

   _unambiquous_codons
       deprecated, now an alias for unambiguous_codons

   add_table
	Title	: add_table
	Usage	: $newid = $ct->add_table($name, $table, $starts)
	Function: Add a custom Codon Table into the object.
		  Know what you are doing, only the length of
		  the argument strings is checked!
	Returns : the id of the new codon table
	Args	: name, a string, optional (can be empty)
		  table, a string of 64 characters
		  startcodons, a string of 64 characters, defaults to standard
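
	A custom table might be registered as sketched below. This is a
	hedged example, assuming BioPerl is installed. It assumes the
	64-character table string lists amino acids in the NCBI codon order
	(bases T, C, A, G at each position, so TTT is first and GGG is
	last), and it hard-codes the NCBI standard code as the starting
	point. The table name 'TGA-as-Trp' is made up for illustration.

```perl
use strict;
use warnings;
use Bio::Tools::CodonTable;

# NCBI standard genetic code, one letter per codon in TCAG order (assumption:
# this is the layout add_table expects for its 64-character string).
my $table = 'FFLLSSSSYY**CC*W'
          . 'LLLLPPPPHHQQRRRR'
          . 'IIIMTTTTNNKKSSRR'
          . 'VVVVAAAADDEEGGGG';

# TGA sits at index 14 (T=0, G=3, A=2: 0*16 + 3*4 + 2); recode it as Trp.
substr( $table, 14, 1, 'W' );

my $ct = Bio::Tools::CodonTable->new();
print "before: TGA => ", $ct->translate('TGA'), "\n";

# The start codon string is omitted, so it defaults to the standard one.
my $newid = $ct->add_table( 'TGA-as-Trp', $table );
$ct->id($newid);    # switch this object to the new table

print "table ", $ct->id(), " (", $ct->name(), ")\n";
print "after:  TGA => ", $ct->translate('TGA'), "\n";
```

	Using the id returned by add_table, rather than a hard-coded number,
	avoids depending on how many tables a given BioPerl release ships.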

perl v5.14.2							    2012-03-02					       Bio::Tools::CodonTable(3pm)