The UNIX and Linux Forums  


#1  Posted 07-17-2008 by vanitham
How to identify sentences from a text?

Hi,

I have to identify the sentences in the text below.

If I split the text this way:


Code:
@sentence = split(/\.\W*/, $text);

I also get the following fragments in the output, along with the proper sentences:

Biol Reprod.

2002 Mar;66(3):785-95.

Egydio de Carvalho C, Tanaka H, Iguchi N, Ventela S, Nojima H, Nishimune Y.

Department of Science for Laboratory Animal Experimentation, Research Institute for Microbial Diseases, Osaka University, Suita City, Osaka 565-0871, Japan.

Research Support, Non-U.S.

I should get only the proper sentences.

How can I identify proper sentences in Perl?

I don't want to use any modules. Can this be done without them?

Here is the text:

1: Biol Reprod. 2002 Mar;66(3):785-95.

Molecular cloning and characterization of a complementary DNA encoding sperm tail
protein SHIPPO 1.

Egydio de Carvalho C, Tanaka H, Iguchi N, Ventela S, Nojima H, Nishimune Y. (authors' names)

Department of Science for Laboratory Animal Experimentation, Research Institute for Microbial Diseases, Osaka University, Suita City, Osaka 565-0871, Japan.

Formation of the tail in developing sperm is a complex process involving the organization of the axoneme, transport of periaxonemal proteins from the
cytoplasm to the tail, and assembly of the outer dense fibers and fibrous sheath.Although detailed morphological descriptions of these events are available, the molecular mechanisms remain to be fully elucidated. We have isolated a new gene, named shippo 1, from a haploid germ cell-specific cDNA library of mouse testis,and also its human orthologue (h-shippo 1). The isolated cDNA is 1.2 kilobases long, carrying a 762-base pair open reading frame that encodes SHIPPO 1, a sperm protein predicted to consist of 254 amino acids. The amino acid sequence includes 6 Pro-Gly-Pro repeats, which are also present in the human orthologue protein (hSHIPPO 1) as well as in 2 other newly reported proteins of Drosophila melanogaster. Transcription of shippo 1 is exclusively observed in haploid germ cells. Antibody raised against SHIPPO 1 identified a testis-specific M(r) 32 x 10(-3) band in Western blot analysis. The protein was further localized in the flagella of the elongated spermatids and along the entire length of the tail in mature sperm. SHIPPO 1 in sperm is resistant to treatment with nonionic detergents and coextracted with the cytoskeletal core proteins of the mouse sperm tail.

Publication Types:
Research Support

ID:1187

Please tell me how to identify the sentences.

With regards,
Vanitha
#2  Posted 07-17-2008 by jim mcnamara (Forum Staff)
If you have to do a lot of these, you are in trouble, IMO.

Telling sentences apart from scientific citations requires some sort of AI. You would have to identify a block of text that ends in a period and has a subject and a predicate. Either that, or create some sort of monstrous filter that traps every single journal and author name.
It would be easier to simply edit the file by hand.
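
For a small batch of records in exactly this layout, a crude filter can get part of the way without any modules. The sketch below is only a heuristic, not real sentence detection. It treats blank-line-separated blocks as paragraphs, splits a paragraph only where a period follows a lowercase letter, digit or ")" and is followed by a capital letter (so "1.2" and "Biol Reprod. 2002" stay in one piece), and then keeps a candidate only if it starts with a capital, ends in ".", "?" or "!", has at least 5 words, and at least half of its words start with a lowercase letter. The sub names looks_like_sentence and extract_sentences and both thresholds are made up for illustration and were chosen by looking at the sample record only.

```perl
use strict;
use warnings;

# Heuristic check (assumed thresholds: 5 words, 50% lowercase-initial words).
sub looks_like_sentence {
    my ($s) = @_;
    return 0 unless $s =~ /^[A-Z]/;        # must start with a capital letter
    return 0 unless $s =~ /[.?!]$/;        # must end in sentence punctuation
    my @words = split ' ', $s;
    return 0 if @words < 5;                # too short, e.g. "Biol Reprod."
    my $lower = grep { /^[a-z]/ } @words;  # count of lowercase-initial words
    # Author and address lines are mostly capitalised words, prose is not.
    return $lower / @words >= 0.5;
}

sub extract_sentences {
    my ($text) = @_;
    my @sentences;
    for my $block (split /\n\s*\n/, $text) {   # blank-line separated paragraphs
        $block =~ s/\s+/ /g;                   # undo hard line wraps
        $block =~ s/^ | $//g;                  # trim leading/trailing space
        # Split after a period that follows [a-z0-9)] when a capital comes next.
        my @candidates = split /(?<=[a-z0-9)][.?!])\s*(?=[A-Z])/, $block;
        push @sentences, grep { looks_like_sentence($_) } @candidates;
    }
    return @sentences;
}

# Sample lines taken from the record in the first post.
my $text = <<'END';
1: Biol Reprod. 2002 Mar;66(3):785-95.

Egydio de Carvalho C, Tanaka H, Iguchi N, Ventela S, Nojima H, Nishimune Y.

Department of Science for Laboratory Animal Experimentation, Research Institute for Microbial Diseases, Osaka University, Suita City, Osaka 565-0871, Japan.

Transcription of shippo 1 is exclusively observed in haploid germ
cells. The protein was further localized in the flagella of the elongated spermatids and along the entire length of the tail in mature sperm.

Publication Types:
Research Support
END

print "$_\n" for extract_sentences($text);
```

On this sample it should print only the two abstract sentences. It has no notion of subject and predicate, so a title line such as "Molecular cloning and characterization of ..." passes as a sentence, and a record in a different layout will need different thresholds or will defeat it altogether.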
#3  Posted 07-18-2008 by vanitham
Quote:
Originally Posted by jim mcnamara
If you have to do a lot of these, you are in trouble, IMO.

Telling sentences apart from scientific citations requires some sort of AI. You would have to identify a block of text that ends in a period and has a subject and a predicate. Either that, or create some sort of monstrous filter that traps every single journal and author name.
It would be easier to simply edit the file by hand.

Hi,

Thanks for the reply.
So there is no other way, then!