Extracting text between two strings | Unix Linux Forums | Shell Programming and Scripting

    #1  
Old 06-27-2010
JamesForeman JamesForeman is offline
Registered User
 
Join Date: Jun 2010
Last Activity: 25 August 2010, 8:42 AM EDT
Location: Hong Kong
Posts: 5
Thanks: 4
Thanked 0 Times in 0 Posts
Extracting text between two strings

Hi,

I've looked at a few existing posts on this, but they don't seem to work for my inputs.

I have a text file where I want to extract all the text between two strings, every time that occurs.

E.g., my input file is

Anna said that she would fetch the bucket.
Anna and Ben moved the bucket.
I would not like Anna to do it.


I was expecting that


Code:
sed -n '/Anna/,/would/p' inputfile > outputfile

would give me

said that she
and Ben moved the bucket.
I

But instead I get back

Anna said that she would fetch the bucket.
Anna and Ben moved the bucket.
I would not like Anna to do it.

What am I missing?

Thanks
    #2  
Old 06-27-2010
bartus11 bartus11 is offline Forum Staff  
Moderator
 
Join Date: Apr 2009
Last Activity: 24 October 2014, 3:10 PM EDT
Posts: 3,710
Thanks: 7
Thanked 1,142 Times in 1,113 Posts
Try
Code:
perl -0777 -ne '/(?<=Anna).*(?=would)/s;print $&;' file

or
Code:
perl -0777 -ne '/(?<=Anna).*?(?=would)/s;print $&;' file

    #3  
Old 06-27-2010
Scrutinizer Scrutinizer is offline Forum Staff  
Moderator
 
Join Date: Nov 2008
Last Activity: 25 October 2014, 2:06 AM EDT
Location: Amsterdam
Posts: 9,549
Thanks: 285
Thanked 2,426 Times in 2,174 Posts

Code:
sed -n '/Anna/,/would/p' inputfile > outputfile

prints whole lines: every line from one that contains "Anna" up to and including the next line that contains "would". sed addresses select entire lines, never parts of a line.
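
The behaviour of the range address can be checked with a small sketch like the one below. It recreates the three-line input from post #1 in a temporary file; the variable names (`inputfile`, `range_output`, `line_count`) are only illustrative.

```shell
# Recreate the three-line input file from post #1 in a temporary location.
inputfile=$(mktemp)
cat > "$inputfile" <<'EOF'
Anna said that she would fetch the bucket.
Anna and Ben moved the bucket.
I would not like Anna to do it.
EOF

# The range opens on line 1 (it contains "Anna"). The end pattern is only
# tested on the lines that follow, so the range closes on line 3.
range_output=$(sed -n '/Anna/,/would/p' "$inputfile")
printf '%s\n' "$range_output"

# Count the printed lines: all three input lines come back whole.
line_count=$(printf '%s\n' "$range_output" | wc -l | tr -d ' ')
rm -f "$inputfile"
```

Because the end pattern is not tested on the line that opened the range, the "would" on line 1 does not close it, which is why the whole file is printed.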

Last edited by Scrutinizer; 06-27-2010 at 04:44 AM..
    #4  
Old 06-27-2010
bartus11
Quote:
Originally Posted by Scrutinizer View Post
Code:
sed -n '/Anna/,/would/p' inputfile > outputfile

prints whole lines: every line from one that contains "Anna" up to and including the next line that contains "would". sed addresses select entire lines, never parts of a line.
From what OP wrote, he already tried that code, and its result didn't meet his needs.
    #5  
Old 06-27-2010
Scrutinizer
Hi Bartus11, I know. I was not trying to provide a solution; I was only explaining what the sed construction he used actually does, since it did not work as he expected (which is actually what he was asking).
    #6  
Old 06-27-2010
bartus11
Quote:
Originally Posted by Scrutinizer View Post
Hi Bartus11, I know. I was not trying to provide a solution; I was only explaining what the sed construction he used actually does, since it did not work as he expected (which is actually what he was asking).
Sorry for misunderstanding your post.
    #7  
Old 06-27-2010
JamesForeman
Thanks all, now I have a slightly improved understanding of sed (and perl as well)

Bartus11's second bit of perl gives me almost what I want: it gives me the text between the first instance of 'Anna' and the first 'would' after that. But if I have multiple occurrences of 'Anna' and 'would' in my file, how do I get all of them?

Just to clarify, if the text file was

Anna A would Anna B would Anna C would

then I'd want the output to be
A
B
C

and not
A
AB
B
BC
C

or any similar permutation. Should I just get rid of the first occurrence in the file and then run Bartus11's second script again (and again and again) until I get no more output? Or is there an elegant way to avoid doing that? (Not that it has to be elegant: I'm quite happy with brute force.)