Extract words before and after a pattern/regexp


 
# 1  
Old 03-21-2008

I couldn't find much help on the kind of question I have here.
I have a text file with the following contents:

Line one has a bingo
Line two does not have a bingo but it has a tango
Bingo is on line three
Line four has both tango and bingo

I want to search for the pattern "bingo" in this file and extract the words immediately preceding and following "bingo" on each matching line.

I tried the Perl script below, but my output file is empty.

#! /usr/bin/perl

open (INFILE, "text1.txt");
open (OUTFILE,">outtext1.txt");

while (<INFILE>)
{
if (s/\w*(\w{1})bingo(\w{1})\w*/\1\2/) {
print OUTFILE;
}

}
close (INFILE);
close (OUTFILE);

With this expression, I expect at least the 2nd line to match and the output file to contain "a but" (from 'a bingo but'). But I get an empty output file.

Can someone please point out how to accomplish this?

Thanks!
# 2  
Old 03-21-2008
Hi,

I am not sure whether I have understood your requirement correctly. Just try the awk script below.

Code:
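nawk '{
  # Start at field 2 and stop before the last field, so that both
  # neighbors exist (and $(i-1) never becomes $0, the whole line).
  for (i = 2; i < NF; i++)
    if ($i == "bingo")
      print $(i-1), $(i+1)
}' filename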
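
To try the field-based approach against the sample text from the first post, something like the following sketch should work. It assumes plain `awk` behaves like `nawk` for this script; the file name `text1.txt` is borrowed from post #1, and the function name `neighbors` is made up for illustration.

```shell
#!/bin/sh
# Recreate the sample input from post #1 (file name taken from that post).
cat > text1.txt <<'EOF'
Line one has a bingo
Line two does not have a bingo but it has a tango
Bingo is on line three
Line four has both tango and bingo
EOF

# neighbors WORD FILE: print the fields on either side of each exact,
# case-sensitive occurrence of WORD. (Hypothetical helper name.)
neighbors() {
  awk -v w="$1" '{
    # Start at 2 and stop before NF so both neighbors always exist.
    for (i = 2; i < NF; i++)
      if ($i == w)
        print $(i-1), $(i+1)
  }' "$2"
}

neighbors bingo text1.txt
```

Only the second line has a lowercase "bingo" with a word on both sides, so that is the only line that produces output. "Bingo" on line three is skipped because the comparison is case-sensitive.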

# 3  
Old 03-21-2008
Thanks Summer cherry! You got my question and nawk does the work, but can't we get this done by writing a simple regular expression?

Thanks again!
# 4  
Old 03-21-2008
Hi.

Adjusting the RE in script file p1 to account for whitespace on either side of bingo, etc.:
Code:
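#!/usr/bin/perl

use warnings;
use strict;

open( my $in, '<', "data1" ) || die "Can't open input file: $!\n";

while (<$in>) {
  # \b stops the greedy .* from swallowing all but the last letter of
  # the preceding word; \s+ requires bingo to stand alone as a word.
  if (s/.*\b(\w+)\s+bingo\s+(\w+).*/$1 $2/) {
    print;
  }
}

close($in);
exit(0);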

Producing:
Code:
% ./p1
a but

See man perlre for details ... cheers, drl