Extract if pattern matches


 
# 1  
Old 10-20-2007
It's very easy: if you want to understand how NR and FNR behave, just print them out!
Code:
awk 'FNR==NR{print "File processing now: " FILENAME "  FNR: "FNR " NR: "NR ;print $0;next}
     { print "File processing now: " FILENAME " NR: " NR " FNR:  " FNR " : "$0 }
' "file1" "file2"
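
To see the same idea on a small scale, here is a minimal sketch. The two sample files and their contents are made up purely for illustration; they are not from the original poster's data. It relies only on the standard awk behaviour that NR keeps counting across all input files while FNR restarts at 1 for each file.

```shell
#!/bin/sh
# Hypothetical sample files, created only for this demonstration.
tmp=$(mktemp -d)
printf 'a\nb\n' > "$tmp/file1"
printf 'c\nd\ne\n' > "$tmp/file2"

# NR counts records across all input files; FNR restarts at 1 for each file.
# FNR==NR is therefore true only while the first file is being read.
out=$(awk 'FNR==NR { print "first file:  NR=" NR " FNR=" FNR " " $0; next }
                   { print "second file: NR=" NR " FNR=" FNR " " $0 }
' "$tmp/file1" "$tmp/file2")

printf '%s\n' "$out"
rm -rf "$tmp"
```

On the second file NR keeps growing (3, 4, 5) while FNR starts again from 1, which is why the FNR==NR test singles out the first file. Note that this trick assumes the first file is not empty.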

# 2  
Old 10-20-2007
Hi GhostDog,

Thanks for that! It really helps me understand the code better.
# 3  
Old 10-21-2007
awk

Hi,

I think this one should be ok for you!

Code:
awk 'BEGIN{flag=0}
{
if ($2=="XXX")
{
	print
	flag=1
}
if ($1=="**" && $2!="XXX")
	flag=0
if (flag==1 && $2=="k=")
	print
}' filename

# 4  
Old 10-22-2007
Hi Summer,

Thanks for your code. But we do not know the value of the 2nd field (XXX here) in advance, so your code can't be applied here.

Hi GhostDog,

I have a little problem here. I have added some more data to my input file.
If the 2nd field of the last occurrence of this pattern "** abc ccc cc cc cc cc 2007" does not start with "XX", then only the output below should be produced (that is, only the very last block that matches the pattern is printed).

Can you help ?

Input:

wwwwww
0999 k= 1
wwwwww
** XXX ccc ccc cc cc ccc 2007
wwwwww
wwwwww
0001 k= 1
wwwwww
0002 k= 1
** abc ccc cc cc cc cc 2007
wwwwww
0001 k= 1
wwwwww
0002 k= 1
wwwwww
wwwwww
0003 k= 1
wwwwww
** XXX ccc ccc cc cc ccc 2007
wwwwww
0003 k= 1
wwwwww
0004 k= 1
0005 k= 1
** abc ccc cc cc cc cc 2007
0001 k= 1
wwwwww
0002 k= 1
0003 k= 1


Output:

** abc ccc cc cc cc cc 2007
0001 k= 1
0002 k= 1
0003 k= 1
# 5  
Old 10-22-2007
Hi, Raynon.

Thanks for emphasizing the role of your notation "XXX". Here is an amended Perl script to account for that:
Code:
#!/usr/bin/perl

# @(#) p3       Demonstrate extraction after pattern match.

use warnings;
use strict;

my ($debug);
$debug = 1;
$debug = 0;

our ($lines) = 0;
my ($key_string);

my (@a);

# Get second field of first line that begins with **, as in:
#
# ** XXX
#
# then use that second field as the key_string. Anytime that
# key_string appears, we begin scanning for "k=" lines, and only
# stopping when another "**" line appears.
#
# So, read until ** XXX, then turn over control to function to
# scan for other pattern.

my ($first) = 1;
while (<>) {
  if ($first) {
    @a = split;
    if ( $a[0] eq "**" ) {
      $key_string = $a[1];
      $first      = 0;
      redo;
    }
  }
  $lines++;
  chomp;
  @a = split;
  if ( $a[0] eq "**" && $a[1] eq $key_string ) {
    print " Found XXX line at $.\n" if $debug;
    print "$_\n";

    # last if not extract_k();
    $_ = extract_k();
    if ( not $_ ) {
      last;
    }
    else {
      print " cycling with line $. ", $_ if $debug;

      # Adjust line count to avoid counting twice.
      $lines--;
      redo;
    }
  }
}

print STDERR " ( Lines read: $lines )\n";

# Extract k= lines until line with "**".

sub extract_k {
  our ($lines);
  my (@a);
  while (<>) {
    $lines++;
    chomp();
    @a = split;
    return "$_\n" if $a[0] eq "**";    # not EOF
    print "$_\n" if /k=/;
  }
  return 0;                            # EOF
}

exit(0);

Running on the new data in file data3:
Code:
% ./p3 data3
** XXX ccc ccc cc cc ccc 2007
0001 k= 1
0002 k= 1
** XXX ccc ccc cc cc ccc 2007
0003 k= 1
0004 k= 1
0005 k= 1
 ( Lines read: 29 )

cheers, drl

( edit 1: corrected line count )

Last edited by drl; 10-22-2007 at 01:39 AM..
# 6  
Old 10-22-2007
Hi GhostDog,

It seems that I am pretty close to my target.
But there's still a constraint. If the line "** abc ccc ccc cc cc ccc 2007" occurs more than 2 times, every block from the 2nd onwards will be output because of these 2 statements:
occur++;
if (occur > 1) print;

Is there any way I could find out the final value of the "occur" variable and make sure that only the last occurrence is printed?

Code:
FNR==NR&&/^\*\*/{line=$2; CODE = substr ($2,1,2); next}

FNR != NR && $0 ~ line {
      print 
      flag=1
     }
     flag == 1 && $0 ~ /^\*\*/ && CODE == "XX"{ 
       if($2 !~ line) flag=0
     }
     flag == 1 && $2 == "k="{print}


FNR != NR && $2 ~ line && CODE != "XX"  {
      flag=2;
      occur++;
      if (occur > 1)  print;
     }
      flag==2 && occur > 1 && $2 == "k=" { print }


Last edited by Raynon; 10-22-2007 at 05:42 AM..
# 7  
Old 10-22-2007
So "XXX" is actually what you want to get?
Code:
awk 'FNR==NR&&/^\*\*/&&$2=="XXX"{line=$2;next}
     FNR!=NR&&$0~line{
      print 
      f=1
     }
     f&&$0~/^\*\*/{ 
       if($2 !~ line) f=0
     }
     f&&$2=="k="{print}
' "file" "file"

output:
Code:
# ./testnew.sh
** XXX ccc ccc cc cc ccc 2007
0001 k= 1
0002 k= 1
** XXX ccc ccc cc cc ccc 2007
0003 k= 1
0004 k= 1
0005 k= 1
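
For the follow-up question about printing only the last occurrence, one possible approach is sketched below. It is an illustration only, checked against nothing but the sample data posted in this thread. It assumes the key is the 2nd field of the last "**" line in the file, as in Raynon's first-pass rule. The function name extract_last_block and the variable names key, total, seen and show are made up for this sketch. The first pass counts how often each key appears on a "**" line; the second pass prints every matching block when the key starts with "XX", and otherwise only the block whose running count equals the total.

```shell
#!/bin/sh
# extract_last_block FILE  (hypothetical helper name, not from the thread)
extract_last_block() {
    awk '
    # Pass 1: remember the 2nd field of the last "**" line and count each key.
    FNR == NR {
        if ($1 == "**") { key = $2; total[$2]++ }
        next
    }
    # Pass 2: decide at every "**" line whether the following block is wanted.
    $1 == "**" {
        show = 0
        if ($2 == key) {
            seen++
            # Keys starting with "XX": print every block; otherwise only the last.
            show = (substr(key, 1, 2) == "XX" || seen == total[key])
        }
        if (show) print
        next
    }
    # Inside a wanted block, print only the "k=" lines.
    show && $2 == "k="
    ' "$1" "$1"
}

# Sample input copied from the thread.
tmp=$(mktemp -d)
cat > "$tmp/data" <<'EOF'
wwwwww
0999 k= 1
wwwwww
** XXX ccc ccc cc cc ccc 2007
wwwwww
wwwwww
0001 k= 1
wwwwww
0002 k= 1
** abc ccc cc cc cc cc 2007
wwwwww
0001 k= 1
wwwwww
0002 k= 1
wwwwww
wwwwww
0003 k= 1
wwwwww
** XXX ccc ccc cc cc ccc 2007
wwwwww
0003 k= 1
wwwwww
0004 k= 1
0005 k= 1
** abc ccc cc cc cc cc 2007
0001 k= 1
wwwwww
0002 k= 1
0003 k= 1
EOF

out=$(extract_last_block "$tmp/data")
printf '%s\n' "$out"
rm -rf "$tmp"
```

Because the total is known before the second pass starts, there is no need to guess the final value of a counter while printing. The file is simply named twice on the awk command line, as in the other two-pass examples in this thread.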
