find pattern matches in consecutive lines in certain fields-awk


 
# 1  
Old 08-09-2018

I have a text file with many thousands of lines, a small sample of which looks like this:

InputFile:
Code:
PS002,003 D                  -1   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   6   6  -1      -1      -1      -1    0  509     0
PS002,003 PSQ                 0   1  7 18  1  0 -1    1  1  3 -1    -1   1   1  -1      -1      -1      -1    0  501     0
PS002,003 XNQ                 0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS002,003 HWN=                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   2   2      -1      -1      -1    0  503     0
           * 0 -1 512 1 411 0 0 .q 4 LineNr 5 ClauseNr 1: 1: 3: 131: 0 0 SentenceNr 3 TxtType: ?Q      Pargr: 12 ClType:xYq0
           * 0 -2 111 1 411 0 0 .. 3 LineNr 10 ClauseNr 1: 1: 4: 131: 0 0 SentenceNr 6 TxtType: ?       Pargr: 1 ClType:xYq0
PS002,005 W                   0   6 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   6   6  -1      -1      -1      -1    0  509     0
PS002,005 B                   0   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   5   0  -1      -1      -1      -1   -1   -1    -1
PS002,005 XM>                 0   2 -1 -1 -1 11 -1   -1 -1  1  1     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS002,005 H                   0   7 -1 -1 -1 -1 -1   -1  3  1  2    -1   7   5   2      -1      -1      -1    0  505     0
PS002,005 DLX                 0   1  5 18  1  0 -1    1  3  1  2    -1   1   1  -1      -1      -1      -1    0  501     0
PS002,005 >NWN                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   7   2      -1      -1      -1    0  503     0
PS012,004 >BD                 0   1  5 15  1  0 -1    1  3  1  2    -1   1   1  -1      -1      -1      -1    0  501     0
PS012,004 MRJ>                0   3 -1 -1 -1  1 -1   -1 -1  0  0     2   3   3   2      -1      -1      -1    0  502     0
PS012,004 KL                  0   2 -1 -1 -1  1 -1   -1 -1  0  0     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS012,004 HJN                 0   7 -1 -1 -1 -1 -1   -1  3  3  1    -1   7   2   2      -1      -1      -1    0  503     0
PS012,004 SP>                 0   2 -1 -1 -1 12 -1   -1 -1  3  1     3   2   0  -1      -1      -1      -1   -1   -1    -1
PS012,004 PLG                 0   1  6 18  1 12 -1   62 -1  3  1     3  13  -2   2      -1      -1      -1  -11  500     0

What I would like to do is this: if a given line meets the conditions $16=="0" && $22=="-1" and the immediately following line meets $22=="503" && $4=="7" && $16=="2", then print both lines, and do so for every such pair of consecutive lines.

Desired Output:
Code:
PS002,003 XNQ                 0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS002,003 HWN=                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   2   2      -1      -1      -1    0  503     0
PS012,004 KL                  0   2 -1 -1 -1  1 -1   -1 -1  0  0     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS012,004 HJN                 0   7 -1 -1 -1 -1 -1   -1  3  3  1    -1   7   2   2      -1      -1      -1    0  503     0

Thus far I have tried various revisions of the following awk code which has gotten me fairly close:

Code:
awk '$16=="0" && $22=="-1"{f=$0; f++; next} $22=="503" && $4=="7"{n=$0} {print f"\n"n}' InputFile

Nevertheless, I still cannot figure out how to get this to work. I would very much appreciate any help getting this one-liner to work as desired. Thanks!
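
One likely reason the attempt above misbehaves, offered as a sketch rather than a full diagnosis: `f=$0; f++` stores the line in `f` and then increments it. The increment forces `f` into a number, and a line that does not start with a digit converts to 0, so the saved text is lost and `f` ends up as 1. In addition, the final `{print f"\n"n}` block has no pattern, so it runs for every line that was not skipped by `next`. The shortened sample line and the variable name `result` below are only for illustration.

```shell
# Feed one shortened sample line through the same save-then-increment step.
result=$(printf '%s\n' 'PS002,003 XNQ 0 2' | awk '{f=$0; f++; print f}')

# The saved line is gone; only the incremented number remains.
printf '%s\n' "$result"
```

Dropping the `f++` keeps the saved line intact, which is what the replies below do.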

Last edited by jvoot; 08-09-2018 at 08:23 PM.. Reason: Left off one condition to arrive at desired output.
# 2  
Old 08-09-2018
This doesn't produce exactly your desired output, but it is something to start with:
Code:
awk '$16==0 && $22==-1 {l=$0;next} l && $22==503 && $4==7 {print l ORS $0;l=""}' myFile

produces:
Code:
PS002,003 XNQ                 0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS002,003 HWN=                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   2   2      -1      -1      -1    0  503     0
PS002,005 XM>                 0   2 -1 -1 -1 11 -1   -1 -1  1  1     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS002,005 >NWN                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   7   2      -1      -1      -1    0  503     0
PS012,004 KL                  0   2 -1 -1 -1  1 -1   -1 -1  0  0     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS012,004 HJN                 0   7 -1 -1 -1 -1 -1   -1  3  3  1    -1   7   2   2      -1      -1      -1    0  503     0
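
The middle pair in this output (XM> and >NWN) is not made of adjacent input lines: `l` stays set until a later line satisfies the second pattern. A quick way to see this is to print record numbers instead of the lines. The following is a minimal sketch using six lines copied from the sample input; the temporary file and the names `n` and `gaps` are assumptions made for the demonstration.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
PS002,003 XNQ                 0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS002,003 HWN=                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   2   2      -1      -1      -1    0  503     0
PS002,005 XM>                 0   2 -1 -1 -1 11 -1   -1 -1  1  1     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS002,005 H                   0   7 -1 -1 -1 -1 -1   -1  3  1  2    -1   7   5   2      -1      -1      -1    0  505     0
PS002,005 DLX                 0   1  5 18  1  0 -1    1  3  1  2    -1   1   1  -1      -1      -1      -1    0  501     0
PS002,005 >NWN                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   7   2      -1      -1      -1    0  503     0
EOF

# Same two patterns as the one-liner above, but print the record numbers
# (NR) of each reported pair so that gaps between the lines become visible.
gaps=$(awk '$16==0 && $22==-1 {l=$0; n=NR; next}
            l && $22==503 && $4==7 {print n "-" NR; l=""}' "$sample")
printf '%s\n' "$gaps"
rm -f "$sample"
```

On these six lines the pairs reported are 1-2 and 3-6; the second one spans two unrelated lines, so it is not a consecutive match.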


Last edited by vgersh99; 08-09-2018 at 08:42 PM..
# 3  
Old 08-09-2018
Thank you so much vgersh99. Indeed, in describing my desired output I left out one condition (viz., $16=="2"). I have since fixed my original post. Thanks for pointing that out to me and for the help with the code.

A slight adjustment to the code you offered produces the desired output shown in the original post.

Code:
awk '$16==0 && $22==-1 {l=$0;next} l && $22==503 && $4==7 && $16=="2"{print l ORS $0;l=""}' InputFile


Last edited by jvoot; 08-09-2018 at 08:26 PM..
# 4  
Old 08-10-2018
I always prefer to have a state variable and a store variable.
Code:
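awk '
# Second condition: only tested when the previous line met the first one.
met==1 && $22=="503" && $4=="7" && $16=="2" {print save; print }
# Reset the state on every line so that only adjacent lines can pair up.
{ met=0 }
# First condition: remember this line and flag it for the next record.
$16=="0" && $22=="-1" { save=$0; met=1 }
' InputFile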

The { met=0 } clears the state on every line, so a match is only looked for in the line immediately following the one that met the first condition.
Testing the second condition before the first one saves a next statement.
# 5  
Old 08-10-2018
What behavior do you want with the following input file?
Code:
PS012,004 SP>                 0   2 -1 -1 -1 12 -1   -1 -1  3  1     3   2   0  -1      -1      -1      -1   -1   -1    -1
PS012,004 PLG                 0   1  6 18  1 12 -1   62 -1  3  1     3  13  -2   2      -1      -1      -1  -11  500     0
PS012,004 HJN                 0   7 -1 -1 -1 -1 -1   -1  3  3  1    -1   7   2   2      -1      -1      -1    0  503     0

# 6  
Old 08-10-2018
In this case @MadeInGermany, I would not want any output to be generated given your proposed input file.
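
This three-line input is exactly where the two approaches differ. As a hedged check, the sketch below runs the one-liner from post #3 and the state-variable script from post #4 on it; the temporary file and the names `carry`, `strict` and `count` are made up for the comparison.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
PS012,004 SP>                 0   2 -1 -1 -1 12 -1   -1 -1  3  1     3   2   0  -1      -1      -1      -1   -1   -1    -1
PS012,004 PLG                 0   1  6 18  1 12 -1   62 -1  3  1     3  13  -2   2      -1      -1      -1  -11  500     0
PS012,004 HJN                 0   7 -1 -1 -1 -1 -1   -1  3  3  1    -1   7   2   2      -1      -1      -1    0  503     0
EOF

# Post #3: l is never cleared, so it survives the non-matching PLG line.
carry=$(awk '$16==0 && $22==-1 {l=$0;next} l && $22==503 && $4==7 && $16=="2"{print l ORS $0;l=""}' "$sample")

# Post #4: met is reset on every line, so only adjacent lines can pair up.
strict=$(awk '
met==1 && $22=="503" && $4=="7" && $16=="2" {print save; print }
{ met=0 }
$16=="0" && $22=="-1" { save=$0; met=1 }
' "$sample")

# Helper (illustrative): count the non-empty lines in a captured result.
count() { printf '%s\n' "$1" | awk 'NF {c++} END {print c+0}'; }

printf 'post #3 prints %s line(s), post #4 prints %s line(s)\n' \
    "$(count "$carry")" "$(count "$strict")"
rm -f "$sample"
```

The post #3 version reports SP> together with HJN even though PLG sits between them, while the post #4 version prints nothing, which is the behavior requested here.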

Maybe it would be helpful if I gave some more sample data from my input.

InputSample
Code:
        * 0 -4 110 1 511 0 0 .. 4 LineNr 11 ClauseNr 1: 1: 3: 111: 0 0 SentenceNr 5 TxtType: Q       Pargr: 1 ClType:ZYqX
PS016,004 D                  -1   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   6   6  -1      -1      -1      -1    0  509     0
PS016,004 L>                  0  11 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1  11  11  -1      -1      -1      -1    0  510     0
PS016,004 NQJ                 0   1  4 18  1  0 -1    1  1  1 -1    -1   1   1  -1      -1      -1      -1    0  501     0
PS016,004 NWQJ                0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS016,004 HWN=                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   2   2      -1      -1      -1    0  503     0
PS016,004 MN                  0   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   5   0  -1      -1      -1      -1   -1   -1    -1
PS016,004 DM                  0   2 -1 -1 -1  1 -1   -1 -1  0  2     3   2   5   2      -1      -1      -1    0  505     0           
        * 0 -1 620 0 0 .. 10 LineNr 14 ClauseNr 1: 1: 4: 132: 0 0 SentenceNr 12 TxtType: Q       Pargr: 1 ClType:xQt0
PS017,005 SMK                 0   1  0 18 11  0 -1    2  2  1  2    -1   1   1  -1      -1      -1      -1    0  501     0
PS017,005 HLK>                0   2 -1 -1 -1 12 -1   -1 -1  3  1     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS017,005 J                   0   7 -1 -1 -1 -1 -1   -1  1  1 -1    -1   7   2   2      -1      -1      -1    0  503     0
PS017,005 B                   0   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   5   0  -1      -1      -1      -1   -1   -1    -1
PS017,005 CBJL                0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS017,005 K                   0   7 -1 -1 -1 -1 -1   -1  2  1  2    -1   7   5   2      -1      -1      -1    0  504     0
        * 0 -3 122 1 11 0 0 .. 8 LineNr 15 ClauseNr 1: 1: 3: 102: 0 0 SentenceNr 13 TxtType: Q       Pargr: 1 ClType:ZQt0

Desired Output
Code:
PS016,004 NWQJ                0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS016,004 HWN=                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   2   2      -1      -1      -1    0  503     0
PS017,005 HLK>                0   2 -1 -1 -1 12 -1   -1 -1  3  1     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS017,005 J                   0   7 -1 -1 -1 -1 -1   -1  1  1 -1    -1   7   2   2      -1      -1      -1    0  503     0

Thus, when there is a line that meets the conditions:
Code:
$16=="0" && $22=="-1"

Check the immediately following line to see if it has the conditions:
Code:
$4=="7" && $16=="2" && $22=="503"

If both of these are met, then print both lines to output; otherwise do nothing. Ideally, I would like to use the code help I receive here as a kind of template, varying the conditions on the various fields to extract a wide range of data patterns. I tried the code that you offered, and while I have not checked its accuracy in detail because of the size of the output, at first glance it seems to have worked perfectly.
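
One way to turn this into a reusable template, sketched under the assumption that only the two field tests ever change: put each test in its own awk function and leave the pairing logic alone. The function names `first` and `second`, the variables `met`, `save` and `pairs`, and the temporary file are illustrative choices, not something taken from the thread. The five data lines are copied from the sample above.

```shell
sample=$(mktemp)
cat > "$sample" <<'EOF'
PS016,004 NQJ                 0   1  4 18  1  0 -1    1  1  1 -1    -1   1   1  -1      -1      -1      -1    0  501     0
PS016,004 NWQJ                0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1      -1      -1      -1   -1   -1    -1
PS016,004 HWN=                0   7 -1 -1 -1 -1 -1   -1  3  3  2    -1   7   2   2      -1      -1      -1    0  503     0
PS016,004 MN                  0   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   5   0  -1      -1      -1      -1   -1   -1    -1
PS016,004 DM                  0   2 -1 -1 -1  1 -1   -1 -1  0  2     3   2   5   2      -1      -1      -1    0  505     0
EOF

pairs=$(awk '
# Edit only these two functions to search for a different pattern pair.
function first()  { return $16=="0" && $22=="-1" }
function second() { return $4=="7" && $16=="2" && $22=="503" }

# The previous line met first() and this one meets second(): print both.
met && second() { print save; print }

# Record whether the current line meets first(), for use on the next line.
{ met = first(); if (met) save = $0 }
' "$sample")

printf '%s\n' "$pairs"
rm -f "$sample"
```

For these five lines only the NWQJ and HWN= lines are printed. The MN line also meets the first condition, but the DM line after it fails the second, so that pair is dropped.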