Match pattern only between certain lines in entire file

# 1  
Hello, I have input that looks like this:
Code:
          * 0 -1 103 0 0 m. 7 LineNr 23 ClauseNr 1: 1: 1: 304: 0 0 SentenceNr 13 TxtType: Q Pargr: 2.1 ClType:MSyn
 PS004,006 ZBX=                0   1  1  0  7 -1 -1    3  2  3  2    -1   1   1  -1      -1      -1      -1    0  501     0
 PS004,006 ZBX                 0   2 -1 -1 -1  5 -1   -1 -1  3  2     1   2   0  -1       2      -1      -1   -1   -1    -1
 PS004,006 YDQ                 0   2 -1 -1 -1  1 -1   -1 -1  1  2     2   2   2   1  -10002      -1      -1    0  503     0
           * 0 -3 200 1 201 0 0 .. 5 LineNr 24 ClauseNr 1: 1: 2: 103: 0 0 SentenceNr 14 TxtType: Q Pargr: 2.1 ClType:ZIm0
 PS004,006 W                   0   6 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   6   6  -1      -1      -1      -1    0  509     0
 PS004,006 BVX                 0   1  1  0  7 -1 -1    3  2  3  2    -1   1   1  -1      -1      -1      -1    0  501     0
 PS004,006 >L                  0   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   5   0  -1      -1      -1      -1   -1   -1    -1
 PS004,006 JHWH                0   3 -1 -1 -1  1 -1   -1 -1  1  2     2   3   5   2      -1      -1      -1    0  504     0
           * 0 -1 201 0 0 .. 6 LineNr 25 ClauseNr 1: 1: 3: 153: 0 0 SentenceNr 15 TxtType: Q Pargr: 2.1 ClType:WIm0
 PS004,007 RB                  0  13 -1 -1 -1  4 -1   -1 -1  3  2     2   2   2   1      -1      -1      -1    0  502     0
 PS004,007 >MR                -1   1  0  0  1  4 -1    6  0  3  2     2   1   1  -1      -1      -1      -1    0  521     0
           * 0 -18 163 1 999 2 136 0 0 .# 2 LineNr 26 ClauseNr 1: 1: 2: 106: 0 0 SentenceNr 16 TxtType: Q Pargr: 2.2 ClType:Ptcp
 PS004,007 MJ                  0   9 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   9   9   1      -1      -1      -1    0  502     0
 PS004,007 R>H                 0   1  2  2  1 -1 -1    1  3  1  2    -1   1   1  -1      -1      -1      -1    0  501     0
 PS004,007 NW                 -1   7 -1 -1 -1 -1 -1   -1  1  3 -1    -1   7   7   2      -1      -1      -1    0  503     0
 PS004,007 VWB                 0  13 -1 -1 -1  1 -1   -1 -1  1  0     2   2   2   1      -1      -1      -1    0  503     0
           * 0 -1 999 0 0 .q 4 LineNr 27 ClauseNr 1: 1: 4: 121: 0 0 SentenceNr 17 TxtType: QQ Pargr: 2.2.1 ClType:XYqt

I would like to use awk, sed, or grep to match a regex and print not only the line that contains the match, but also the lines before and after it, up to the nearest line in each direction that begins with a certain character.

For example, in the input above, if I match the pattern "BVX" in field 2, the output should include that line plus everything between the nearest lines beginning with "*" before and after it, including those two "*" lines.

Thus the desired output would be:
Code:
           * 0 -3 200 1 201 0 0 .. 5 LineNr 24 ClauseNr 1: 1: 2: 103: 0 0 SentenceNr 14 TxtType: Q Pargr: 2.1 ClType:ZIm0
 PS004,006 W                   0   6 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   6   6  -1      -1      -1      -1    0  509     0
 PS004,006 BVX                 0   1  1  0  7 -1 -1    3  2  3  2    -1   1   1  -1      -1      -1      -1    0  501     0
 PS004,006 >L                  0   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   5   0  -1      -1      -1      -1   -1   -1    -1
 PS004,006 JHWH                0   3 -1 -1 -1  1 -1   -1 -1  1  2     2   3   5   2      -1      -1      -1    0  504     0
           * 0 -1 201 0 0 .. 6 LineNr 25 ClauseNr 1: 1: 3: 153: 0 0 SentenceNr 15 TxtType: Q Pargr: 2.1 ClType:WIm0

This is a very long file in which a given pattern (such as "BVX" in the example) can occur many times. For each match of "BVX", I would like to print the matching line, the lines before it back to the nearest line matching /^\*/, and the lines after it up to the next line matching /^\*/.

I have attempted combinations of grep and sed, but to no avail, e.g.
Code:
grep -C5 "BVX" input | sed -n '/\*/,/\*/p'

Thank you so much in advance.
# 2  
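How about
Code:
awk '
                {BUF = BUF ORS $0       # append every line to the buffer of the current block
                }
$2 == "BVX"     {PRT = 1                # field 2 matches: mark the current block for printing
                }
/^ *\*/         {if (PRT) print BUF     # a * line closes the block: print it if it was marked
                 BUF = $0               # the same * line also opens the next block
                 PRT = ""               # clear the flag for the next block
                }
' file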
           * 0 -3 200 1 201 0 0 .. 5 LineNr 24 ClauseNr 1: 1: 2: 103: 0 0 SentenceNr 14 TxtType: Q Pargr: 2.1 ClType:ZIm0
 PS004,006 W                   0   6 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   6   6  -1      -1      -1      -1    0  509     0
 PS004,006 BVX                 0   1  1  0  7 -1 -1    3  2  3  2    -1   1   1  -1      -1      -1      -1    0  501     0
 PS004,006 >L                  0   5 -1 -1 -1 -1 -1   -1 -1 -1 -1    -1   5   0  -1      -1      -1      -1   -1   -1    -1
 PS004,006 JHWH                0   3 -1 -1 -1  1 -1   -1 -1  1  2     2   3   5   2      -1      -1      -1    0  504     0
           * 0 -1 201 0 0 .. 6 LineNr 25 ClauseNr 1: 1: 3: 153: 0 0 SentenceNr 15 TxtType: Q Pargr: 2.1 ClType:WIm0
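
For anyone adapting this, here is a slightly more general sketch of the same buffering idea. It is not part of the original answer. It assumes you want the pattern passed in with -v rather than hard-coded, and that a final block with no closing "*" line should still be printed, hence the END rule. The names pat, buf, and found, and the shortened sample.txt, are made up for illustration.

```shell
# Shortened, hypothetical sample standing in for the real input file.
cat > sample.txt <<'EOF'
   * 0 -1 103 LineNr 23
 PS004,006 ZBX 0 1
   * 0 -3 200 LineNr 24
 PS004,006 W 0 6
 PS004,006 BVX 0 1
   * 0 -1 201 LineNr 25
 PS004,007 RB 0 13
 PS004,007 BVX 0 2
EOF

out=$(awk -v pat="BVX" '
    # Append each line to the buffer of the current block
    # (no leading separator when the buffer is still empty).
    { buf = (buf == "" ? $0 : buf ORS $0) }

    # Remember that the current block contains a match in field 2.
    $2 == pat { found = 1 }

    # A line starting with optional blanks and * ends the block.
    /^ *\*/ {
        if (found) print buf   # emit the block only if it matched
        buf = $0               # this * line also starts the next block
        found = 0
    }

    # Flush a trailing block that never saw a closing * line.
    END { if (found) print buf }
' sample.txt)

printf '%s\n' "$out"
```

One thing to be aware of: when two neighbouring blocks both match, the "*" line they share is printed twice, once as the end of the first block and once as the start of the second. The original script behaves the same way. If that matters, piping the result through uniq should collapse those adjacent duplicates, at the cost of also collapsing any genuinely repeated adjacent data lines.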


Last edited by RudiC; 06-20-2018 at 04:07 PM.. Reason: Removed surplus DL = "".
# 3  
Works like a charm, RudiC! Thank you so much! Now I have to go and figure out how it works.