Script to match strings that are sometimes split across 2 lines


 
# 15  
Old 08-26-2013
Hello Scrutinizer,

Many thanks for your help! I'll practice with substr() in order to manipulate the output.

How many variables does awk support, if I want to include more patterns?

Thanks in advance.
# 16  
Old 08-26-2013
There is no limit in the number of variables...
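
As a rough illustration of passing more patterns, here is a minimal sketch with a third `-v` variable. The name `pat3`, the function name `first_matches`, the toy patterns and the sample input are all made up for this example; they are not the patterns from this thread.

```shell
# Sketch only: pat3, first_matches and the toy patterns are hypothetical.
first_matches() {
  awk -v pat1='ab+' -v pat2='[0-9]+' -v pat3='x.z' '
    BEGIN {
      # Collect the patterns in an array so more can be added easily
      pats[1] = pat1; pats[2] = pat2; pats[3] = pat3; n = 3
    }
    {
      out = ""
      for (i = 1; i <= n; i++)
        if (match($0, pats[i]))          # first match of each pattern
          out = out (out != "" ? OFS : "") substr($0, RSTART, RLENGTH)
      if (out != "") print out
    }'
}

printf '%s\n' 'qqabbb123xyz' | first_matches
```

Adding a fourth pattern would only need one more `-v` assignment and one more array entry.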
# 17  
Old 08-26-2013
Hello Scrutinizer,

Thanks for your help.

I have an issue. My input file is a hexdump with 128 bytes per line and no spaces; I'm attaching it as file_128.txt.

I'm trying to extract the 2 patterns shown in the script. It prints the patterns, but the first pat2 is printed on the next
line. I'm not sure why it sometimes prints "pat1 pat2" correctly on the same line and sometimes prints "pat1 \n pat2".

The lines that show only one pattern are correct: those pat1 matches really have no associated pat2 in the input file,
so that part of the output is as it should be.

This is what I'm getting so far:
Code:
  
$ awk -v pat1="ff44.{6,18}321456.{8}.f(.){16}" -v pat2="038.{32,34}.+84(.){30}" '
   {  b=p $0 ;c=p $0 }
   match(b,pat1) {s=substr(b,RSTART,RLENGTH); sub(pat1,x) }
   match(c,pat2) {s=s (s!=""?OFS:"") substr(c,RSTART,RLENGTH); sub(pat2,z)  }
   s!="" {print s; s=""}
   {p=$0 }' file_128.txt
ff44000001321456022272619f81422060001fffff
03810f01020000000d8147451905ffffff008930010c0000000d8147451905ffffff0101860f010c0000000d81474519559fffff00840e01020102010001ffffff02010201
ff44000002321456014041612f81422060002fffff
ff44000003321456022280546f81422060003fffff 03810f01020000000d8147451905ffffff008930010c0000000d8147451905ffffff0101860f010c0000000d81474519559fffff00840e01020102010001ffffff02010201
ff44000004321456022939276f81422060004fffff 03810f01020000000d8147451905ffffff008930010c0000000d8147451905ffffff0101860f010c0000000d81474519559fffff00840e01020102010001ffffff02010201
ff44000005321456013741169f81422060354fffff 03810f01020000000d8147451905ffffff008930010c0000000d8147451905ffffff0101860f010c0000000d81474519559fffff00840e01020102010001ffffff02010201
ff44000006321456013741255f81422079900fffff

and the correct output should be:
Code:
ff44000001321456022272619f81422060001fffff 03810f01020000000d8147451905ffffff008930010c0000000d8147451905ffffff0101860f010c0000000d81474519559fffff00840e01020102010001ffffff02010201
ff44000002321456014041612f81422060002fffff
ff44000003321456022280546f81422060003fffff 03810f01020000000d8147451905ffffff008930010c0000000d8147451905ffffff0101860f010c0000000d81474519559fffff00840e01020102010001ffffff02010201
ff44000004321456022939276f81422060004fffff 03810f01020000000d8147451905ffffff008930010c0000000d8147451905ffffff0101860f010c0000000d81474519559fffff00840e01020102010001ffffff02010201
ff44000005321456013741169f81422060354fffff 03810f01020000000d8147451905ffffff008930010c0000000d8147451905ffffff0101860f010c0000000d81474519559fffff00840e01020102010001ffffff02010201
ff44000006321456013741255f81422079900fffff

Thanks in advance for any help.
# 18  
Old 08-26-2013
You would need to skip printing the first time, when the buffer contains only one line.

Quick fix:

Code:
$ awk -v pat1="ff44.{6,18}321456.{8}.f(.){16}" -v pat2="038.{32,34}.+84(.){30}" '
   {  b=p $0 ;c=p $0 }
   match(b,pat1) {s=substr(b,RSTART,RLENGTH); sub(pat1,x) }
   match(c,pat2) {s=s (s!=""?OFS:"") substr(c,RSTART,RLENGTH); sub(pat2,x)  }
   NR>1 && s!="" {print s; s=""}
   {p=$0 }' file_128.txt

--edit--
Actually NR>1 should not be skipped...

Last edited by Scrutinizer; 08-26-2013 at 05:29 PM.. Reason: changed to printing is NR>1 instead of matching if NR>1
# 19  
Old 08-26-2013
Hello Scrutinizer,

It works for the first file, but when testing with a slightly bigger file (100 lines of 128 bytes each)
it happens again: sometimes pat1 is printed on one line and its respective pat2 on the next.

You can see this with the new file I'm attaching (file_128_1.txt).

Thanks again.
# 20  
Old 08-26-2013
Come to think of it, I think the output is correct as it is and line 1 should not be skipped. There is no extra newline being printed; that is simply the way the output is defined...

So I am not sure what you are looking for. Something like this (every pattern 1 on a new line)?
Code:
awk -v pat1="ff44.{6,18}321456.{8}.f(.){16}" -v pat2="038.{32,34}.+84(.){30}" '
  {
    b=p $0
    c=p $0 
  }
  match(b,pat1) {
    if(s) print s
    s=substr(b,RSTART,RLENGTH)
    sub(pat1,x)
  }
  match(c,pat2) {
    s=s (s!=""?OFS:"") substr(c,RSTART,RLENGTH)
    sub(pat2,x)
  }
  {
    p=$0
  }
  END{
    if(s)print s
  }
' file_128_1.txt

Can you be more specific?

Last edited by Scrutinizer; 08-26-2013 at 05:50 PM..
# 21  
Old 08-26-2013
Hello Scrutinizer,

Pat1 will always appear, but the pair pat1 and pat2 may not always occur.

So, when pat2 exists, it belongs to the previous pat1 and should be printed on the same line. If the script finds 2 consecutive pat1 matches, it means pat2 doesn't
exist for that pat1.

So my goal is just to print each pat1 and pat2 pair on the same line.
Hopefully the script can handle an input file of 4GB.

Many thanks for your help so far.

Regards
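
The pairing rule described above can be sketched roughly as follows. This is only an illustration on toy data, not the solution from this thread: `A` plus two digits stands in for pat1, `B` plus two digits for pat2, and the names `pair_patterns` and `keep` are made up. It assumes fixed-length matches, so `keep` is the longest match length minus 1. With variable-length patterns such as the `.+` in the real pat2, a match found in the buffer could differ from a match over the whole stream.

```shell
# Sketch only: toy patterns, hypothetical names (pair_patterns, keep).
pair_patterns() {
  awk -v pat1='A[0-9][0-9]' -v pat2='B[0-9][0-9]' -v keep=2 '
    BEGIN { re = "(" pat1 ")|(" pat2 ")" }
    {
      buf = buf $0                       # carry-over tail + current line
      while (match(buf, re)) {
        tok = substr(buf, RSTART, RLENGTH)
        buf = substr(buf, RSTART + RLENGTH)
        if (tok ~ ("^(" pat1 ")$")) {    # a new pat1 closes the previous record
          if (s != "") print s
          s = tok
        } else if (s != "") {            # pat2 belongs to the previous pat1
          s = s OFS tok
        }
      }
      # keep only a short tail that could hold the start of a split match
      if (length(buf) > keep) buf = substr(buf, length(buf) - keep + 1)
    }
    END { if (s != "") print s }'
}

# The stream xxA01yyB01zzA02qqA03wwB03 broken across three lines
printf '%s\n' 'xxA01yyB0' '1zzA02qqA' '03wwB03' | pair_patterns
```

Because only a short tail is carried from one line to the next, memory use stays small, which matters for an input file of 4GB.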
