Variable as input to awk command

12-03-2011


Hi Gurus,

I need a suggestion, please help. I have an input file as below:
abc.txt:

Code:

*
xxxx:              00000
xxxxx:              00000
xxxx:              RANDOM
xxx:              RANDOM
**************************xxxxxxx***
*        abc
******************************
abc:
abc:              6213000
abx:            89234010
abc:             01179
******************************
*        acbxyz
******************************
Kitnb/ICCID1/ICCID2/IMSI1/IMSI2/MSISDN1/MSISDN2/VOUCHER1/VOUCHER2
0117943621,89234010001179436212,,621300020985821,,2347064000500,,,
0117943622,89234010001179436220,,621300020985822,,2347062347000,,,

I am using a perl script as below :

Code:

#!/usr/bin/perl -w

$test1 = `awk '/Kitnb\\/ICCID1\\/ICCID2/{f=1;next}f' awktext.txt`;
print $test2;
$test2 = `awk 'BEGIN { count=0;}  { if(/^([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,\$/) count++; else print "Unmatching line" ; print } END { print "Number of Lines = ",count;}' `;
print $test2;

The input file above (abc.txt) is only an example; the original file will have up to 5 million records after the line: Kitnb/ICCID1/ICCID2/IMSI1/IMSI2/MSISDN1/MSISDN2/VOUCHER1/VOUCHER2

Initially we had a Perl script that loaded the file into an array and ran a foreach over every line after the line below in the input file (abc.txt), validating each one with an appropriate regex like the one in the above Perl script: Kitnb/ICCID1/ICCID2/IMSI1/IMSI2/MSISDN1/MSISDN2/VOUCHER1/VOUCHER2
We really faced performance issues with that approach. So I was advised to use awk to improve performance (though I see that it actually starts a new shell); please suggest if you think otherwise. I have to use a Perl script because of a few rules in my organization.

Now, please suggest how, in the above script, I can use the variable $test1 as input for the awk command, i.e. $test2 = `awk 'BEGIN { count=0;} { if(/^([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,\$/) count++; else print "Unmatching line" ; print } END { print "Number of Lines = ",count;}' abc.txt`

I want to run the regex only on the lines after the line below (and not before it): Kitnb/ICCID1/ICCID2/IMSI1/IMSI2/MSISDN1/MSISDN2/VOUCHER1/VOUCHER2

Running awk as above would also process that line and everything before it.

In short, how can I use a variable ($test1) as input to the awk command, validate it against the regex, and store the result in $test2?

If you have any other suggestion apart from the above, kindly let me know.

Thank you

Last edited by jim mcnamara; 12-04-2011 at 09:31 AM.. Reason: code tags please

arunshankar.c


12-04-2011


I think your Perl script should be like this (note the print $test1; line).

Code:

#!/usr/bin/perl -w

$test1 = `awk '/Kitnb\\/ICCID1\\/ICCID2/{f=1;next}f' awktext.txt`;
print $test1;
$test2 = `awk 'BEGIN { count=0;}  { if(/^([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,\$/) count++; else print "Unmatching line" ; print } END { print "Number of Lines = ",count;}' `;
print $test2;

Quote:

In the code you've provided, awk is run twice over the same data, and that is likely why it is slow. Instead, you could use a single awk command that parses the data and counts the lines, starting from the line that matches the following:

Code:

Kitnb/ICCID1/ICCID2/IMSI1/IMSI2/MSISDN1/MSISDN2/VOUCHER1/VOUCHER2

I think this is what you want to achieve? If so, you could use either awk or Perl to accomplish it.

Code:

awk --re-interval 'BEGIN { count=0; found=0 } { if(/Kitnb\/ICCID1\/ICCID2/) { found=1; next } if(/^([0-9]*\,){8}([0-9]*)$/ && found) { count++; } else if(! /^([0-9]*\,){8}([0-9]*)$/ && found) { print "Unmatching line"; print } } END { print "number of lines = " count } ' data.txt

Quote:

In short, how can I validate for Regex using a variable ($test1) as input to awk command and store the same in $test2.

To answer your original question and use the original code given, you could change it this way

Code:

#!/usr/bin/perl -w

$test1 = `awk '/Kitnb\\/ICCID1\\/ICCID2/{f=1;next}f' awktext.txt`;
print $test1;
$test2 = ` echo "$test1" | awk 'BEGIN { count=0;}  {  if(/^([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,([0-9])*,\$/)  count++; else print "Unmatching line" ; print } END { print "Number of  Lines = ",count;}' `;
print $test2;
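
For comparison, here is a rough plain-shell sketch of the same idea: the records end up in a variable, and that variable is then handed to awk on standard input. It is only an illustration under some assumptions: rows_after_header and count_valid are made-up helper names, the sample file is a cut-down copy of the one in the question, and unlike the original script it prints only the lines that fail the check.

```shell
#!/bin/sh
# Sketch only: a shell stand-in for the Perl backticks above.
# rows_after_header and count_valid are hypothetical helper names.

# Print every line that follows the header line of file $1.
rows_after_header() {
    awk '/^Kitnb\/ICCID1\/ICCID2/ { f = 1; next } f' "$1"
}

# Read records on stdin; print non-matching lines and the count of valid ones.
count_valid() {
    awk '
        /^[0-9]*,[0-9]*,[0-9]*,[0-9]*,[0-9]*,[0-9]*,[0-9]*,[0-9]*,$/ { count++; next }
        { print "Unmatching line"; print }
        END { print "Number of Lines = " (count + 0) }
    '
}

# Small sample file, cut down from the one in the question.
sample=$(mktemp)
cat > "$sample" <<'EOF'
abc:              6213000
Kitnb/ICCID1/ICCID2/IMSI1/IMSI2/MSISDN1/MSISDN2/VOUCHER1/VOUCHER2
0117943621,89234010001179436212,,621300020985821,,2347064000500,,,
0117943622,89234010001179436220,,621300020985822,,2347062347000,,,
EOF

test1=$(rows_after_header "$sample")            # the variable holds the records
test2=$(printf '%s\n' "$test1" | count_valid)   # the variable becomes awk's stdin
printf '%s\n' "$test2"
rm -f "$sample"
```

With millions of records, keeping everything in a variable costs memory, so piping the first awk straight into the second is probably the lighter option when the intermediate text is not needed elsewhere.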

Last edited by MR.bean; 12-05-2011 at 12:05 AM..

MR.bean


12-05-2011


For the suggestion below, I got the following error.

Code:

awk --re-interval 'BEGIN { count=0; found=0 } { if(/Kitnb\/ICCID1\/ICCID2/) { found=1; next } if(/^([0-9]*\,){8}([0-9]*)$/ && found) { count++; } else
 if(! /^([0-9]*\,){8}([0-9]*)$/ && found) { print "Unmatching line"; print } } END { print "number of lines = " count } ' data.txt

=> perl Regex.pl

Code:

 Usage: awk [-F fs][-v Assignment][-f Progfile|Program][Assignment|File] ...

So, I modified the script as below (implementing the suggestion of using the variable) and it worked fine:

Code:

$test1 = `awk '/Kitnb\\/ICCID1\\/ICCID2/{f=1;next}f' 10k.txt | awk 'BEGIN { count=0;} { if('"$regex"') count++; else print "Correct the output file, The line is :" ; print ;} END {} '`;

Thank you MR.bean

Thanks,
Arun

---------- Post updated at 04:14 PM ---------- Previous update was at 04:13 PM ----------

BTW, could you please suggest why I got the error below:

Code:

awk --re-interval 'BEGIN { count=0; found=0 } { if(/Kitnb\/ICCID1\/ICCID2/) { found=1; next } if(/^([0-9]*\,){8}([0-9]*)$/ && found) { count++; } else
 if(! /^([0-9]*\,){8}([0-9]*)$/ && found) { print "Unmatching line"; print } } END { print "number of lines = " count } ' data.txt

=> perl Regex.pl

Code:

 Usage: awk [-F fs][-v Assignment][-f Progfile|Program][Assignment|File] ...

Thanks,
Arun

Last edited by Scott; 12-05-2011 at 06:48 AM.. Reason: Code tags, please...

arunshankar.c


12-05-2011


It may be because your awk doesn't support the --re-interval option.
Can you post the output of

Code:

awk --version

If you are on Solaris, try to run it with 'nawk'.
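
If the {8} repetition turns out to be the problem, one way around it is to spell the repetition out by hand, so that neither the --re-interval option nor interval support is needed. Below is a rough shell sketch of that idea, still in a single pass; validate_portable is a made-up name, and the pattern is the same ([0-9]*,){8}[0-9]* shape as in the earlier suggestion, built as a string in BEGIN.

```shell
#!/bin/sh
# Sketch only: validate_portable is a hypothetical name.
validate_portable() {
    awk '
        BEGIN {
            # Spell out ([0-9]*,){8}[0-9]* by hand, so no interval support is needed.
            re = "^"
            for (i = 1; i <= 8; i++) re = re "[0-9]*,"
            re = re "[0-9]*$"
        }
        # Skip everything up to and including the header line.
        !found { if ($0 ~ /^Kitnb\/ICCID1\/ICCID2/) found = 1; next }
        $0 ~ re { count++; next }
        { print "Unmatching line"; print }
        END { print "number of lines = " (count + 0) }
    ' "$1"
}
```

It would be run as validate_portable data.txt. I have not tried this on every awk flavour, so treat it as a starting point rather than a guaranteed fix.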

mirni
