awk parsing problem


 
# 8  
Old 02-02-2008
The following gives the output you want based on the sample file provided.

Code:
#!/usr/bin/awk -f

BEGIN {
    total = 0;
    cpfound = 0;
    edmfound = 0;
    wrcfound = 0;
}

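# Sum $1 + $2 over the six data lines that follow an EDM label.
# j and sum are declared as extra parameters so that they are local.
function parseEDM(    j, sum)
{
    sum = 0;
    edmfound = 1;

    for (j = 0; j < 6; j++) {
        getline;
        sum = sum + $1 + $2;
    }

    total = total + sum;
    return sum;
}

# Sum $1 + $2 from the single data line that follows a WRC label.
function parseWRC(    sum)
{
    wrcfound = 1;

    getline;
    sum = $1 + $2;

    total = total + sum;
    return sum;
}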

NF==1 && substr($1,1,2)=="CP" {
    print "";
    print $1;
    print "----";

    total = 0;
    cpfound = 1;
    edmfound = 0;
    wrcfound = 0;
}

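# A "TCS <n>" line inside a CP block: the label on the next line decides
# whether EDM or WRC data follows.
NF==2 && substr($1,1,3)=="TCS" && cpfound == 1 {
    field1 = $1;
    field2 = $2;
    getline;
    if (NF==1 && substr($1,1,3)=="EDM") {
        field3 = parseEDM();
    }
    if (NF==1 && substr($1,1,3)=="WRC") {
        field3 = parseWRC();
    }
    print field1, field2, field3;

    # Once both an EDM and a WRC section have been seen, the block is complete.
    if (edmfound==1 && wrcfound==1) {
        print "";
        print "TOTAL", total;
        edmfound = 0;
        wrcfound = 0;
        total = 0;
        cpfound = 0;
    }
}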

Obviously, error handling etc. needs to be added before this is used in a production environment.
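
The most obvious gap is the unchecked getline: a plain getline returns 1 on success, 0 at end of input and -1 on error, and the script above never looks at that value. A minimal sketch of one way to guard it, not part of the original script. The names sum_after_marker and next_pair_sum and the sample input are made up for illustration:

```shell
# Hypothetical wrapper: total the $1 + $2 pairs found on the line after each WRC label.
sum_after_marker() {
    awk '
    # Hypothetical helper: return $1 + $2 of the next record, or 0 if there is none.
    function next_pair_sum(    msg) {
        # getline returns 1 on success, 0 at end of input, -1 on error
        if ((getline) <= 0) {
            msg = "warning: no data line after record " NR
            print msg > "/dev/stderr"
            return 0
        }
        return $1 + $2
    }
    /^WRC/ { total += next_pair_sum() }
    END    { print "TOTAL", total + 0 }
    '
}

# The last label has no data line, so it only produces a warning on stderr.
printf 'WRC1\n000014880 000020293\nWRC2\n' | sum_after_marker
```

The same check could be put around each getline in parseEDM and parseWRC; whether to warn, skip the block or abort is a choice the thread does not settle.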

Code:
$ ./testawk testfile

CP31
----
TCS 10 54087
TCS 342 35173

TOTAL 89260

CP33
----
TCS 8 48790
TCS 286 33614

TOTAL 82404
$

# 9  
Old 02-02-2008
OK,
changed like this:

Code:
awk '
  /^CP/ {    
        print
        f++     
        }

  f && /TCS /   {   
        if (edm) {
                printf "----\n%-8s %-8s\n", tcs, edm
                edm = ""
                }
        tcs = (($1) FS ($2))
        }

  f && /EDM|WRC|wrc/    {  
        getline 
        edm += $1 + $2 
        edmt += $1 + $2 
        }

  f && ! NF     {  
        printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
        f = tcs = edm = edmt = ""
        }
' filename

This is the output I get:

Code:
$ cat timj.txt


CDN 07    4
 IMS-SCNT 00000 00000 00000 00000
 IMS-LCNT 000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000
 SCNT     00000 00025 00000 00000 00000 00000 00000 00000 00031
 LCNT     000000041 001007860 187905607 000891919 102177186 000000000 000000000
          000000000 000000000 000000000 000000000 000000023 000000000 000001679
          000000016 000000309 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
 RT USAGE 00093 00001
 CP       000082307 000000000 000000000 000000000 000000000 000000000 000000000
          000000000
 MCR USG  000541595


CDN 08    4
 IMS-SCNT 00000 00000 00000 00000
 IMS-LCNT 000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000
 SCNT     00000 00025 00000 00000 00000 00000 00000 00000 00031
 LCNT     000000023 001033219 190332475 000919047 104943932 000000000 000000000
          000000000 000000000 000000000 000000000 000000023 000000000 000001697
          000000017 000000306 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
 RT USAGE 00082 00001
 CP       000085873 000000000 000000000 000000000 000000000 000000000 000000000
          000000000
 MCR USG  000553997

CP31
 ECMR  05 00000
          000000038 000000000 000000000 000000000 000175275 000175275 000033886
          000033886 000000000 000000000 000000011 000000001 000000095 000431157
          000143147 000000000 000004124 000246868 000184289 000085517 000069108
          000004981 000015056 000000731 000000678 000000000 000000000
 OC    07 00000
          000000283 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000424 000000027 000000195 000000000 000000004 000000006
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000472 000000472 000000472
 CCM   03 00000
  TCSLINK
  TCS  10
  EDM1
          000027214 000026873 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
  EDM2
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
  TCS 342                                                                                      
   WRC1
          000014880 000020293
   WRC2
          000000000 000000000
   WRC3
          000000000 000000000
   WRC4
          000000000 000000000
   WRC5
          000000000 000000000
   WRC6
          000000000 000000000
   wrc7
          000000000 000000000
   wrc8
          000000000 000000000
   wrc9
          000000000 000000000
   wrc10
          000000000 000000000
   wrc11
          000000000 000000000
   wrc12
          000001345 000002365
   wrc13
          000000000 000000000
   wrc14
          000000000 000000000
   wrc15
          000000000 000000000
   wrc16
          000000000 000000000

CP33
 ECMR  05 00000
          000000042 000000000 000000000 000000000 000167297 000167297 000022079
          000022079 000000000 000000000 000000010 000000001 000000095 000298686
          000148264 000000000 000004122 000168466 000130220 000081446 000066818
          000003491 000014595 000000730 000000678 000000000 000000000
 OC    07 00000
          000000254 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000394 000000022 000000197 000000000 000000001 000000004
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
 CCM   03 00000
  TCSLINK
  TCS   8
  EDM1
          000025112 000023678 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
  EDM2
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000 000000000
          000000000 000000000 000000000 000000000 000000000 000000000
  TCS 286
   WRC1
          000014267 000019347
   WRC2
          000000000 000000000
   WRC3
          000000000 000000000
   WRC4
          000000000 000000000
   WRC5
          000000000 000000000
   WRC6
          000000000 000000000
   WRC7
          000000000 000000000
   WRC8
          000000000 000000000


$ nawk '
>   /^CP/ { 
>         print 
>         f++
>         }
>
>   f && /TCS /   {
>         if (edm) {
>                 printf "----\n%-8s %-8s\n", tcs, edm
>                 edm = ""
>                 }
>         tcs = (($1) FS ($2))
>         }
>
>   f && /EDM|WRC|wrc/    {
>         getline
>         edm += $1 + $2
>         edmt += $1 + $2
>         }
>
>   f && ! NF     {
>         printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
>         f = tcs = edm = edmt = ""
>         }
> ' timj.txt
CP31
----
TCS 10   54087
TCS 342  38883
Total:   92970

CP33
----
TCS 8    48790
TCS 286  33614
Total:   82404


Last edited by radoulov; 02-03-2008 at 04:53 AM.. Reason: modified
# 10  
Old 02-03-2008
radoulov & fpmurphy,
I can't thank you both enough for what you guys did. I had to tweak both scripts a little bit, but both are working. I realize I was going about this the wrong way. Both have helped me enormously.

radoulov,
I am having trouble following your logic; it looks like there are "shortcuts" in your script that I am having problems deciphering. Can you explain your script for me?

Thank you both again, so much.
# 11  
Old 02-04-2008
Quote:
Originally Posted by timj123
[...]
radoulov,
I am having trouble following your logic; it looks like there are "shortcuts" in your script that I am having problems deciphering. Can you explain your script for me?
[...]
Of course.
First, I would change the code to:

Code:
awk '
  /^CP/ { 
        print 
        f++     
        }

  f     { 
        if ($0 ~ /TCS /) { 
                if (edm) { 
                        printf "----\n%-8s %-8s\n", tcs, edm
                        edm = ""
                        }
                tcs = (($1) FS ($2))
                }
        if ($0 ~ /EDM|WRC|wrc/) { 
                getline 
                edm += $1 + $2 
                edmt += $1 + $2 
                }
        if (! NF) { 
                printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
                f = tcs = edm = edmt = ""
                }
        }
' input

The logic is quite simple, here we go:

Code:
  /^CP/ { 
        print 
        f++     
        }

For every record that matches the pattern ^CP (a line that starts with CP): print it and increment the variable f. We only need a flag, so any true value will do; you can use flag = "true" instead if you consider it more readable.
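
A minimal sketch of the flag idea on its own, with made-up input and a made-up wrapper name (in_block). It is not the script from this thread, only the "variable as a pattern" part of it:

```shell
# Hypothetical demo: print only the records that belong to a CP block.
in_block() {
    awk '
    /^CP/ { f = 1 }    # block starts: set the flag
    !NF   { f = 0 }    # blank line: block ends, clear the flag
    f     { print }    # a bare variable is a pattern: true when non-zero and non-empty
    '
}

printf 'CDN 07\nCP31\n TCS 10\n\nCDN 08\n' | in_block
```

Only CP31 and the TCS line are printed; the CDN lines are skipped because f is still unset or already cleared when they are read.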

Code:
  f     { 
        if ($0 ~ /TCS /) { 
                if (edm) { 
                        printf "----\n%-8s %-8s\n", tcs, edm
                        edm = ""
                        }
                tcs = (($1) FS ($2))
                }

For every record for which our flag is true (f used by itself as a pattern is true when its value is neither zero nor the empty string; at this point we are inside your logical record/block):
+ if the record matches the pattern TCS<space>:
++ if the variable edm is true (see below), print the values of the variables tcs and edm, then unset edm (set it to "", i.e. false)
++ set the variable tcs to $1, FS and $2 concatenated (with the default FS that is the first two fields separated by a single space).
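
A small sketch of that assignment in isolation, assuming the default FS and using a made-up wrapper name (normalize_tcs). It also shows why the pattern has a trailing space: TCSLINK is not matched.

```shell
# Hypothetical demo: rebuild "TCS <n>" from the first two fields.
normalize_tcs() {
    awk '/TCS / { tcs = $1 FS $2; print "[" tcs "]" }'
}

# Leading blanks and the double space in the first line disappear.
printf '  TCS  10\n  TCSLINK\n  TCS 342\n' | normalize_tcs
```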

Code:
        if ($0 ~ /EDM|WRC|wrc/) { 
                getline 
                edm += $1 + $2 
                edmt += $1 + $2 
                }

+ if the record matches the pattern EDM|WRC|wrc (EDM OR WRC OR wrc), read the next line into $0 (getline) and:
++ add the sum of its $1 and $2 to both edm and edmt (the block total).
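
To see what getline does to the current record, here is a minimal sketch with made-up input and a made-up wrapper name (sum_first_pair). The label has to be saved first, because after getline $1 and $2 belong to the data line:

```shell
# Hypothetical demo: report the first pair after each label.
sum_first_pair() {
    awk '
    /EDM|WRC|wrc/ {
        label = $1
        getline    # $0, NF and NR now describe the line after the label
        print label, $1 + $2, "(read from line " NR ")"
    }'
}

printf 'EDM1\n000027214 000026873 000000000\n000000000 000000000\nWRC1\n000014880 000020293\n' | sum_first_pair
```

The continuation line (line 3 here) is read by the main loop as an ordinary record and simply matches nothing, which is why only the first data line after each label is summed.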


Code:
        if (! NF) { 
                printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
                f = tcs = edm = edmt = ""
                }

+ if the record has no fields (a blank line, the end of your logical record/block), print the values of tcs, edm and edmt (total) and unset f, tcs, edm and edmt.
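
One consequence of this last step: the totals are only printed when a blank line is seen, so a final block that runs straight into the end of the file would be lost. A possible variation, sketched under that assumption, moves the printing into a function and calls it from END as well. The names block_totals and close_block and the shortened sample input are made up:

```shell
# Hypothetical variant of the script that also handles a missing final blank line.
block_totals() {
    awk '
    # Hypothetical helper: print the pending TCS line and the block total, then reset.
    function close_block() {
        if (!f) return
        printf "%-8s %-8s\nTotal:   %-8s\n\n", tcs, edm, edmt
        f = tcs = edm = edmt = ""
    }
    /^CP/ { print; f++ }
    f && /TCS / {
        if (edm) { printf "----\n%-8s %-8s\n", tcs, edm; edm = "" }
        tcs = $1 FS $2
    }
    f && /EDM|WRC|wrc/ { getline; edm += $1 + $2; edmt += $1 + $2 }
    f && !NF { close_block() }
    END { close_block() }    # covers a last block with no blank line after it
    '
}

# Note: no blank line at the end of this input.
printf 'CP31\n  TCS  10\n  EDM1\n 000027214 000026873\n  TCS 342\n   WRC1\n 000014880 000020293\n' | block_totals
```

Because close_block clears f, the END call does nothing when the last block was already closed by a blank line.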


# 12  
Old 02-04-2008
I think I got it now. I was getting confused by the first edm statement: I was wondering how it becomes true, until I realized the script keeps going until it finds a blank line and then prints out the variables and resets them at the end. I also did not know that a variable can be used in the pattern part of an awk script. Again, thanks; you have no idea how much I struggled with this.