Looking to improve the output of this awk one-liner


 
# 1  
Old 07-19-2013

I came up with the following awk one-liner last night to gather some data, and it works pretty well (apologies, I'm quite new to awk and don't know how to pretty-print it). Its output is shown further down.

Code:
 awk '{if ($8 == 41015 && $21 == "requests") arr["Requests "$1" "substr($2,0,5)]+=$20;if ($8 == 41015 && $22 == "requests") arr["Requests "$1" "substr($2,0,5)]+=$21;else if ($8 == 41100) arr["Deletes "$1" "substr($2,0,5)]+=1;if ($8 == 41015) arr["Batches "$1" "substr($2,0,5)]+=1};END{for (i in arr) print i,arr[i]}'  example.log

My input file for this example (this is uniq'd; the full file has 75 lines):

Code:
07/19/13 07:50:27.890   D CN M Proxy   MGR_PROXY    41100  @BOX-98765  Manager DoD with THING asset 1508772769 of home 7793004 found in 0 retries
07/19/13 07:50:28.247   I CN M Proxy   USER_OPER    41015  @12345  Schedule recording request: THING recording requested asset 7474656, channel XXX, 60 requests
07/19/13 07:53:04.319   I CN M Proxy   USER_OPER    41015  @54321  Schedule recording request: THING recording requested asset 61263854, channel XX HD, 1 requess
07/19/13 07:53:04.319   I CN M Proxy   USER_OPER    41015  @54321  Schedule recording request: THING recording requested asset 61263854, channel XX HD, 1 requests

and my output is:

Code:
Batches 07/19/13 07:50 25
Requests 07/19/13 07:50 1500
Deletes 07/19/13 07:50 25
Batches 07/19/13 07:53 25
Requests 07/19/13 07:53 25

The code logic is pretty simple:

Code:
 * Check the 8th column, which denotes the log line type
 * If 41015 (a recording request)
    + increment the batch counter by one.
    + find the column with the number of requests and add that value to the requests counter
 * If 41100
    + increment the deletion counter by one.

My primary objective is to format the output as a CSV that I can just send off as a report, like this (the headers are illustrative; I'm not looking to actually print them out... unless I can):

Code:
#Date,Time,Reqs,Bats,Dels
07/19/13,07:50,1500,25,25
07/19/13,07:53,24,25,

My secondary objective is to clean up the code. For example, having to check the 8th column twice for 41015 to increment both counters seems wasteful.

Any advice is welcome, but please keep in mind this is my first time doing anything more complex than awk '{print $2,$4,$8}' file, so I'd appreciate explanations as well as code snippets.

Last edited by DeCoTwc; 08-01-2013 at 04:06 AM.. Reason: cleaning
# 2  
Old 07-19-2013
That's only a 'one-liner' because the line refuses to wrap in code tags. Better to break it where it matters so you can see what you're doing. I like two-liners, three-liners.

I have no idea where you're pulling that 1500 from, so I'll assume your output is unrelated.

Code:
awk 'BEGIN { SUBSEP=","; OFS="," }
        { sub(/:[^:]*$/, "", $2); } # Strip the seconds off the time
        $8 == 41015 { BATS[$1,$2]++ ; D[$1,$2]++ ; REQS[$1,$2] += $(NF-1) }
        $8 == 41100 { DELS[$1,$2]++; D[$1,$2]++ }
        END {
                print "#Date,Time,Reqs,Bats,Dels"
                for(X in D) print X, REQS[X]+0, BATS[X]+0, DELS[X]+0 }' inputfile
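
For anyone following along, here is a minimal sketch of the two field tricks used above, run against two of the sample lines from post #1. The file name sample.log and the variable name fields are made up for this illustration. sub() trims everything from the last colon onward in $2, and $(NF-1) picks the second-to-last field, so the request count is found no matter how many words the channel name has.

```shell
# sample.log is a hypothetical file name; the lines are copied from post #1
cat > sample.log <<'EOF'
07/19/13 07:50:28.247   I CN M Proxy   USER_OPER    41015  @12345  Schedule recording request: THING recording requested asset 7474656, channel XXX, 60 requests
07/19/13 07:53:04.319   I CN M Proxy   USER_OPER    41015  @54321  Schedule recording request: THING recording requested asset 61263854, channel XX HD, 1 requests
EOF

fields=$(awk '{
        sub(/:[^:]*$/, "", $2)   # "07:50:28.247" becomes "07:50"
        print $2, $(NF-1)        # NF-1 is the field just before "requests"
}' sample.log)

printf '%s\n' "$fields"
```

The first line has a one-word channel (count in $20) and the second a two-word channel (count in $21), yet $(NF-1) lands on the count in both cases, which is why no separate $21/$22 checks are needed.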

# 3  
Old 07-19-2013
You can use gawk's profiling/pretty-printing mode (the --pretty-print option, or -o for short, in recent gawk versions) to pretty-print your awk program. Check the gawk manual for more details:
Code:
man gawk

I added a few lines to your code to generate the desired output; make adjustments as required:
Code:
awk '
        BEGIN {
                print "#DateTime,Reqs,Bats,Dels"
        }
        {
                # awk strings are 1-indexed; a start of 0 is handled
                # differently across awk implementations, so use 1
                if ($8 == 41015 && $21 == "requests") {
                        arr["Requests," $1 " " substr($2, 1, 5)] += $20
                }
                if ($8 == 41015 && $22 == "requests") {
                        arr["Requests," $1 " " substr($2, 1, 5)] += $21
                } else {
                        if ($8 == 41100) {
                                arr["Deletes," $1 " " substr($2, 1, 5)] += 1
                        }
                }
                if ($8 == 41015) {
                        arr["Batches," $1 " " substr($2, 1, 5)] += 1
                }
        }
        END {
                # Split each "Type,Date Time" key and file the count under its type
                for (i in arr) {
                        split(i, V, ",")
                        if (V[1] == "Batches")
                                B[V[2]] = arr[i]
                        if (V[1] == "Requests")
                                R[V[2]] = arr[i]
                        if (V[1] == "Deletes")
                                D[V[2]] = arr[i]
                        T[V[2]]   # remember every date/time slot seen
                }
                for (k in T)
                        print k, R[k], B[k], D[k]
        }
' OFS=, example.log
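
One thing to keep in mind with either version: the order in which a for (k in array) loop visits keys is unspecified in awk, so the rows may come out in any order. Also, a key that was never set prints as an empty string unless you add 0 to it. Here is a small sketch with made-up three-column data and hypothetical array names (seen, reqs, dels) that shows both points and pipes the result through sort for a stable order:

```shell
# Made-up input: time slot, type (R = requests, D = deletes), count
report=$(printf '%s\n' '07:53 R 25' '07:50 R 1500' '07:50 D 25' |
awk -v OFS=, '
        { seen[$1] }                     # referencing the element creates the key
        $2 == "R" { reqs[$1] += $3 }
        $2 == "D" { dels[$1] += $3 }
        END {
                for (k in seen)          # iteration order is unspecified
                        print k, reqs[k], dels[k], dels[k] + 0
        }' | sort)

printf '%s\n' "$report"
```

The 07:53 slot has no deletes, so its third column is empty while the fourth (with + 0) shows 0. Pick whichever suits the report; the desired CSV in post #1 leaves the cell empty.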

# 4  
Old 07-19-2013
Yeah, one of the downsides of only learning bits and pieces of different languages is I never learned how to construct a proper program, so everything is a one-liner to me. It's kind of fun though.

The 1500 is because, like I said, my input is uniq'd and there are really 75 lines (or 18 of each line). So, where it says

Code:
awk '{if ($8 == 41015 && $21 == "requests") arr["Requests "$1" "substr($2,0,5)]+=$20;if ($8 == 41015 && $22 == "requests") arr["Requests "$1" "substr($2,0,5)]+=$21; [...]

It's adding up the column with the number of requests which in the full input file adds up to 1500.
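
As a rough check of that arithmetic: the output in post #1 shows 25 batches at 07:50, and the sample request line for that minute carries 60 requests, so 25 × 60 = 1500. The sketch below assumes all 25 lines look like the sample one (only the uniq'd input is posted, so that is an assumption) and simply repeats the sample line 25 times; the variable names line and total are made up for the illustration.

```shell
line='07/19/13 07:50:28.247   I CN M Proxy   USER_OPER    41015  @12345  Schedule recording request: THING recording requested asset 7474656, channel XXX, 60 requests'

# Repeat the sample line 25 times, then add up the request count in field 20
total=$(awk -v line="$line" 'BEGIN { for (i = 0; i < 25; i++) print line }' |
        awk '$8 == 41015 && $21 == "requests" { sum += $20 } END { print sum }')

echo "$total"
```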

Either way, your code looks pretty sweet and I'm going to try to wrap my head around it after I've had a chance to rest. Thanks, as always, Corona688. Just when I think I'm getting good, you come through and kick me back to the kids' table.