This looks awesome!! I have a really dumb question, though: in the variables, is it expecting 2 files? One for just the products with counts, and one with skews, products, and counts?
No. The variables PCF and SCF in the shell (and pcf and scf in awk) are the names of temp files the script uses to store a sorted list of products and a sorted list of skews for a selected product, respectively, while producing output in the END actions. The input data comes from a single file named file, as marked in red in this excerpt from about eight lines from the end of the shell script:
Obviously, you can change file to any other filename you want to use. Or you could change it to "$1" and pass the name of the file you want to process as the only argument to the shell script.
This script doesn't need to sort the entire input file, but with millions of input lines these temporary sort result files could still be large. If they are too big to save in the directory where you run this script, you could add an option to the shell script to specify a different directory for these temp files.
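Both changes can be sketched like this (a skeleton only; the PCF and SCF names come from the post, while everything else, including the elided awk body, is hypothetical):

```shell
# Hypothetical skeleton showing the two changes discussed above: take the
# input file as the first argument, and allow a temp-file directory as an
# optional second argument. Only the PCF/SCF names come from the post;
# the actual awk program is elided.
run_report() {
    tmpdir="${2:-.}"                 # optional 2nd arg: temp-file directory
    PCF="$tmpdir/PCF.$$"             # sorted product list
    SCF="$tmpdir/SCF.$$"             # sorted skew lists for a selected product
    awk -v pcf="$PCF" -v scf="$SCF" '
        # ... the actual awk program from the thread goes here ...
    ' "$1"                           # "$1" instead of a hard-coded filename
    rm -f "$PCF" "$SCF"              # clean up the temp files
}
```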
Let me know if you're still confused.
Quote:
Originally Posted by JoshCrosby
... ... ...
Works perfectly!!!
For those who want to know:
To create the .Skew_Count file, use this one-liner:
To create the .Product_Counts file, use this:
Don't forget to change file to products.txt at the end of the awk command.
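The one-liners themselves aren't shown above, but a count-by-key command along these lines (a sketch with made-up sample data; the real commands from the post may differ) produces a .Product_Counts-style file:

```shell
# Sketch of a count-by-key one-liner in the spirit of the commands
# referenced above: count how often each product appears, then sort by
# count (descending) and product name (ascending). Sample data is made up.
printf '%s\n' widget gadget widget > products.txt
awk '{ count[$1]++ } END { for (p in count) print p, count[p] }' products.txt |
    sort -k2,2nr -k1,1 > .Product_Counts
cat .Product_Counts
```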
Don - HUGE THANK YOU!!!!!!!
By the way, I'm using a Mac, so I don't have KornShell; I used Bash without issue.
The code you have above will create the .Product_Counts file used by the script (before it is sorted in reverse order by the number of hits for the product and in increasing order of product name), but the script produces a .Skew_Counts file only for each product entry in the top-3 list; it never produces the entire list of skews. (You did say that some skews could appear with more than one product.) The skew lists my script produces show the skew counts only for the displayed products, skipping all occurrences of those skews under other products.
Note that I developed and tested this on my MacBook Pro. You should have the KornShell available as /bin/ksh on any recent version of OS X.
To start, I have a table that has ticket holders. Each ticket holder has a unique number, and each ticket holder is associated with a so-called household number. You can have multiple guests within a household.
I would like to create 3 flags (form a for a household that has 1-4 guests, form b for 5-8 guests, ...). (3 Replies)
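An awk pass over the household column can assign those flags (a sketch: the real table layout isn't shown, so a two-column input of ticket-holder ID and household number is assumed, and the flag letters are illustrative):

```shell
# Sketch, assuming a two-column input of ticket-holder ID and household
# number (the real table layout isn't shown, so this sample is made up).
# Counts guests per household, then flags: A = 1-4 guests, B = 5-8, C = 9+.
printf '%s\n' 't1 h1' 't2 h1' 't3 h1' 't4 h1' 't5 h1' 't6 h2' 't7 h2' > tickets.txt
awk '{ guests[$2]++ }
     END {
         for (h in guests) {
             n = guests[h]
             print h, n, (n <= 4 ? "A" : n <= 8 ? "B" : "C")
         }
     }' tickets.txt | sort
```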
I run awk
awk '{print $6}' "$1"
and get a lot of results that I want to group. For example, my result is (the output is unknown to the user):
xyz
xyz
abc
pqr
xyz
pqr
etc
I want to group them as
xyz=total found 7
abc=total ....
pqr=
Thank (3 Replies)
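The counting can be done in the same awk pass, without a separate grouping step (a sketch; the file name and sample values here are made up, and the field number comes from the post):

```shell
# Sketch: count each distinct value of field 6 in a single awk pass
# (file name and sample data are made up; use "$1" in a script instead):
printf '%s\n' 'a b c d e xyz' 'a b c d e xyz' 'a b c d e abc' \
              'a b c d e pqr' 'a b c d e xyz' 'a b c d e pqr' > logfile
awk '{ count[$6]++ }
     END { for (v in count) print v "=total found " count[v] }' logfile | sort
```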
Hello
I am trying to figure out a script that could group a log file by user name. I worked with the awk command and was able to trim the log file to:
<USER: John Frisbie > /* Thu Aug 06 2009 15:11:45.7974 */ FLOAT GRANT WRITE John Frisbie (500 of 3005 write)
<USER: Shawn Sanders > /* Thu Aug 06... (2 Replies)
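One way to group lines like those is to split on the "<USER: " and ">" markers and tally per user (a sketch; the sample log lines below are simplified from the excerpt above):

```shell
# Sketch: tally log lines per user by splitting on the "<USER: " and ">"
# markers seen in the excerpt (sample lines are simplified):
printf '%s\n' '<USER: John Frisbie > FLOAT GRANT WRITE' \
              '<USER: Shawn Sanders > FLOAT GRANT READ' \
              '<USER: John Frisbie > FLOAT RELEASE' > app.log
awk -F'<USER: |>' '{ sub(/ +$/, "", $2); count[$2]++ }
     END { for (u in count) print u ": " count[u] }' app.log | sort
```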
Hi,
I need an awk script (or whatever shell construct) that would take data like the below and get the max value of the 3rd column when grouping by the 1st column.
clientname,day-of-month,max-users
-----------------------------------
client1,20120610,5
client2,20120610,2
client3,20120610,7... (3 Replies)
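A per-key maximum is a small variation on the grouping idiom (a sketch; an extra client1 row is added to the sample so the max is visible, and the header and dashed separator lines are skipped by line number):

```shell
# Sketch: per-client maximum of column 3, skipping the header and the
# dashed separator (a second client1 row is added to make the max visible):
printf '%s\n' 'clientname,day-of-month,max-users' \
              '-----------------------------------' \
              'client1,20120610,5' 'client2,20120610,2' \
              'client1,20120611,9' 'client3,20120610,7' > data.csv
awk -F, 'NR > 2 { if (!($1 in max) || $3 + 0 > max[$1]) max[$1] = $3 + 0 }
         END { for (c in max) print c, max[c] }' data.csv | sort
```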
Hello folks.
After awk, I have decided to start learning Perl, and I need some help.
I have the following output:
1 a
1 b
2 k
2 f
3 s
3 p
Now with awk I get the desired output by issuing:
awk '{ a[$1] = a[$1] FS $2 } END { for (i in a) print i a[i] }' input
1 a b
2 k f
3 s p
Can... (1 Reply)
I have the below inside a file:
11.22.33.44
user1
11.22.33.55
user2
I need this manipulated as
alias server1.domain.com='ssh user1@11.22.33.44'
alias server2.domain.com='ssh user2@11.22.33.55' (3 Replies)
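Since the lines alternate IP, then user, awk can pair each odd line with the even line that follows it (a sketch; the serverN.domain.com hostnames are taken from the example output above):

```shell
# Sketch: pair each IP line with the user line that follows it and emit
# an alias line (hostnames serverN.domain.com come from the example;
# \047 is the octal escape for a single quote in the awk format string):
printf '%s\n' '11.22.33.44' 'user1' '11.22.33.55' 'user2' > hosts.txt
awk 'NR % 2 { ip = $0; next }
     { printf "alias server%d.domain.com=\047ssh %s@%s\047\n", ++n, $0, ip }' hosts.txt
```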
Hello
I'm new, treat me nicely; I have a headache :)
I have a script that seemed to work, but now it doesn't. Anyway, the last part is adding counts of unique items in a CSV file, e.g.:
05492U34 38
05492U34 47
two columns (many different values like this in the file)
I want... (7 Replies)
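The post is cut off, but counting occurrences of each column-1 value is the usual ask with data like this (a sketch; the third sample value is invented):

```shell
# Sketch: count occurrences of each column-1 value (the post is cut off,
# so the exact goal is assumed; the third sample row is invented):
printf '%s\n' '05492U34 38' '05492U34 47' '0611234X 12' > items.txt
awk '{ count[$1]++ } END { for (k in count) print k, count[k] }' items.txt | sort
```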
I have data which looks like:
1440993600|L|ABCDEF
1440993600|L|ABCD
1440993601|L|ABCDEF
1440993602|L|ABC
1440993603|L|ABCDE
.
.
.
1441015200|L|AB
1441015200|L|ABC
1441015200|L|ABCDEF
So basically, $1 is an epoch date and $2 and $3 are some application data.
From one of the... (5 Replies)
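The question is cut off, but a common task with this layout is a per-timestamp record count on the "|"-separated first field (a sketch with a trimmed copy of the sample data):

```shell
# Sketch: count records per epoch timestamp in field 1 of a
# "|"-separated file (the actual goal of the post is cut off):
printf '%s\n' '1440993600|L|ABCDEF' '1440993600|L|ABCD' '1440993601|L|ABCDEF' > app.dat
awk -F'|' '{ count[$1]++ } END { for (t in count) print t, count[t] }' app.dat | sort
```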
The awk below is supposed to count all the matching $5 strings and count how many $7 values are less than 20. I don't think I need the portion in bold, as I do not need any decimal point or formatting, but I cannot seem to get the correct counts. Thank you :).
file
chr5 77316500 77316628 ... (6 Replies)
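The two counts described can be kept in parallel arrays keyed on $5 (a sketch; the real file is only partially shown, so these columns and sample rows are assumptions):

```shell
# Sketch of the counting described above: tally every $5 value and,
# separately, the rows where $7 is below 20 (the real file is only
# partially shown, so these columns and sample rows are assumptions):
printf '%s\n' 'chr5 77316500 77316628 . GENE1 + 15' \
              'chr5 77316700 77316800 . GENE1 + 25' \
              'chr5 77316900 77317000 . GENE2 + 5' > regions.txt
awk '{ total[$5]++; if ($7 + 0 < 20) low[$5]++ }
     END { for (g in total) print g, total[g], low[g] + 0 }' regions.txt | sort
```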