awk Grouping and Subgrouping with Counts
Post 302776019 by JoshCrosby on Tuesday 5th of March 2013 10:16:19 PM (UNIX for Dummies Questions & Answers)

I have a ton of files, each with more than 3 million lines.

I need to find the top 3 products and then, for each of those, the top 5 skews with a count of how many times each skew was viewed.

Here is a sample file, shortened for readability. Each row counts as one view.
Code:
product|skew
p1|12345
p2|23456
p3|234
p4|98707
p1|12345
p2|23456
p3|2343
p4|98706
p1|12345
p2|23456
p3|234
p5|36748
p4|98708
p1|12345
p2|23456
p3|234
p4|98708
p1|12345
p6|23467
p2|23456
p3|234345
p4|98708
p1|12345
p2|23456
p3|234345
p4|98707

I can get the top products, but I'm having a tough time getting the skews with their counts. I imagine I will have to create two arrays and loop through them to get the correct counts.
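A minimal sketch of the two-array idea, assuming the input keeps the `product|skew` layout with one header row per file. The function name `count_pairs` is hypothetical. One array counts views per product, the other counts views per product/skew pair, keyed with awk's built-in `SUBSEP`.

```shell
# Hypothetical helper: count views per product and per product/skew pair.
# Usage: count_pairs file...
count_pairs() {
    awk -F'|' '
        FNR > 1 {                      # skip the header row of each file
            product[$1]++              # views per product
            pair[$1 SUBSEP $2]++       # views per product/skew pair
        }
        END {
            for (k in pair) {
                split(k, part, SUBSEP) # part[1] = product, part[2] = skew
                # columns: product, product total, skew, skew count
                print part[1], product[part[1]], part[2], pair[k]
            }
        }
    ' "$@"
}
```

`count_pairs products.txt` prints one line per product/skew pair in no particular order, because the order of `for (k in pair)` is unspecified in awk. The output still has to be sorted to find the top entries.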

Can anybody provide any guidance?

For the first part I can do this, piping through sort and head, but I'm stuck on the rest.
Code:
awk -F"|" 'FNR > 1 {product[$1]++} END {for (n in product) print n, product[n]}' products.txt | sort -k2 -nr | head -2

which prints:
Code:
p4 6
p3 6

Expected result should be something like
Code:
product	skew	count
p4	98708	3
p4	98707	2
p4	98706	1
p3	234	3
p3	234345	2
etc.
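One possible way to produce a report in this shape is sketched below. It is not a definitive solution: it assumes the `product|skew` layout with a header row in each file, and the function name `top_skews` and its argument order are hypothetical. awk counts, `sort` ranks, and a second awk keeps the first N products and the first M skews of each.

```shell
# Hypothetical helper: top_skews N_PRODUCTS N_SKEWS file...
top_skews() {
    np=$1 ns=$2
    shift 2
    awk -F'|' '
        BEGIN { OFS = FS }
        FNR > 1 { product[$1]++; pair[$1 SUBSEP $2]++ }   # skip header rows
        END {
            for (k in pair) {
                split(k, part, SUBSEP)
                # emit: product total | product | skew | skew count
                print product[part[1]], part[1], part[2], pair[k]
            }
        }
    ' "$@" |
    # rank: product total (desc), product name (reverse), skew count (desc), skew
    LC_ALL=C sort -t'|' -k1,1nr -k2,2r -k4,4nr -k3,3 |
    awk -F'|' -v np="$np" -v ns="$ns" '
        BEGIN { OFS = "\t"; print "product", "skew", "count" }
        $2 != prev { prev = $2; seen++; rank = 0 }        # next product starts
        seen <= np && ++rank <= ns { print $2, $3, $4 }   # keep top ns skews
    '
}
```

On the sample file, `top_skews 2 3 products.txt` should list p4 with 98708 (3), 98707 (2) and 98706 (1), followed by p3 with 234 (3), 234345 (2) and 2343 (1). Note that 98706 occurs only once in the sample. Products with equal totals are ordered by name in reverse, an arbitrary tie-break chosen here to mirror the `sort -nr` output shown earlier. Memory use grows with the number of distinct product/skew pairs, not with the number of lines.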


Last edited by JoshCrosby; 03-05-2013 at 11:23 PM.
 
