Group by columns and get the counts


 
# 1  
Old 02-09-2007

Hi Gurus,

I have a file

1|usa|hh
2|usa|ll
3|usa|vg
4|usa|vg
5|usa|vg
6|usa|vg
7|usa|ll
8|uk|nn
9|uk|bb
10|uk|bb
11|kuwait|mm
12|kuwait|jkj
13|kuwait|mm
14|dubai|hh

I want to group by the last two columns and print those two columns together with the count of records in each group.

output should look like :

dubai hh 1
kuwait jkj 1
kuwait mm 2
uk bb 2
uk nn 1
usa hh 1
usa ll 2
usa vg 4

I tried something like this, but it doesn't give the right counts:

awk -F"|" '{col[$2,$3]=NR} END {for (i in col) print i, col[i]}' new.txt | sort

dubaihh 14
kuwaitjkj 12
kuwaitmm 13
ukbb 10
uknn 8
usahh 1
usall 7
usavg 6

Can any of the Unix gurus help?
It has been posted earlier but I have modified it a bit.
Any suggestion is very much appreciated.


Thanks
Sumeet
# 2  
Old 02-09-2007
Code:
awk -F"|" '{col[$2,$3]++} END {for (i in col) print i, col[i]}' new.txt | sort
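
One detail worth noting: with a `$2,$3` subscript, awk joins the two fields using the SUBSEP character (by default the non-printing "\034"), so the key prints without a visible space between the columns. The following is a minimal sketch, not the original poster's solution, that splits the key back apart with split() so the output matches the space-separated layout asked for in the first post. The function name group_count is made up for illustration; new.txt is the file name used in the thread.

```shell
# Recreate the sample file from the first post.
cat > new.txt <<'EOF'
1|usa|hh
2|usa|ll
3|usa|vg
4|usa|vg
5|usa|vg
6|usa|vg
7|usa|ll
8|uk|nn
9|uk|bb
10|uk|bb
11|kuwait|mm
12|kuwait|jkj
13|kuwait|mm
14|dubai|hh
EOF

# group_count is a hypothetical helper name, not something from the thread.
# It counts each (column 2, column 3) pair, then splits the combined key
# on SUBSEP so the two columns print with a space between them.
group_count() {
    awk -F'|' '
        { col[$2,$3]++ }
        END {
            for (i in col) {
                split(i, k, SUBSEP)
                print k[1], k[2], col[i]
            }
        }' "$1" | sort
}

group_count new.txt
```

The order in which `for (i in col)` visits the keys is unspecified, which is why the result is still piped through sort.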

# 3  
Old 02-09-2007
Thanks Jim

It works perfectly, but I have one question: I couldn't understand how

{col[$2,$3]++} behaves differently from {col[$2,$3]=NR}.

Can you shed some light on it?

Thanks
Sumeet
# 4  
Old 02-13-2007
NR versus ++

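NR is the number of the current record, in other words the line number in the file. With col[$2,$3]=NR, each assignment overwrites the previous value, so you end up storing the position of the last line on which each pair occurred, not the number of occurrences. That is why your first attempt printed 6 for usa/vg (its last line) instead of 4. With col[$2,$3]++, the element is incremented by 1 every time the pair is seen, so it ends up holding the count.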
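
To see the difference side by side, here is a small sketch on a three-line sample. The file name demo.txt and the function names last_line and count_pairs are made up for illustration, and the key is built as `$2 " " $3` (plain string concatenation with a space) rather than `$2,$3`, simply so the output is easy to read.

```shell
# Three records; the pair "usa vg" appears on lines 1 and 3.
printf '%s\n' '1|usa|vg' '2|uk|bb' '3|usa|vg' > demo.txt

# Assigning NR overwrites the element each time, so it ends up
# holding the line number of the LAST occurrence of each pair.
last_line() {
    awk -F'|' '{ col[$2 " " $3] = NR } END { for (i in col) print i, col[i] }' "$1" | sort
}

# Incrementing adds 1 per occurrence, so it ends up holding the COUNT.
count_pairs() {
    awk -F'|' '{ col[$2 " " $3]++ } END { for (i in col) print i, col[i] }' "$1" | sort
}

echo "with =NR:"
last_line demo.txt      # uk bb 2 / usa vg 3
echo "with ++:"
count_pairs demo.txt    # uk bb 1 / usa vg 2
```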