awk Group By and count string occurrences


 
# 1  
Old 08-08-2013

Hi Gurus,
I've been scratching my head over this and couldn't find the right way to write this awk command properly - PLEASE HELP

Input:
Code:
c,d,e,CLICK
a,b,c,CLICK
a,b,c,CONV
c,d,e,CLICK
a,b,c,CLICK
a,b,c,CLICK
a,b,c,CONV
b,c,d,CLICK
c,d,e,CLICK
c,d,e,CLICK
b,c,d,CONV
a,b,c,CLICK
b,c,d,CLICK
b,c,d,CLICK
c,d,e,CLICK

Desired Output:
Code:
a,b,c,4,2
b,c,d,3,1
c,d,e,5,0

##Explanation: the key (group by) is fields $1+$2+$3
##The 4th column counts the occurrences of "CLICK"
##The 5th column counts the occurrences of "CONV"

Last edited by Franklin52; 08-08-2013 at 09:56 AM.. Reason: Please use Code Tags
# 2  
Old 08-08-2013
Can you post what you have tried?
# 3  
Old 08-08-2013
Code:
awk -F, '$4~/CLICK/ {a[$1","$2","$3]++} $4~/CONV/ {b[$1","$2","$3]++} END {for (i in a) print i","a[i]+0"," b[i]+0}'
a,b,c,4,2
c,d,e,5,0
b,c,d,3,1
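One thing to be aware of: awk does not guarantee any order for `for (i in a)`, which is why the lines above are not in the order of the desired output. A minimal sketch of one way around that, assuming it is acceptable to pipe the result through sort. The function name `group_counts` is made up for this example; the awk program is the same as above, only with extra parentheses and spacing:

```shell
# Hypothetical helper: same awk program as above, with the result sorted by key.
group_counts() {
  awk -F, '
    $4 ~ /CLICK/ { a[$1 "," $2 "," $3]++ }
    $4 ~ /CONV/  { b[$1 "," $2 "," $3]++ }
    END { for (i in a) print i "," (a[i] + 0) "," (b[i] + 0) }
  ' "$@" | sort
}

# Usage with the sample input from post #1 fed on standard input.
sorted=$(group_counts <<'EOF'
c,d,e,CLICK
a,b,c,CLICK
a,b,c,CONV
c,d,e,CLICK
a,b,c,CLICK
a,b,c,CLICK
a,b,c,CONV
b,c,d,CLICK
c,d,e,CLICK
c,d,e,CLICK
b,c,d,CONV
a,b,c,CLICK
b,c,d,CLICK
b,c,d,CLICK
c,d,e,CLICK
EOF
)
printf '%s\n' "$sorted"
# a,b,c,4,2
# b,c,d,3,1
# c,d,e,5,0
```

The helper also accepts file names as arguments, since they are passed straight through to awk.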

This User Gave Thanks to Jotne For This Post:
# 4  
Old 08-08-2013
OMG - You are a genius!!!
Thanks so much!
# 5  
Old 08-08-2013
Code:
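awk -F, '
# Count CLICK rows per key built from fields 1-3
$4~/CLICK/ {a[$1","$2","$3]++}
# Count CONV rows per key
$4~/CONV/ {b[$1","$2","$3]++}
# Print each key with both counts; +0 turns a missing count into 0
END {for (i in a) print i","a[i]+0"," b[i]+0}'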

Here we use arrays to count the number of hits:
one array for CLICK and one for CONV.
Using $1","$2","$3 as the index gives array elements such as a["a,b,c"].
This creates one array element for every distinct combination of $1,$2,$3.
The ++ then adds up how many times each combination is found:
a["a,b,c"]++ is equivalent to a["a,b,c"]=a["a,b,c"]+1
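To see the key and the counter at work, here is a small sketch that prints the key and its running count for each input line. The two-line input and the variable name `key` are made up for illustration:

```shell
# Build the key from fields 1-3, increment its counter, and show both.
trace=$(printf '%s\n' 'a,b,c,CLICK' 'a,b,c,CLICK' |
  awk -F, '{ key = $1 "," $2 "," $3; a[key]++; print key, a[key] }')
printf '%s\n' "$trace"
# a,b,c 1
# a,b,c 2
```

Both lines share the same key string, so they update the same array element.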

END {for (i in a) print i","a[i]+0"," b[i]+0}'
The loop body runs once for every unique combination of $1,$2,$3 stored in array a,
in this case 3 times. Each time it prints the key and its two counts.
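Because the loop walks array a, a key that only ever appears with CONV would not be printed at all. That does not happen with the sample input, but if it can happen in your data, one possible variant is to record every key in a third array and loop over that instead. This is only a sketch: the array names `seen`, `click` and `conv` and the two-line input are invented here, and it compares $4 with == instead of a regex match, which is a change from the command above:

```shell
# Track every key in "seen" so that CONV-only groups are reported too.
all_keys=$(printf '%s\n' 'a,b,c,CLICK' 'x,y,z,CONV' |
  awk -F, '
    { key = $1 "," $2 "," $3; seen[key] = 1 }
    $4 == "CLICK" { click[key]++ }
    $4 == "CONV"  { conv[key]++ }
    END { for (k in seen) print k "," (click[k] + 0) "," (conv[k] + 0) }
  ' | sort)
printf '%s\n' "$all_keys"
# a,b,c,1,0
# x,y,z,0,1
```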

Search for "awk arrays" to learn more.
# 6  
Old 08-08-2013
BTW - why do we need the "+0"?
Is this to protect keys that have only CLICK but no CONV?
# 7  
Old 08-08-2013
If it does not find any hit, it prints an empty field.
To prevent this, add zero so it prints 0 when nothing is found.
Removing the +0 gives:
Code:
a,b,c,4,2
c,d,e,5,
b,c,d,3,1

vs
Code:
a,b,c,4,2
c,d,e,5,0
b,c,d,3,1
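The reason is that an array element that was never set is an empty string when printed as text but counts as 0 in arithmetic. A tiny sketch showing both cases, where the key "missing" is just a placeholder:

```shell
# An unset element prints as an empty string; adding 0 forces a number.
demo=$(awk 'BEGIN {
  print "[" b["missing"] "]"
  print "[" (b["missing"] + 0) "]"
}')
printf '%s\n' "$demo"
# []
# [0]
```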
