Awk to Count Multiple patterns in a huge file


 
# 1  
Old 06-07-2012

Hi,


I have a file that is 430K lines long. It has records like the ones below:
Code:
|site1|MAP
|site2|MAP
|site1|MODAL
|site2|MAP
|site2|MODAL
|site2|LINK
|site1|LINK

My task is to count the number of times MAP, MODAL and LINK occur for each site and write new records like the ones below to a new file:

Code:
SiteName MAP MODAL LINK
--------------------------
site1     | 1     |  1  |  1
site2     | 2     |  1  |  1

I have accomplished this with grep, as follows:
Code:
countmap=$(grep "$SITEID" "$FILENAME" | grep MAP | wc -l)
countmodal=$(grep "$SITEID" "$FILENAME" | grep MODAL | wc -l)
countlink=$(grep "$SITEID" "$FILENAME" | grep LINK | wc -l)
echo "$SITEID|$countmap|$countmodal|$countlink|"

However, with a 430K-line file it took more than an hour to run. My knowledge of awk is rudimentary at best. Please help me with this.

Last edited by Franklin52; 06-08-2012 at 07:41 AM.. Reason: Please use code tags for data and code samples, thank you
# 2  
Old 06-07-2012
Try this:

Code:
awk -F\| '{sites[$2]; c[$2,$3]++ }
END {
   OFS=" | "
   print "SiteName MAP MODAL LINK"
   print "-----------------------"
   for(site in sites) print site, 0+c[site,"MAP"],0+c[site,"MODAL"], 0+c[site,"LINK"]
}' infile
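
For anyone following along, here is a small sketch of the same one-pass idea run against the sample records from post #1. The scratch file name sample.txt and the variable out are only assumptions for the illustration. Because every line starts with "|", field $1 is empty, so the site is $2 and the type is $3. The order of for (site in sites) is unspecified in awk, so the result is piped through sort.

```shell
# Sample records from post #1, written to a scratch file (name is arbitrary).
cat > sample.txt <<'EOF'
|site1|MAP
|site2|MAP
|site1|MODAL
|site2|MAP
|site2|MODAL
|site2|LINK
|site1|LINK
EOF

# Single pass over the file: remember each site and count (site, type) pairs.
# %d prints 0 for a pair that never occurred.
out=$(awk -F'|' '
    { sites[$2]; c[$2, $3]++ }
    END {
        for (site in sites)
            printf "%s|%d|%d|%d\n", site, c[site, "MAP"], c[site, "MODAL"], c[site, "LINK"]
    }' sample.txt | sort)

echo "$out"
```

With the seven sample records this should print site1|1|1|1 and site2|2|1|1, which matches the table requested in post #1. The file is read once no matter how many sites it contains, whereas the grep approach reads it three times per site.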

# 3  
Old 06-07-2012
Code:
[ok@x60 ~]$ cat xxx | awk '{a[$1]++} END{for (x in a) {print x "\t" a[x]}}'

|site2|MAP 4
|site2|LINK 2
|site1|LINK 2
|site1|MAP 6
|site1|MODAL 2
|site2|MODAL 2
[ok@x60 ~]$ cat xxx
|site1|MAP
|site1|MAP
|site1|MAP
|site1|MAP
|site1|MAP
|site2|MAP
|site1|MODAL
|site2|MAP
|site2|MODAL
|site2|LINK
|site1|LINK
|site1|MAP
|site2|MAP
|site1|MODAL
|site2|MAP
|site2|MODAL
|site2|LINK
|site1|LINK

Last edited by new_item; 06-07-2012 at 08:24 PM..
# 4  
Old 06-07-2012
Thanks a lot for the reply.

I tried the following command. It formats the data correctly, but it incorrectly prints the counts as 0|0|0. Can you let me know what I am doing wrong?
Code:
awk -F"|" '{c[$2,$3]++;b[$2]=$2 FS 0+c[i,"MAP"] FS 0+c[i,"MODAL"] FS 0+c[i,"LINK"] FS} END {for ( i in b) { print b[i]}}' filename;


Last edited by Franklin52; 06-08-2012 at 07:41 AM.. Reason: Please use code tags for data and code samples, thank you
# 5  
Old 06-07-2012
Good to see you working on your own solution from my example.

You are really close. The only problem is that i has not been assigned in the main block, so c[i,"MAP"] looks up an element that was never incremented. Assign it first or use $2 instead:
Code:
awk -F"|" '{c[$2,$3]++;b[$2]=$2 FS 0+c[$2,"MAP"] FS 0+c[$2,"MODAL"] FS 0+c[$2,"LINK"] FS} END {for ( i in b) { print b[i]}}' filename
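
A minimal sketch of why the unassigned i gives zeros, using two made-up input lines (the variable names broken, fixed and n are only for this illustration). An unassigned awk variable is the empty string, so c[i,"MAP"] refers to a key that no input line ever increments.

```shell
# i is never assigned, so the lookup key has an empty site and the count stays 0.
broken=$(printf '|site1|MAP\n|site1|MAP\n' |
    awk -F'|' '{ c[$2, $3]++; n = 0 + c[i, "MAP"] } END { print n }')

# $2 is the site of the current line, so the lookup hits the element just incremented.
fixed=$(printf '|site1|MAP\n|site1|MAP\n' |
    awk -F'|' '{ c[$2, $3]++; n = 0 + c[$2, "MAP"] } END { print n }')

echo "with i: $broken, with \$2: $fixed"
```

This should print "with i: 0, with $2: 2".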

# 6  
Old 06-08-2012
Thanks a lot, Chubler_XL. That worked like a charm. It took about 40 seconds to run, which is awesome. Thanks once again.