awk to count pattern matches


 
# 8  
Old 12-12-2008
Quote:
Originally Posted by vgersh99
that doesn't work for multiple (and sequential) occurrences of a pattern on the same line/record:
Code:
echo '1,1,2,5,5,5,6,5,4,5,7'| nawk '{while (sub(/,5,/,",")) t++}END{print t}'


That's only good for one line, and there's no need for a loop:

Code:
 awk '{ total += gsub(/,5,/,"") } END { print total }'

# 9  
Old 12-12-2008
Hi.

Another method:
Code:
#!/bin/bash -

# @(#) s1       Demonstrate count of strings with awk, field separator.

echo
echo "(Versions displayed with local utility \"version\")"
version >/dev/null 2>&1 && version "=o" $(_eat $0 $1) awk
set -o nounset

FILE=${1-data1}

echo
echo " Data file $FILE:"
cat "$FILE"

echo
echo " Results:"
# Split each record on the regex ",5," and add up the field counts (NF).
awk -F",5," '
BEGIN   { t = 0 }
        { t += NF }
END     { print t }
' "$FILE"

Producing:
Code:
% ./s1

(Versions displayed with local utility "version")
Linux 2.6.11-x1
GNU bash 2.05b.0
GNU Awk 3.1.4

 Data file data1:
1,1,2,5,5,5,6,5,4,5,7

 Results:
5

cheers, drl
# 10  
Old 12-13-2008
Quote:
Originally Posted by vgersh99
that doesn't work for multiple (and sequential) occurrences of a pattern on the same line/record:
Code:
echo '1,1,2,5,5,5,6,5,4,5,7'| nawk '{while (sub(/,5,/,",")) t++}END{print t}'

[...]
Quote:
Originally Posted by cfajohnson
That's only good for one line
[...]
For one line? Why?
# 11  
Old 12-13-2008
Quote:
Originally Posted by radoulov
For one line? Why?

Sorry, I don't know what I was thinking. (But gsub() does avoid the while loop.)
# 12  
Old 12-13-2008
Quote:
Originally Posted by cfajohnson

Sorry, I don't know what I was thinking. (But gsub() does avoid the while loop.)
I thought the same, but:
Code:
$ echo '1,1,2,5,5,5,6,5,4,5,7' | awk '{ total += gsub(/,5,/,"") } END { print total }'
4

# 13  
Old 12-13-2008
Hi.

In many RE implementations, matches cannot overlap: once part of a string has been matched, it is not reused for the next match. The result noted above makes sense in that context.
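The non-overlapping behaviour can be seen by running the two one-liners from this thread side by side. This is only a sketch: the function names count_gsub and count_sub_loop are made up for illustration, and the awk programs are the ones already posted above, with "+ 0" added so that an empty count prints as 0.

```shell
# Hypothetical helper names, used only for this illustration.

# One left-to-right pass; text consumed by a match is never reused.
count_gsub() {
    printf '%s\n' "$1" | awk '{ total += gsub(/,5,/, "") } END { print total + 0 }'
}

# Repeated sub(): replacing ",5," with "," puts the shared comma back,
# so the next pass can match a 5 that the single gsub() pass skipped.
count_sub_loop() {
    printf '%s\n' "$1" | awk '{ while (sub(/,5,/, ",")) t++ } END { print t + 0 }'
}

count_gsub     '1,1,2,5,5,5,6,5,4,5,7'   # prints 4
count_sub_loop '1,1,2,5,5,5,6,5,4,5,7'   # prints 5
```

Neither variant counts a 5 in the first or last field, because the pattern requires a comma on both sides.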

I was surprised that my solution in post #9 worked, since I would have expected the same behavior from a field split that uses a regular expression as the separator.

Perhaps the explanation is that gsub replaces part of the string with the empty string, so the next ",5," in the original is no longer present, only "5,", and the one after it is considered instead. For the field split, there is no replacement ...

cheers, drl
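A quick check suggests a simpler reading of the field-split result: NF is the number of fields, which on a non-empty line is one more than the number of non-overlapping separators. For the sample line that is 4 + 1 = 5, which happens to equal the number of 5 fields. The sketch below assumes nothing beyond standard awk; the function names count_by_nf and count_fields are hypothetical, and count_fields is a different technique (split on plain commas and compare each field), not something posted in this thread.

```shell
# Hypothetical helper names, used only for this illustration.

# NF with ",5," as the separator: separators found + 1 per non-empty line.
count_by_nf() {
    printf '%s\n' "$1" | awk -F',5,' '{ print NF }'
}

# Alternative: split on "," and count the fields that are exactly "5".
# Unlike the ",5," pattern, this also counts a 5 in the first or last field.
count_fields() {
    printf '%s\n' "$1" | awk -F',' '{ n = 0; for (i = 1; i <= NF; i++) if ($i == "5") n++; print n }'
}

count_by_nf  '1,1,2,5,5,5,6,5,4,5,7'   # prints 5
count_by_nf  '1,2,3'                   # prints 1, although there is no 5
count_fields '1,1,2,5,5,5,6,5,4,5,7'   # prints 5
count_fields '1,2,3'                   # prints 0
```

The second call shows why summing NF overcounts in general: a line with no match still contributes one field.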
# 14  
Old 12-13-2008
That's very useful, guys. Thanks.

I was using a separate while loop anyway to read the file line by line, and then using pattern matching to extract and count ,5,

I was getting confused with awk. When do you use something like {count++;}? I mean, when do you write count++ with a semicolon and when without?
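On the semicolon question: in awk a semicolon (or a newline) separates statements, so it is needed between two statements on the same line and is optional after the last statement before a closing brace. A small sketch, with made-up function names:

```shell
# Hypothetical helper names, used only for this illustration.

# No semicolon: a single statement directly before the closing brace.
count_plain() { awk '{ count++ } END { print count }'; }

# A trailing semicolon is allowed and changes nothing.
count_semi() { awk '{ count++; } END { print count; }'; }

# Two statements on one line: here the semicolon is required.
count_two() { awk '{ count++; last = $0 } END { print count, last }'; }

printf 'a\nb\nc\n' | count_plain   # prints 3
printf 'a\nb\nc\n' | count_semi    # prints 3
printf 'a\nb\nc\n' | count_two     # prints 3 c
```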