Count occurrence of string (based on type) in a column using awk


 
# 1  
Old 03-03-2015

Hello,

I have a table that looks like what is shown below:

Code:
AA
BB
CC
XY
PQ
RS
AA
BB
CC
XY
RS

I would like the total counts depending on the set they belong to:

if search pattern is in
Code:
{AA, BB, CC} --> count them as Type1 | wc -l

or if pattern is in
Code:
{XY, RS, PQ} --> count them as Type2 | wc -l

So, for the table above, the output should be Type1 = 6 and Type2 = 5.

I can run a simple awk for each type and then combine the results, but that does not seem very efficient. Any (one-liner) suggestions?

Many thanks!
~Guss

Last edited by Gussifinknottle; 03-08-2015 at 06:46 AM.. Reason: Typo
# 2  
Old 03-03-2015
As a "one liner" this would look like scrambled eggs.
Code:
awk '/AA|BB|CC/ {fcount++}
     /XY|RS|PQ/ {scount++}
     END { printf("%s\n%s\n",
           "Type1 = " fcount, "Type2 = " scount)
         }
' FILE

Of course, feel free to scramble. :-)
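One thing to keep in mind: the patterns above are unanchored regular expressions, so a line such as XAAY would also be counted as Type1. If the column must match a set member exactly, a lookup table is one alternative. The following is only a sketch under the assumption that the value sits in the first field; the temporary file and the array names are made up for illustration.

```shell
# Sample data from post #1, written to a temporary file (the name is arbitrary).
tmp=$(mktemp)
printf '%s\n' AA BB CC XY PQ RS AA BB CC XY RS > "$tmp"

out=$(awk '
BEGIN {
    # Build a lookup table: value -> type number.
    n = split("AA BB CC", a, " "); for (i = 1; i <= n; i++) type[a[i]] = 1
    n = split("XY RS PQ", b, " "); for (i = 1; i <= n; i++) type[b[i]] = 2
}
$1 in type { count[type[$1]]++ }   # exact match on the first field only
END { printf "Type1 = %d\nType2 = %d\n", count[1], count[2] }
' "$tmp")

printf '%s\n' "$out"
rm -f "$tmp"
```

Lines that belong to neither set are simply skipped, and %d prints 0 for a type that never occurs.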
# 3  
Old 03-03-2015
Try also
Code:
awk '$1==$2 {TYPE1++; next} {TYPE2++}  END {print "Type1 =", TYPE1, ", Type2 =", TYPE2}' FS="" file
Type1 = 6 , Type2 = 5

# 4  
Old 03-03-2015
Quote:
Originally Posted by RudiC
Try also
Code:
awk '$1==$2 {TYPE1++; next} {TYPE2++}  END {print "Type1 =", TYPE1, ", Type2 =", TYPE2}' FS="" file
Type1 = 6 , Type2 = 5

Although the above will work in some implementations of awk, the standards say that the behavior when FS is an empty string is unspecified. A more portable way of writing the above would be:
Code:
awk '{	type[substr($1, 1, 1) == substr($1, 2, 1)]++ }
END {	print "Type1 = " type[1] ", Type2 = " type[0] }' file

which, when file contains the table shown in post #1 in this thread, produces the output:
Code:
Type1 = 6, Type2 = 5

If you want to try this on a Solaris/SunOS system, change awk to /usr/xpg4/bin/awk.
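A small caveat with the array approach: if the input has no rows of one kind, the corresponding element (type[0] or type[1]) is never created and prints as an empty string. Adding 0 forces a number. A minimal sketch, using made-up input that contains only doubled-letter rows:

```shell
# Hypothetical input with no Type2 rows at all.
out=$(printf '%s\n' AA BB CC | awk '
{ type[substr($1, 1, 1) == substr($1, 2, 1)]++ }   # index is 1 (same letters) or 0
END { print "Type1 = " (type[1] + 0) ", Type2 = " (type[0] + 0) }')

printf '%s\n' "$out"
```

Note that both this and the FS="" version classify rows by whether the two characters are equal, which happens to fit the sample table; they do not test membership in the sets {AA, BB, CC} and {XY, RS, PQ} as such.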
 