frequency count using shell


 
# 1  
Old 10-09-2012

Hello everyone,
please consider the following lines of a matrix

HTML Code:
[574,]   59   32 
[575,]   59   32 
[576,]   59   32 
[577,]   59   32 
[578,]   59   32 
[579,]   59   32 
[580,]   59   32 
[581,]   60   32 
[582,]   60   33 
[583,]   60   33 
[584,]   60   33 
[585,]   60   33 
[586,]   60   33 
[587,]   60   33 
[588,]   60   33 
[589,]   60   33 
[590,]   60   33 
[591,]   61   33 
[592,]   61   33 
[593,]   61   33 
[594,]   61   33 
[595,]   61   33 
[596,]   61   33 
[597,]   61   33 
[598,]   61   33 
[599,]   61   33 
[600,]   61   33 
[601,]   62   34 
Is it possible to count the percent frequency of each distinct value in $2?

Just like this:

HTML Code:
59  25.00%
60  35.71%
61  35.71%
62  3.57%
# 2  
Old 10-09-2012
Code:
awk '{ A[$2]++ } END { for(X in A) printf("%s\t%s\n", X, (A[X]*100)/NR) }' inputfile
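
A minimal sketch of the same counting idea on a tiny sample, assuming blank lines may appear in the input. NR counts every record, blank ones included, so this variant keeps its own total of non-empty lines. The helper name freq_pct and the sample rows are hypothetical, not from the thread.

```shell
# freq_pct: hypothetical helper name, used here only for illustration.
# Counts each distinct value of field 2 and prints its share of all
# non-empty lines as a percentage with two decimals.
freq_pct() {
    awk 'NF { count[$2]++; total++ }
         END { for (v in count) printf("%s\t%.2f\n", v, count[v] * 100 / total) }' "$@" | sort -n
}

# Made-up sample: four data rows plus one blank line that is ignored.
printf '[1,] 59 32\n[2,] 59 32\n\n[3,] 60 32\n[4,] 61 33\n' | freq_pct
```

The trailing sort -n is there because awk does not guarantee any particular order for a for (v in count) loop.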

This User Gave Thanks to Corona688 For This Post:
# 3  
Old 10-09-2012
in case you need to add "%" and use float

Code:
awk '{ A[$2]++ } END { for(X in A) printf("%d\t%.2f%\n", X, (A[X]*100)/NR) }' inputfile

This User Gave Thanks to fastlane3000 For This Post:
# 4  
Old 10-09-2012
Same solution, added format:
Code:
awk '{ A[$2]++ } END { for(X in A) printf("%s\t%5.2f%%\n", X, (A[X]*100)/NR) }' inputfile | sort -n


Last edited by rdrtx1; 10-09-2012 at 06:47 PM..
This User Gave Thanks to rdrtx1 For This Post:
# 5  
Old 10-09-2012
Quote:
Originally Posted by fastlane3000
in case you need to add "%" and use float

Code:
awk '{ A[$2]++ } END { for(X in A) printf("%d\t%.2f%\n", X, (A[X]*100)/NR) }' inputfile

It showed a message like this. How should I adjust the code? I'm not familiar with printf. Thank you!

HTML Code:
awk: weird printf conversion %

 input record number 61124, file HB143-0W-A4.txt
 source line number 1
awk: not enough args in printf(%d	%.2f%
)
 input record number 61124, file HB143-0W-A4.txt
 source line number 1
# 6  
Old 10-09-2012
Code:
awk '{ A[$2]++ } END { for(X in A) printf("%d\t%.2f%%\n", X, (A[X]*100)/NR) }' inputfile
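
For anyone wondering why this works: in a printf format string, % starts a conversion, so a literal percent sign has to be written as %%. A lone % followed by a newline is what some awk versions reject with "weird printf conversion". A small check, assuming 10 matching lines out of 28 records as in the sample matrix (the variable name line is just for illustration):

```shell
# 1000 / 28 stands in for (A[X]*100)/NR with 10 matches out of 28 records.
# %% in the format string prints a single literal percent sign.
line=$(awk 'BEGIN { printf("%d\t%.2f%%\n", 60, 1000 / 28) }')
printf '%s\n' "$line"
```

This prints 60, a tab, then 35.71%.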

This User Gave Thanks to rdrtx1 For This Post:
# 7  
Old 10-09-2012
Maybe you're not using the same awk as I am (I have the GNU version, gawk).
Just try rdrtx1's solution; it's more complete, with a "sort":
Code:
awk '{ A[$2]++ } END { for(X in A) printf("%s\t%3.2f%%\n", X, (A[X]*100)/NR) }' inputfile | sort -n

I'd rather use a width of 3; that's enough because we can't have more than 100%.
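
A side note on the width, as a small sketch rather than a correction of anyone's script: in printf the number before the dot is the minimum total field width, counting the decimal point and the decimals, not the number of integer digits. A value such as 3.57 already needs 4 characters, so a width of 3 never pads anything, while a width of 6 lines up everything up to 100.00. The variable names narrow and wide are hypothetical.

```shell
# Width 3 is smaller than the 4 characters of "3.57", so nothing is padded.
narrow=$(awk 'BEGIN { printf("[%3.2f]", 3.57) }')
# Width 6 pads with two leading spaces, matching the length of "100.00".
wide=$(awk 'BEGIN { printf("[%6.2f]", 3.57) }')
printf '%s %s\n' "$narrow" "$wide"
```

This prints [3.57] [  3.57].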
This User Gave Thanks to fastlane3000 For This Post: