Printing most frequent string in column


 
# 1  
Old 04-11-2016

I am trying to put together a script that will output the most frequent string in a column. This is what I have:
Code:
 awk '{count[$1]++} END {for ( i in count ) print i, count[i] }'

Of course, my script outputs every distinct string along with its count. However, I just need the most frequent one (there will always be exactly one).
I would appreciate any help.
# 2  
Old 04-11-2016
Perhaps something like:
Code:
 awk '{if(count[$1]++ >= max) max = count[$1]} END {for ( i in count ) if(max == count[i]) print i, count[i] }'
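
If a single winner is guaranteed, as stated in post #1, the END loop can be avoided by remembering the leading string while reading. This is only a sketch under that assumption: the file name sample.txt, the function name most_frequent and the variable best are made up for illustration, and in case of a tie it prints only the first string to reach the top count rather than all of them.

```shell
# Hypothetical sample input: "a" appears 5 times, "b" 3 times, "c" twice.
printf '%s\n' a a a a a b b b c c > sample.txt

# Keep the running maximum while reading, so END needs no loop.
# "best" is the string currently in the lead, "max" is its count.
most_frequent() {
    awk '{ if (++count[$1] > max) { max = count[$1]; best = $1 } }
         END { if (NR) print best, max }' "$@"
}

most_frequent sample.txt    # prints: a 5
```

The NR check in END just keeps the command silent on empty input.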

This User Gave Thanks to Don Cragun For This Post:
# 3  
Old 04-11-2016
Don

Thanks a TON! I got it
# 4  
Old 08-30-2016
How could I modify the above script so that it prints every string that accounts for more than 30% of all entries in the column?

Last edited by Xterra; 08-30-2016 at 05:04 PM..
# 5  
Old 08-30-2016
Individual strings with percentage > 30 or strings that collectively have a percentage > 30?
# 6  
Old 08-30-2016
Don
Let's say I have the following file:
Code:
 a
 a
 a
 a
 a
 b
 b
 b
 c
 c

The desired output should be:

Code:
 a 5
 b 3

or
Code:
 a 50%
 b 30%

c should be excluded since it only accounts for 20% of the total count.
Thanks!
PS: I tried modifying the variable max, but I could not get it to print the desired output.
# 7  
Old 08-30-2016
How about
Code:
awk -v PCT=".3" '
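        {if(++count[$1] > count[max]) max = $1        # max holds the most frequent string so far
         tot++                                         # total number of input lines
        }
END     {# print max, count[max]                      # solution for the former problem
         # print every string that makes up at least PCT of all lines
         for (c in count) if (count[c]/tot >= PCT) print c, count[c]
        }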
' file
a 5
b 3
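
For the second output format asked for in post #6 (a 50%, b 30%), a variant along these lines might work. It is a sketch, not a tested drop-in: the wrapper name share_report and its threshold argument (given in percent rather than as a fraction) are assumptions for illustration. The comparison multiplies instead of dividing, so a string at exactly the threshold is kept without relying on floating-point equality, and the result is piped through sort because awk does not guarantee the order of for (c in count).

```shell
# Sample input from post #6: a x5, b x3, c x2.
printf '%s\n' a a a a a b b b c c > file

# share_report THRESHOLD_PERCENT [FILE...]
# Prints each string whose share of all lines is at least the threshold.
share_report() {
    pct=$1; shift
    awk -v PCT="$pct" '
        { count[$1]++; tot++ }
        END {
            for (c in count)
                if (count[c] * 100 >= PCT * tot)      # same test as count/tot >= PCT/100
                    printf "%s %d%%\n", c, count[c] * 100 / tot
        }' "$@" | sort
}

share_report 30 file
```

With the sample file this prints a 50% and b 30%. Note that %d truncates, so a share of 33.3% would be shown as 33%.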

This User Gave Thanks to RudiC For This Post: