Getting the most repeated column | Unix Linux Forums | Shell Programming and Scripting

    #1  
Old 07-10-2013
teefa

Hi all,

I want to find the most repeated ID for each name in my file.
File:

Code:
name,ID 
adam,12345  ----1
adam,12345  ----2
adam,934
adam,12345  ----3
john,14
john,13
john,25 ----1 
john,25 ----2
tom,1  -----1
tom,2  -----1

The output should be:

Code:
adam,12345,4    ----[4] means adam appears 4 times
john,25,4
tom,1,2 ----as it appears first, or if possible get tom,1,2,2 --- IDs (1) and (2), with 2 appearances

Thanks a lot in advance.
    #2  
Old 07-10-2013
anbu23
What have you tried?
    #3  
Old 07-10-2013
teefa
I have tried:

Code:
cat file | uniq -c | /usr/xpg4/bin/awk -F"," '!a[$1,$2]++'

That gives me the first or most repeated record, but in the case of tom I want the two IDs from the two different records combined into one record.

It's not that simple, though:

Code:
name,ID,ID1,ID2
adam,12345,1,2  ----1
adam,12345,1,1  ----2
adam,934,1,2
adam,12345,2,2  ----3
john,14
john,13
john,25 ----1 
john,25 ----2
tom,1  -----1
tom,2  -----1

From this I want to get, for example:

Code:
adam,12345,1,2    ---- the most repeated fields, in one record

    #4  
Old 07-10-2013
zaxxon
I am not sure I understand what you want: the "adam,12345" lines in your second post differ from the example in post #1, with two new fields that were not there before.
The input and output examples seem to be inconsistent.

Anyway, here is a blind shot that takes the first example as input, without the ----[n] annotations:

Code:
$ awk 'NR > 1{_[$1]++} END{for(a in _){print a ","  _[a]}}' infile | sort -nt, -k3| tail -1
adam,12345,3

The cat in your code is not needed.
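
The one-liner above prints only the single most repeated line in the whole file. For one line per name, as in the desired output in post #1, something along these lines might work. It is only a sketch: the function name most_repeated is made up, it assumes clean name,ID rows without the ----n annotations or trailing blanks, and it takes the third output field to be the total number of rows for the name, which is how I read post #1. On a tie, the ID that reaches the top count first wins.

```shell
# Hypothetical helper (not from the thread): for each name, print
# name,most-frequent-ID,total-rows-for-that-name.
# Assumes a header line and clean "name,ID" rows with no annotations.
most_repeated() {
    awk -F, '
        NR > 1 {                                        # skip the header line
            if (!($1 in total)) order[++names] = $1     # remember names in first-seen order
            total[$1]++                                 # rows seen for this name
            n = ++count[$1, $2]                         # rows seen for this name,ID pair
            if (n > best[$1]) { best[$1] = n; id[$1] = $2 }   # strict ">" keeps the earlier ID on a tie
        }
        END {
            for (i = 1; i <= names; i++) {
                name = order[i]
                print name "," id[name] "," total[name]
            }
        }
    ' "$@"
}
```

With the first example saved as infile (annotations removed), most_repeated infile should print adam,12345,4 then john,25,4 then tom,1,2.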
    #5  
Old 07-10-2013
teefa
I have tried it and it works, but I think you could use

Code:
NR>=1  instead of NR>1

Thanks, but could you look at the second example?
    #6  
Old 07-10-2013
zaxxon
No, that's not correct.
NR>=1 means "greater than or equal to 1". Line numbering starts at 1, so the condition is true for every line and changes nothing. You could simply leave it out if you want the header counted as well.
If you want to skip the header line, you have to use NR>1.
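
A quick way to see the difference for yourself, using a made-up three-line sample (one header plus two data rows; the variable names are only for illustration):

```shell
# Hypothetical sample data: a header line followed by two rows.
sample='name,ID
adam,12345
john,25'

# NR >= 1 is true for every line, so the header is kept (same as no condition at all).
with_header=$(printf '%s\n' "$sample" | awk 'NR >= 1')

# NR > 1 is false only for line 1, so the header is dropped.
without_header=$(printf '%s\n' "$sample" | awk 'NR > 1')

printf 'NR>=1:\n%s\n\n' "$with_header"
printf 'NR>1:\n%s\n' "$without_header"
```

The first result still contains the name,ID line; the second starts at adam,12345.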

So if the second example is a new or altered request, you should be able to adapt the code given above to handle it. If you do not understand the code, that is no problem, but you have to let us know.
This is not a script drive-in.

Last edited by zaxxon; 07-10-2013 at 07:02 AM.. Reason: spelling