Post 302146605 by Donkey25 on Wednesday 21st of November 2007 10:38:48 AM
Finding the most common entry in a column

Hi,

I have a comma-separated file with 3 columns and about 5000 lines. I want to find the most common value in column 3 using awk, a shell script, or whatever works, but I'm totally stuck on how to do this.

e.g.

value1,value2,bob
value1,value2,bob
value1,value2,bob
value1,value2,dave
value1,value2,james

Clearly, in the above example the most popular value in column 3 is "bob", but how would I write a script to work this out?

Many thanks
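
One possible approach, offered as a minimal sketch rather than a definitive answer: tally each column-3 value in an awk associative array, then pick the largest count in the END block. The function name most_common and the temporary sample file are made up for illustration; the sketch also assumes plain comma-separated fields with no quoted commas.

```shell
#!/bin/sh
# Hypothetical helper (name is an assumption, not from the post):
# print the most frequent value in column 3 and its count.
# Reads the file(s) given as arguments, or standard input if none.
most_common() {
    awk -F, '
        { count[$3]++ }                  # tally each column-3 value
        END {
            for (v in count)             # find the value with the highest tally
                if (count[v] > max) { max = count[v]; best = v }
            if (NR) print best, max      # print nothing for empty input
        }
    ' "$@"
}

# Sample data from the post, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
value1,value2,bob
value1,value2,bob
value1,value2,bob
value1,value2,dave
value1,value2,james
EOF

most_common "$sample"    # prints: bob 3
rm -f "$sample"
```

If two values tie for the highest count, this prints only one of them, and which one depends on awk's array traversal order. A pipeline such as `cut -d, -f3 file | sort | uniq -c | sort -rn | head -n 1` is another common way to get a similar result without awk.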
 
