finding duplicates in csv based on key columns

11-24-2011

Hi team,

I have CSV files with 20 columns. I want to find the duplicates in a file based on column1, column10, column4, column6, column8 and column2. If all of those columns have the same values, the record should count as a duplicate.

Can anyone help me find the duplicates?

Thanks in advance.

I first sorted the file on the key columns and viewed the data, but the duplicates don't show up properly that way.

thanks,
Baski

baskivs

11-24-2011

Please post some sample input and the expected output.

jayan_jay

11-24-2011

Try this:

Quote:

'inputFile.csv'
------------
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
11,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,40,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,66,7,8,9,10,11,12,13,14,15,16,17,18,19,20

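# Prints the first record seen for each key (columns 1, 10, 4, 6, 8, 2);
# later records with the same key are dropped.
awk -F, '!dup[$1,$10,$4,$6,$8,$2]++' inputFile.csv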
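Note that the one-liner above outputs the de-duplicated file, not the duplicates. If the goal is to list the duplicate records themselves, dropping the `!` should do it. Below is a rough sketch using the sample input from this post; the function names `find_repeats` and `find_groups` are made up for illustration, and it assumes plain comma-separated fields with no quoted commas inside values.

```shell
# Recreate the sample input from the post.
cat > inputFile.csv <<'EOF'
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
11,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,40,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20
1,2,3,4,5,66,7,8,9,10,11,12,13,14,15,16,17,18,19,20
EOF

# Hypothetical helper: print only the 2nd, 3rd, ... record for each key.
# dup[key]++ is 0 (false) the first time a key is seen and non-zero after.
find_repeats() {
    awk -F, 'dup[$1,$10,$4,$6,$8,$2]++' "$1"
}

# Hypothetical helper: print every record whose key occurs more than once,
# including the first occurrence. The file is read twice: the first pass
# counts keys, the second pass prints records with a count above 1.
find_groups() {
    awk -F, '
        NR == FNR { cnt[$1,$10,$4,$6,$8,$2]++; next }
        cnt[$1,$10,$4,$6,$8,$2] > 1
    ' "$1" "$1"
}

find_repeats inputFile.csv   # prints the 4th record only
find_groups inputFile.csv    # prints the 1st and 4th records
```

With this input only the 4th record repeats an earlier key: the 2nd differs in column 1, the 3rd in column 10, and the 5th in column 6.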

Sheel
