how to delete duplicate rows in a file


 
# 1  
Old 06-18-2008

I have a file with the following content.

Code:
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""


I want to know the command to delete the duplicate rows in the whole file.

Thanks in advance.

Last edited by Yogesh Sawant; 06-18-2008 at 10:13 AM.. Reason: added code tags
# 2  
Old 06-18-2008
Have a look at the "sort" command...

Code:
sort -u /your/file > /new/file

can maybe help (untested!)
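
Applied to the sample data above, sort -u would keep one copy of each distinct line, so the output should be just the four unique rows:

Code:
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""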
# 3  
Old 06-18-2008
Moved from "Solaris" to "Shell Programming and Scripting" because this is not a Solaris-specific problem!
# 4  
Old 06-18-2008
man uniq
# 5  
Old 06-18-2008
Beware of the order of the data

If the file is already in sorted order, then one can simply do
Code:
uniq file1

However, if the file is in random order, you will not get a truly unique output: uniq only compares adjacent lines, so non-adjacent duplicates could remain in your output. Thus, it is probably better to do
Code:
sort -u file1

Both of the above commands send their output to the screen; to send it to a file instead, try
Code:
sort -u file1 >file2
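To see the difference, here is a minimal sketch (demo.txt is a made-up file name used only for illustration):

Code:
# build a small unsorted file with a non-adjacent duplicate
printf '5575\n534\n5575\n' > demo.txt
uniq demo.txt      # prints 5575, 534, 5575 - the repeated line survives
sort -u demo.txt   # prints 534, 5575 - every duplicate is removed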

# 6  
Old 06-18-2008
or if you like, an awk example
Code:
awk '!arr[$0]++' filename > newfile
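Here arr[$0] counts how many times each whole line ($0) has been seen; !arr[$0]++ is true only on the first occurrence, so awk prints first occurrences and silently drops later duplicates, preserving the original line order (unlike sort -u). A commented long-hand equivalent, purely for illustration (the array name is arbitrary):

Code:
awk '{
    if (!seen[$0]) {   # true only the first time this exact line appears
        print          # keep the first occurrence
    }
    seen[$0]++         # count the line so later copies are skipped
}' filename > newfile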
