how to delete duplicate rows in a file


 
# 1  
Old 06-18-2008

I have a file with the following content.

Code:
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""


I want to know the command to delete the duplicate rows in the whole file.

Thanks in advance.

Last edited by Yogesh Sawant; 06-18-2008 at 10:13 AM.. Reason: added code tags
# 2  
Old 06-18-2008
Have a look at the "sort" command...

Code:
sort -u /your/file > /new/file

can maybe help (untested!)
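
Applied to the sample data above, sort -u would keep one copy of each distinct line, so the output should be just the four unique rows:

Code:
"0000000","ABLNCYI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCYO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZI","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""
"0000000","ABLNCZO","BOTH",1049,2058,"XYZ","5711002","","Y","","","","","","","",""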
# 3  
Old 06-18-2008
Moved from "Solaris" to "Shell Programming and Scripting" because this is not a Solaris-specific problem!
# 4  
Old 06-18-2008
man uniq
# 5  
Old 06-18-2008
Beware of the order of the data

If the file is already in sorted order, then one can simply do
Code:
uniq file1

However, if the file is in random order, you will not get a truly unique output: uniq only compares adjacent lines, so non-adjacent duplicates could remain in your output. Thus, it is probably better to do
Code:
sort -u file1

Both of the above commands send their output to the screen; to send it to a file instead, try
Code:
sort -u file1 >file2
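To see the difference, here is a minimal sketch (demo.txt is a made-up file name used only for illustration):

Code:
# build a small unsorted file with a non-adjacent duplicate
printf '5575\n534\n5575\n' > demo.txt
uniq demo.txt      # prints 5575, 534, 5575 - the repeated line survives
sort -u demo.txt   # prints 534, 5575 - every duplicate is removed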

# 6  
Old 06-18-2008
or if you like, an awk example
Code:
awk '!arr[$0]++' filename > newfile
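Here arr[$0] counts how many times each whole line ($0) has been seen; !arr[$0]++ is true only on the first occurrence, so awk prints first occurrences and silently drops later duplicates, preserving the original line order (unlike sort -u). A commented long-hand equivalent, purely for illustration (the array name is arbitrary):

Code:
awk '{
    if (!seen[$0]) {   # true only the first time this exact line appears
        print          # keep the first occurrence
    }
    seen[$0]++         # count the line so later copies are skipped
}' filename > newfile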
