How to delete or remove duplicate lines in a file
# 1  
Old 07-20-2009
How to delete or remove duplicate lines in a file

Hi, please help me remove duplicate lines from a file.
I have a file with a huge number of lines, and I want to remove selected lines from it.
Also, where duplicate lines exist, I want to delete the rest and keep just one of them.
Please help me with any Unix commands, or even a Fortran program.
For example:
Code:
SIG   50   12   0   34   87   3.00  37.0000N  100.0000E
SIG   50   12   0   34   87   3.00  37.0000N  100.0000E
SIG   18    7   9    0    0   0.00  36.0000N   60.0000E
SSR   40    7   0    0    0   0.00  35.2000N   60.4000E

Here I want the output to look like:
Code:
SIG   50   12   0   34   87   3.00  37.0000N  100.0000E
SIG   18    7   9    0    0   0.00  36.0000N   60.0000E
SSR   40    7   0    0    0   0.00  35.2000N   60.4000E


Last edited by Yogesh Sawant; 07-20-2009 at 07:36 AM.. Reason: added code tags
# 2  
Old 07-20-2009
If the order of the lines isn't important, sort -u. If the duplicates always appear grouped (as in your example), a simple uniq should suffice.

If there's no grouping and you want to keep the order:
Code:
$ perl -ne 'print if !$seen{$_}; $seen{$_}++' file
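As a quick sketch of the three options above (sort -u when order doesn't matter, uniq when duplicates are adjacent, the Perl one-liner otherwise), run against a made-up sample.txt; the file name and data are stand-ins for the real file:

```shell
# Throwaway sample data (duplicates happen to be adjacent here).
printf 'SIG 50 12\nSIG 50 12\nSIG 18 7\nSSR 40 7\n' > sample.txt

sort -u sample.txt                                # order not important: sort and drop duplicates
uniq sample.txt                                   # duplicates already adjacent: collapse repeated runs
perl -ne 'print if !$seen{$_}; $seen{$_}++' sample.txt   # no grouping needed, original order kept
```

Note that uniq only removes *adjacent* duplicates, and the lines must match byte-for-byte, so stray trailing whitespace on one copy will defeat it.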

# 3  
Old 07-20-2009

Hi, uniq is working, but the Perl command is giving an error. I have a doubt: if I need to remove duplicates and keep just one of them in the file, I can use uniq. But what if I want to remove duplicate lines based on a criterion?
For example:
Code:
SIG    765   0   0   0   0   0.00  35.2000N   60.4000E   25      39 
SSR   765   0   0   0   0   0.00  34.5600N   65.4000E   25      67       89    
SSR 1390   5   0   0   0   0.00  39.8000N   64.4000E   20      56
LEE  1458   8   0   0   0   0.00  25.1000N   99.2000E    9                 56

Now I want my output file to look like:
Code:
SSR   765   0   0   0   0   0.00  34.5600N   65.4000E   25      67       89    
SSR 1390   5   0   0   0   0.00  39.8000N   64.4000E   20      56
LEE  1458   8   0   0   0   0.00  25.1000N   99.2000E    9                 56

I mean that only a few specific columns should be checked for equality. For example, in the first file, columns 2, 3, 4, 5, 6, and 7 are the same for the 1st and 2nd rows, so I must remove the duplicate lines and retain the line that has the maximum number of fields (columns).
Please help me if there is any command to do this.

Last edited by Yogesh Sawant; 07-20-2009 at 07:37 AM.. Reason: added code tags
# 4  
Old 07-20-2009
Code:
perl -lane'
    $k = join " ", @F[ 1 .. 6 ];      # key: columns 2-7
    $u{$k}++;                         # count occurrences of this key
    $m{$k} = @F if @F > $m{$k};       # track the maximum field count per key
    push @r, $_;                      # remember all lines in order

    END {
        for (@r) {
            $k = join " ", (split)[ 1 .. 6 ];
            # unique keys pass through; for duplicated keys, print one line
            # having the maximum field count
            print if $u{$k} == 1 or (() = split) == $m{$k} && !$p{$k}++;
        }
    }' infile

# 5  
Old 07-20-2009

Hi, can't we use some other command like awk, or any other Unix command, to get the output shown in my previous post?
# 6  
Old 07-20-2009
Of course, but why? What's the problem with Perl?
# 7  
Old 07-20-2009
I don't know how to use Perl and I am not understanding the code you have given, so if you tell me any other simple command like awk or anything else, it will be helpful. I am new to Linux.
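For completeness, roughly the same idea as the Perl answer in post #4, sketched in awk: the key is columns 2-7, and among lines sharing a key, one line with the most fields is kept. The file name infile.txt and its contents are made-up stand-ins for the real data:

```shell
# Throwaway sample data modelled on the post above.
cat > infile.txt <<'EOF'
SIG 765 0 0 0 0 0.00 35.2000N 60.4000E 25 39
SSR 765 0 0 0 0 0.00 34.5600N 65.4000E 25 67 89
SSR 1390 5 0 0 0 0.00 39.8000N 64.4000E 20 56
LEE 1458 8 0 0 0 0.00 25.1000N 99.2000E 9 56
EOF

# Two-pass awk: the file is named twice, so NR == FNR is true only during
# the first pass over it.
awk '
NR == FNR {                                  # pass 1: max field count per key
    k = $2 FS $3 FS $4 FS $5 FS $6 FS $7
    if (NF > max[k]) max[k] = NF
    next
}
{                                            # pass 2: print one line per key
    k = $2 FS $3 FS $4 FS $5 FS $6 FS $7
    if (NF == max[k] && !done[k]++) print
}' infile.txt infile.txt
```

On this sample, the 11-field SIG 765 line is dropped in favour of the 12-field SSR 765 line with the same key, matching the desired output shown earlier in the thread.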