Filtering duplicate lines


 
# 1  
Old 02-08-2002
Filtering duplicate lines

Does anybody know a command that filters duplicate lines out of a file? Something similar to the uniq command, but that can handle duplicate lines no matter where they occur in the file.
# 2  
Old 02-08-2002
Check the man page on sort (sort -u).
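For example (the file names here are just placeholders):

sort -u input.txt > output.txt

Just bear in mind that the output comes back in sorted order, not in the original input order.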
thehoghunter
# 3  
Old 02-08-2002
Thanks, that does almost what I want. However, is there a way I can do it while preserving the original order of the data?
# 4  
Old 02-08-2002
What is the original order of the file? Is it order or chaos?

If the file has an order (by date/time, by nodename, by some field) then you can also sort by that field (check the man page).

If it is chaos - meaning no specific order (it just came that way!) - then I believe you would need to write a script (Perl) or a program (your preference of language) to get what you are trying to do.
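If awk is an option, a one-liner along these lines should also do it (just a sketch - it keeps the first occurrence of each line and drops any later repeats, while preserving the input order; file names are placeholders):

awk '!seen[$0]++' input.txt > output.txt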
thehoghunter
# 5  
Old 02-08-2002
The order is indeed chaos. The information is to be plotted, and if the input order is lost the plot loses its meaning.

VELSTK1621-45
' ' 3031487.7 379165.3
VELSTK1621-45
' ' 3032181.8 379848.9
VELSTK1629-45
' ' 3005331.9 348245.4
VELSTK1629-45
' ' 3006027.4 348927.5
VELSTK1629-45
' ' 3006724.5 349610.6
VELSTK1629-45
' ' 3007420.4 350291.5
VELSTK1629-45
' ' 3008116.8 350974.5

I only need the first instance of a line beginning with "VEL"; however, if I sort the file, the attached information becomes jumbled.

Cheers
# 6  
Old 02-08-2002
If the file is ordered by VEL(some number) followed by its plot lines, an even simpler script could be written to read each line, keeping the last VEL line in a variable:

Read a line - if it starts with VEL, compare it to the saved VEL variable.
If it is different, write it to the new file and save it into the variable.
If it is the same, skip it and read the next line.
If the line has no VEL in it, write it to the new file.

Unless there is something else in the file that would mess with this, it should work.
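In plain sh that could look something like this (an untested sketch; input.txt and output.txt are placeholder names):

#!/bin/sh
prev=""                                # last VEL line written out
while IFS= read -r line; do
    case "$line" in
        VEL*)
            # keep a VEL header only if it differs from the last one written
            if [ "$line" != "$prev" ]; then
                printf '%s\n' "$line"
                prev=$line
            fi
            ;;
        *)
            printf '%s\n' "$line"      # coordinate line - always keep it
            ;;
    esac
done < input.txt > output.txt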
thehoghunter
# 7  
Old 02-08-2002

Cheers, I think this is the inevitable conclusion/solution. I was hoping to get away with a ready-made UNIX command; uniq showed such promise.

I have a few other things to be doing till I have to cross this particular bridge again.

Thanks again for the ideas.