The UNIX and Linux Forums  

06-23-2005
radhika
Registered User
Join Date: Apr 2005
Posts: 51
fastest way to remove duplicates.

I searched the FAQ for terms such as sort and duplicates, but did not find any articles or results on this.

Currently, I am using:
sort -u file1 > file2
to remove duplicates. For a file of approximately 1 gigabyte, removing the duplicates takes 1 hr 21 min.

Is there a faster way to remove duplicates? Our files could grow to 10 to 12 gigabytes.
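
To make the question concrete, here is a minimal sketch of the current command on a tiny made-up sample. It also shows one variant I have not timed on the real data: forcing the C locale so that sort compares raw bytes instead of using locale-aware collation. The sample contents and the name file3 are hypothetical, and any speedup is an assumption rather than a measurement.

```shell
# Hypothetical five-line sample; the real file1 is about 1 GB.
printf 'b\na\nb\nc\na\n' > file1

# Current approach: sort the file and keep one copy of each line.
sort -u file1 > file2

# Untested variant (assumption): byte-wise comparison in the C locale.
# The output order can differ from the locale-aware sort on non-ASCII data.
LC_ALL=C sort -u file1 > file3

cat file2
```

On this sample both output files contain the same three unique lines. Whether the C locale helps on the real files would need to be checked with time on a representative extract.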

Appreciate any pointers.
Thanks,
Radhika.