The UNIX and Linux Forums  

  #8  
06-24-2005
Registered User
Join Date: Jun 2005
Location: Ireland
Posts: 61
It's equivalent to `uniq`: it only removes adjacent duplicate lines, so it won't help you unless the data is sorted.
If your data is in fact already sorted, then just use `uniq` instead of `sort -u`.
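
A minimal sketch of the difference between the two commands, using made-up sample data (the variable names `sample`, `uniq_out`, and `sort_u_out` are only for illustration): `uniq` collapses duplicates only when they sit on adjacent lines, while `sort -u` sorts first and so catches all of them.

```shell
# Hypothetical sample data: "b" appears three times, but only two are adjacent.
sample=$(printf 'b\na\nb\nb\n')

# uniq only collapses adjacent duplicates, so the leading "b" and the
# trailing run of "b" both survive as separate lines.
uniq_out=$(printf '%s\n' "$sample" | uniq)

# sort -u sorts first, so every duplicate ends up adjacent and is removed.
sort_u_out=$(printf '%s\n' "$sample" | sort -u)

printf 'uniq:\n%s\n' "$uniq_out"
printf 'sort -u:\n%s\n' "$sort_u_out"
```

On unsorted input the two results differ, which is why `uniq` alone is only a substitute for `sort -u` when the file is already sorted.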
#9
06-24-2005
Registered User
Join Date: Apr 2005
Posts: 51
No, my data is not sorted.
#10
06-24-2005
RishiPahuja
Registered User
Join Date: Apr 2005
Location: Bangalore, India
Posts: 203
The best approach would be to load all the data into Oracle using SQL*Loader.
Create an index on the fly for the key you want to be unique.
Then run a query to get the unique records.

Any better alternatives?
#11
06-24-2005
Registered User
Join Date: Apr 2005
Posts: 51
I am not sure I want to reload all that data into another table and .....

I am pulling the data from a table with `select * from <table name>` into a text file and then running `sort -u file1 > file2`.

I could instead try a `select distinct <columns>` from the table and see whether it takes more time than my original approach. Is it worth trying? I don't know.

It is a production database, so I don't have the luxury of trying different options at will unless I know they are worth trying.
#12
06-24-2005
Registered User
Join Date: Jun 2005
Location: Ireland
Posts: 61
It's already in a database!
Just add an ORDER BY (sort) clause to the select statement and
index the appropriate fields.
#13
06-24-2005
RishiPahuja
Registered User
Join Date: Apr 2005
Location: Bangalore, India
Posts: 203
Definitely, it's worth a try.

Precautions you can take:

1. Make sure all the distinct columns are indexed.
2. If it is a single table, you need not worry about joins. Otherwise, make sure the joins are written so that you get maximum throughput rather than the lowest response time.
3. Run the query at a time when no other heavy activity is going on against the same table, because a long-running query can fail with a "snapshot too old" (rollback segment) error.

All the best.
#14
06-24-2005
Registered User
Join Date: Jun 2005
Location: Bangalore , INDIA
Posts: 28
Sorry for the late reply ....

>> Hi Amit,

>> sed '$!N; /^\(.*\)\n\1$/!P; D'

>> Could you explain the command - bit by bit if you don't mind.

>> Thanks!

I think you can refer to the sed man page and look at the section on addresses.

I think the topic is self-explanatory...

BTW ...

I tested this command on a file of more than 1 GB.

It took about 13 min to process that file. Much, much faster than the sort command.
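
For readers who want the bit-by-bit explanation that was asked for, here is a rough annotated sketch. The sample input and the variable names (`input`, `sed_out`, `uniq_out`) are invented for illustration, and the comments describe standard `sed` behaviour rather than anything the poster wrote.

```shell
# Hypothetical sample input with both adjacent and non-adjacent duplicates.
input=$(printf 'a\na\nb\na\nc\nc\n')

# $!N               on every line except the last, append the next input
#                   line to the pattern space (separated by a newline)
# /^\(.*\)\n\1$/!P  unless the pattern space holds two identical lines,
#                   print its first line
# D                 delete up to and including the first newline, then
#                   restart the cycle with whatever is left
sed_out=$(printf '%s\n' "$input" | sed '$!N; /^\(.*\)\n\1$/!P; D')

# The same input through uniq, for comparison.
uniq_out=$(printf '%s\n' "$input" | uniq)

printf '%s\n' "$sed_out"
```

On this input the `sed` one-liner and `uniq` give the same result: the adjacent `a` and `c` pairs are collapsed, but the `a` on the fourth line survives because it is not next to another `a`. The command therefore does no sorting and only removes every duplicate when the file is already sorted.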