sort takes a long time


 
# 1  
Old 02-08-2011

Dear experts

I have a 200 MB text file in this format:

text \tab number

I am sorting it with the options -fd and it takes a very long time. Is that normal, or can I speed it up somehow?
I don't want to split the file, since this one is already the result of a split.

I use this command: sort -fd file > sorted-file

Thanks for any help and comments.
# 2  
Old 02-08-2011
It seems -f and -d are not useful in your case.
Code:
       -d, --dictionary-order
              consider only blanks and alphanumeric characters

       -f, --ignore-case
              fold lower case to upper case characters

So just sort it directly.

Code:
sort file > sorted-file

Or show your sample data and expected output.
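
To see when the two options actually change the result, here is a minimal sketch on made-up data (the file names and the three sample words are my own, not the poster's). LC_ALL=C is set only so that the outcome is the same on every machine.

```shell
# Made-up sample in the "text<TAB>number" layout from the first post.
work=$(mktemp -d)
printf 'banana\t2\nCherry\t5\napple\t1\n' > "$work/sample.txt"

# Plain byte-order sort: uppercase letters come before lowercase ones.
LC_ALL=C sort "$work/sample.txt" > "$work/plain.txt"

# -f folds lowercase to uppercase, -d compares only blanks and alphanumerics.
LC_ALL=C sort -fd "$work/sample.txt" > "$work/folded.txt"

cat "$work/plain.txt" "$work/folded.txt"
```

If the real file is all lowercase with no punctuation, both commands should give the same order, and the extra options only cost time.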
# 3  
Old 02-08-2011
They are useful, since the first column is text (alphabetic).
# 4  
Old 02-08-2011
What Operating System and version are you using?
How many records?
How long does the process take?
Do you have spare memory and disc space which you could give to this sort process?

If your sort has a "-o filename" parameter, use this to specify the output file rather than a shell redirect (> filename). It will be much much faster.

If set, what is the value of $TMPDIR ? Can it be set to point to a fast filesystem with at least twice the space of the size of the unsorted file?

You can get a dramatic improvement in the performance of unix sort by tuning the "-y kmem" parameter. It is very important that you start with enough memory allocated to do some serious sorting on the first pass.

Off topic: If you have a database engine it is often quicker to load a large file into a database table with suitable keys, then write the file out in the required order.
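
For anyone on Linux reading this: the GNU sort documentation lists -S (--buffer-size) for the memory buffer rather than -y, alongside -T for the workspace and -o for the output file. The following is only a sketch under that assumption; the function name, the 512M figure and the generated test data are illustrative, not a recommendation.

```shell
# Assumes GNU coreutils sort. tuned_sort is a hypothetical helper name.
# $1 = input file, $2 = output file, $3 = scratch directory on a fast filesystem
tuned_sort() {
    # -S: upper bound for the in-memory buffer, -T: workspace, -o: output file
    sort -fd -S 512M -T "$3" -o "$2" "$1"
}

# Small demonstration on generated data in a throwaway directory.
work=$(mktemp -d)
printf 'the session\t2\nfestive\t2\npleasant\t2\n' > "$work/file"
tuned_sort "$work/file" "$work/sorted-file" "$work"
cat "$work/sorted-file"
```

On a real run you would pass the big file and a scratch directory with at least twice its size free, as described above.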
# 5  
Old 02-08-2011
I'm using Linux. In fact it has been running for more than 20 hours and it is still not finished! I've allocated enough memory on tmpdir, and I have no memory problem since it does not run out of memory.

I have about 10 million lines. I haven't set the -y kmem option and I have no idea how to use it. I need a quick improvement! I have no hard disk limitation and I can have a large amount of RAM as well.
# 6  
Old 02-08-2011
"Linux" is a bit vague.

Here are my timings for sorting a 600 MB file after giving "sort" one gigabyte of memory and a very large workspace. It used about 800 MB of disc workspace and didn't make a dent in the memory. The unsorted file is in random order, but I also reverse sorted it to be sure that the test is representative.
This test server is nothing special - a 10 year old HP 9000 with HP-UX 11i and slowish 36 GB 10k rpm discs.

Code:
Ordinary sort:
date;sort -o bigfile.sor -T /workspace -y 1048576 bigfile;date
Tue Feb  8 12:25:47 GMT 2011
Tue Feb  8 12:28:10 GMT 2011

Dictionary sort:
date;sort -fd -o bigfile.sor -T /workspace -y 1048576 bigfile;date
Tue Feb  8 12:31:17 GMT 2011
Tue Feb  8 12:36:19 GMT 2011

Reverse sorting the output from the previous sort:
date;sort -r -fd -o bigfile.rev -T /workspace -y 1048576 bigfile.sor;date
Tue Feb  8 12:44:26 GMT 2011
Tue Feb  8 12:49:07 GMT 2011


Are you sure that your file is only 200 MB?
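
For comparison, the elapsed times can be worked out from the date stamps above. This is just a small helper sketch (the function name elapsed is my own) that assumes both stamps fall on the same day:

```shell
# elapsed START END: difference in seconds between two HH:MM:SS stamps.
elapsed() {
    printf '%s %s\n' "$1" "$2" | awk '{
        split($1, a, ":"); split($2, b, ":")
        print (b[1]*3600 + b[2]*60 + b[3]) - (a[1]*3600 + a[2]*60 + a[3])
    }'
}

ordinary=$(elapsed 12:25:47 12:28:10)
dictionary=$(elapsed 12:31:17 12:36:19)
reverse=$(elapsed 12:44:26 12:49:07)
echo "ordinary=${ordinary}s dictionary=${dictionary}s reverse=${reverse}s"
```

That works out to roughly two and a half minutes for the ordinary sort and about five minutes for each of the -fd runs, so minutes rather than hours.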

Last edited by methyl; 02-08-2011 at 08:56 AM.. Reason: paste errors
# 7  
Old 02-08-2011
Very strange! My data looks like this:

pleasant 2
festive 2
period 2
i declare 2
declare resumed 2
resumed the 2
the session 2
session of 2
of the 2
the european 2

and sorting it takes much longer! I just tried this:
sort -o sorted -d -y 1048576 file

and after 10 minutes still nothing had happened! I wonder how you managed to do it so quickly. My file is 150 MB with about 10 million lines.
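
One thing that may be worth checking on Linux, offered as an assumption rather than a diagnosis: GNU sort compares lines according to the current locale, and forcing LC_ALL=C makes it use plain byte values, which is often much faster. The sketch below uses the ten sample lines from this post in a throwaway directory; note that the C locale gives a different order for uppercase and non-ASCII text.

```shell
# Show which collation setting is in effect (LC_ALL wins over LC_COLLATE, then LANG).
echo "collation locale: ${LC_ALL:-${LC_COLLATE:-${LANG:-unset}}}"

work=$(mktemp -d)
# The ten sample lines from above, tab-separated as described in the first post.
printf '%s\t2\n' pleasant festive period 'i declare' 'declare resumed' \
    'resumed the' 'the session' 'session of' 'of the' 'the european' > "$work/file"

# Same sort, but forced to byte-wise comparison for this one command only.
LC_ALL=C sort -o "$work/sorted" "$work/file"
cat "$work/sorted"
```

Timing the real file both ways (with and without LC_ALL=C) would show whether the locale is what makes the difference here.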