sort takes a long time


 
# 8  
Old 02-08-2011
I don't think we identified your O/S beyond its being a Linux system. There is much variation.

It is possible that your "sort" command does not have a "-y" switch or other switch to pre-allocate memory. Have you checked "man sort" or perhaps "info sort"?

Maybe you are a non-root user and have a memory quota which is too low to do this large sort?

Perhaps you have a basic kernel and the sort is trying to open more files than is allowed?

Have you checked the directory where you expect to find the sort workfiles? Are they there? Is there enough disc space in that filesystem?
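A rough sketch of how those two checks might look. It assumes GNU sort, which puts its work files in $TMPDIR (or /tmp if that is unset) unless you pass -T; the variable names are only for illustration:

```shell
# Directory GNU sort uses for its temporary work files unless -T is given.
workdir="${TMPDIR:-/tmp}"

# Free space in the filesystem holding that directory
# (POSIX output format, sizes in 1024-byte blocks).
df -Pk "$workdir"

# Per-process limit on open files for the current shell.
open_files=$(ulimit -n)
echo "open file limit: $open_files"
```

If the workfile filesystem is nearly full, pointing sort at a roomier directory with -T is worth a try.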

Is the running sort using CPU? I'm starting to wonder if your "sort" program is faulty.

Afterthought. We assume that this is a unix standard format text file with each line terminated with a line-feed character (only) and that it has not come from a Microsoft platform.
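One way to test that assumption is to count the lines containing a carriage return. This is only a sketch: the two-line sample file is made up for the demonstration, so substitute your real file:

```shell
# Hypothetical sample with Microsoft-style CR+LF line endings.
sample=$(mktemp)
printf 'beta\r\nalpha\r\n' > "$sample"

# Count the lines that contain a carriage return; 0 means unix-format text.
cr=$(printf '\r')
cr_lines=$(grep -c "$cr" "$sample")
echo "lines containing CR: $cr_lines"

# If the count is non-zero, strip the carriage returns before sorting.
cleaned=$(tr -d '\r' < "$sample" | sort)
printf '%s\n' "$cleaned"

rm -f "$sample"
```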
# 9  
Old 02-08-2011
linux version:
Linux version 2.6.18-164.6.1.el5 (mockbuild@ls20-bc2-14.build.redhat.com) (gcc version 4.1.2 20080704 (Red Hat 4.1.2-46))

SORT:
sort (GNU coreutils) 5.97

It does not have -y, but it does have -S (buffer size), and that does not help either!

Also, my sort program is fine, and it does use CPU!
# 10  
Old 02-08-2011
Googling your version of "sort" and the symptoms uncovered a can of worms.

For example: If your locale is anything other than "C" the performance of sort can be atrocious. There are other variants on this theme including the program ignoring the buffer parameter.

What is the output from the "locale" command ?

Suggest you take up the issue with your software supplier in case a fixed version is available.
# 11  
Old 02-09-2011
locale:
LANG=en_US.UTF-8
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

So you think it's because of my version?
# 12  
Old 02-09-2011
I think we have found the problem.

UTF-8 locales are often cited as the worst case for sort or grep: under en_US.UTF-8 the lines are compared using the locale's collation rules rather than as simple binary keys.
If your data is actually US ASCII then I'd try the sort with the locale set to "C".

On my system (yours has more values):
Code:
LANG=
LC_CTYPE="C"
LC_COLLATE="C"
LC_MONETARY="C"
LC_NUMERIC="C"
LC_TIME="C"
LC_MESSAGES="C"
LC_ALL=
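You don't need to change your login environment to try this. A minimal sketch, with made-up sample data standing in for your file:

```shell
# Hypothetical sample data; substitute your real file for the printf.
# LC_ALL overrides LANG and every other LC_* variable, and setting it on the
# command line affects only this one sort, not the rest of the session.
sorted=$(printf 'b\nB\na\nA\n' | LC_ALL=C sort)
printf '%s\n' "$sorted"
```

Note that in the "C" locale the order is by byte value, so upper-case letters come before lower-case ones. Check that this ordering is acceptable for whatever reads the sorted file.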

If you must sort in UTF-8 order, some recommend using the "sort" function in perl. Perl's "sort" is three times slower than unix "sort" for LANG=C, but ten times faster for LANG=en_US.UTF-8.


Rant: The UTF-8 issue arises from the latest POSIX standards. It has given me grief with mixed platform XML too.
# 13  
Old 02-09-2011
Hi. If you don't want to create your own perl utility, you may be interested in:
Code:
msort - utility for sorting records in complex ways

...

       msort fully supports Unicode. The text to be sorted, and all
       specifications, should be in UTF-8 Unicode. (If you have plain ASCII
       text, this is not a problem as ASCII is a subset of Unicode.) Full
       Unicode case-folding is available, in Turkic and non-Turkic variants.
       Unicode normalization is performed before sorting.

-- excerpt from man msort, q.v.

http://billposer.org/Software/msort.html

Good luck ... cheers, drl