The UNIX and Linux Forums  

  #1 (permalink)  
Old 05-28-2008
Registered User
 

Join Date: Jul 2006
Posts: 18
Sort command - strange behaviour

Hi guys,

I have the following example data:

A;00:00:19
B;00:01:02
C;00:00:13
D;00:00:16
E;00:02:27
F;00:00:12
G;00:00:21
H;00:00:19
I;00:00:13
J;00:13:22

I run the following sort against it, yet the output is as follows:

sort -t";" +1 -nr example_data.dat

A;00:00:19
B;00:01:02
C;00:00:13
D;00:00:16
E;00:02:27
F;00:00:12
G;00:00:21
H;00:00:19
I;00:00:13
J;00:13:22

I'd expect it to recognise the field delimiter, skipping field 1 and then sorting numerically on the second field (i.e. to put the longest time first and the shortest last). Any ideas please? Is the ":" in this field causing an issue?

I've proven it is recognising the second field as the sort key by changing to a dictionary-based sort too:

sort -t";" +1 -dr example_data.dat

J;00:13:22
E;00:02:27
B;00:01:02
G;00:00:21
A;00:00:19
H;00:00:19
D;00:00:16
C;00:00:13
I;00:00:13
F;00:00:12

This produces the desired output against this subset of data, but it is no use against the 'live' data (of much larger volume): there it lists all times beginning with '9' first and then continues in descending order. Ultimately it has to be a numeric sort.

Thanks in advance,

Mark

Last edited by miwinter; 05-28-2008 at 01:55 AM. Reason: Additional info
  #2 (permalink)  
Old 05-28-2008
Registered User
 

Join Date: Sep 2006
Location: Mysore, India
Posts: 155
Try this

Code:
sort -t";" -rk2,2 example_data.dat
  #3 (permalink)  
Old 05-28-2008
Registered User
 

Join Date: Jul 2006
Posts: 18
Thanks for the uber-fast reply, Krish. I had looked at the key definition option (the -k switch), but it didn't seem to work either. What you gave does the right thing on the example data, but when I use the same command on my live data, it doesn't. Here's an example (the first 10 lines of the newly sorted file):

sort -t";" -rk2,2 mwreport_joined.txt > mwreport_sorted.txt

GLMLRP_ComparisonJob;989:13:42
GLMLRP_Diff_HighlighterJob;989:08:56
AD046;988:44:15
GleamMIPostCanadaExtractJob;9196:53:12
GleamMIAGREERepAllBackOutJob;9025:39:12
GleamMIAGREEProdFacilCombJob;9025:29:36
GleamMIAGREEExcRateHistExtractJob;9025:21:26
GleamMIAGREEDynamicParamJob;9025:19:10
GleamMIAGREEClassExtractJob;9025:11:35
GleamMIAGREEClassPODLoadJob;9025:09:43

As you can see above, the "9196:53:12" value in the fourth record should be at the top of the list, as it is the largest numerically.
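
One way to see why 989:13:42 lands above 9196:53:12 here: without -n the key is compared character by character, and at the second character '8' sorts after '1'. The following is only a minimal sketch. The record names x and y are made up, and LC_ALL=C is an assumption added to get a plain byte-order comparison.

```shell
# Same two time values as in the listing above, with placeholder names x and y.
# Without -n the key is compared as text; with -n only its leading number is used.
lexical=$(printf '%s\n' 'x;9196:53:12' 'y;989:13:42' | LC_ALL=C sort -t';' -r -k2,2)
numeric=$(printf '%s\n' 'x;9196:53:12' 'y;989:13:42' | LC_ALL=C sort -t';' -r -n -k2,2)
printf 'lexical:\n%s\nnumeric:\n%s\n' "$lexical" "$numeric"
```

The lexical run puts the 989 record first, the numeric run puts the 9196 record first.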
  #4 (permalink)  
Old 05-28-2008
Registered User
 

Join Date: Sep 2007
Location: Koblenz, Germany
Posts: 579
Is this what you want?

Code:
sort -t";" -rn -k2,2 mwreport_joined.txt

GleamMIPostCanadaExtractJob;9196:53:12
GleamMIAGREERepAllBackOutJob;9025:39:12
GleamMIAGREEProdFacilCombJob;9025:29:36
GleamMIAGREEExcRateHistExtractJob;9025:21:26
GleamMIAGREEDynamicParamJob;9025:19:10
GleamMIAGREEClassPODLoadJob;9025:09:43
GleamMIAGREEClassExtractJob;9025:11:35
GLMLRP_Diff_HighlighterJob;989:08:56
GLMLRP_ComparisonJob;989:13:42
AD046;988:44:15
  #5 (permalink)  
Old 05-28-2008
Registered User
 

Join Date: Jul 2006
Posts: 18
Quote:
Originally Posted by zaxxon
Is this what you want?

Code:
sort -t";" -rn -k2,2 mwreport_joined.txt

GleamMIPostCanadaExtractJob;9196:53:12
GleamMIAGREERepAllBackOutJob;9025:39:12
GleamMIAGREEProdFacilCombJob;9025:29:36
GleamMIAGREEExcRateHistExtractJob;9025:21:26
GleamMIAGREEDynamicParamJob;9025:19:10
GleamMIAGREEClassPODLoadJob;9025:09:43
GleamMIAGREEClassExtractJob;9025:11:35
GLMLRP_Diff_HighlighterJob;989:08:56
GLMLRP_ComparisonJob;989:13:42
AD046;988:44:15

That's closer, yes, although these records are still out of order:

9025:11:35 - this should sit between 9025:19:10 and 9025:09:43

989:13:42 - this should be above 989:08:56
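
A likely explanation for these two records, as a minimal sketch: with -n, sort reads only the leading number of the key, so 9025:11:35 and 9025:09:43 both compare as 9025 and the minutes and seconds never take part in the comparison. The job names are shortened to a and b here. The -s flag (stable sort, available in GNU and BSD sort but not required by POSIX) is an assumption added so that tied lines keep their input order instead of being ordered by sort's fallback whole-line comparison.

```shell
# Two records whose hour counts are equal; names shortened to a and b.
# -k2,2nr sees 9025 for both keys, and -s leaves the tie in input order.
tied=$(printf '%s\n' 'a;9025:09:43' 'b;9025:11:35' |
    LC_ALL=C sort -s -t';' -k2,2nr)
printf '%s\n' "$tied"
```

The order comes back unchanged, even though 9025:11:35 is the longer time.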
  #6 (permalink)  
Old 05-28-2008
Registered User
 

Join Date: Sep 2007
Location: Koblenz, Germany
Posts: 579
Wrote nonsense - will come up with a better idea, brb

Here it is:

Code:
awk -F ";" '{print $2}' mwreport_joined.txt | sort -t ":" -rn -k 1,1 -k 2,2 -k 3,3 | xargs -I {} grep {} mwreport_joined.txt
Maybe not nice, but it works on Debian Linux. I'm not sure about the -I {} on xargs for other OSes. On AIX I usually just leave it out, iirc.

Last edited by zaxxon; 05-28-2008 at 06:29 AM.
  #7 (permalink)  
Old 05-28-2008
Registered User
 

Join Date: Jul 2006
Posts: 18
Cheers again. That seems to work, although it returns each line multiple times, so the output file ends up much larger than the original.
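
The repeated lines probably come from the grep step: each time value is looked up in the whole file again, so a value shared by several records prints all of them once per occurrence. One possible way around that, sketched here rather than taken from the thread, is to prefix each line with its duration in seconds, sort numerically on that prefix, and cut the prefix off again. The function name sort_by_duration is hypothetical, and the sketch assumes job names never contain ";".

```shell
# sort_by_duration: read NAME;H:MM:SS lines on stdin, longest duration first.
# awk prefixes each line with its duration in seconds, sort orders on that
# prefix only (numeric, descending), and cut removes the prefix again.
sort_by_duration() {
    awk -F';' '{ split($2, t, ":"); print t[1] * 3600 + t[2] * 60 + t[3] ";" $0 }' |
        LC_ALL=C sort -t';' -k1,1nr |
        cut -d';' -f2-
}

# Sample records copied from the thread, including two with the same time.
sorted=$(printf '%s\n' \
    'GLMLRP_ComparisonJob;989:13:42' \
    'GLMLRP_Diff_HighlighterJob;989:08:56' \
    'GleamMIPostCanadaExtractJob;9196:53:12' \
    'GleamMIAGREEClassExtractJob;9025:11:35' \
    'GleamMIAGREEClassPODLoadJob;9025:09:43' \
    'A;00:00:19' \
    'H;00:00:19' | sort_by_duration)
printf '%s\n' "$sorted"
```

Against the live file this would be run as sort_by_duration < mwreport_joined.txt > mwreport_sorted.txt. Every input line passes through the pipeline exactly once, so the output has the same number of lines as the input.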
All times are GMT -7.

