Alternative to sort -ur +1 required | Unix Linux Forums | UNIX for Advanced & Expert Users

    #1  
Old 10-03-2012
Mike Smith

I've got scripts trawling the network and dumping parsed text into files with an Epoch timestamp in column 1. I append the old data to the new data then just want to keep the top entry if there is an identical duplicate below (column 1 needs to be ignored).

sort -ur +1 works a treat on a Solaris 8 box but on Solaris 10 the 'r' seems to break!

Can some kind soul offer a fix / workaround?

If you ask me a question please keep it dummy level as I'm not super Unix literate.
    #2  
Old 10-03-2012
Corona688 (Forum Staff)
What is that + syntax supposed to do? It's not even part of the manual page for my version of sort.

If you just want sorting on the first column, sort -ur -k 1,1 I think (-k 1,2 would make the key run from field 1 through field 2).
    #3  
Old 10-03-2012
DGPickett (Forum Advisor)
Solaris had a sort bug once, as I recall, but it was more subtle. Try using the -k method of specifying fields, sort direction and such. The +1 -2 notation is obsolescent (LINUX does not have it any more) and 0-based! The -k notation is 1-based, not zero-based, which might be more normal-human friendly. BTW, +1 says sort on column 2 and following, which is -k 2 in the newer notation. I suppose column 1 is the file name?
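
To make the notation mapping concrete, here is a minimal sketch with made-up sample data. The sample lines and the name sort_from_field2 are assumptions for illustration only, not anything from the original scripts:

```shell
# Hypothetical sample: epoch timestamp in column 1, free text after it.
sample='1349250000 eth0 link up
1349250060 eth1 link down
1349250120 eth0 link up'

# -k 2 means "the key starts at field 2 and runs to end of line",
# which is the one-based spelling of the old zero-based "+1".
sort_from_field2() {
  LC_ALL=C sort -k 2
}

sorted=$(printf '%s\n' "$sample" | sort_from_field2)
printf '%s\n' "$sorted"
```

Lines whose keys compare equal (the two eth0 lines here) are ordered by a last-resort comparison of the whole line, so the older timestamp comes first among them.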

Some sort of persistent Java container could do the testing and storing without a sort, perhaps in a tree. You can put the data into a structure mapped to a flat file, for instance. One possible advantage is that you can prune the set on the fly if you are not interested in the full set. You can also do controlled thread parallelism. It can be a lot faster than sort or an SQL RDBMS ETL approach.

Sort can also be sped up with parallelism, using sort -m (merge) and pipes: with process substitution in bash, or in ksh on nicer systems that have /dev/fd/[0-9]*, or else with named pipes:
Code:
sort -m YOUR_ARGS <(
  sort YOUR_ARGS FILE_LIST_1
 ) <(
  sort YOUR_ARGS FILE_LIST_2
   .
   .
   .
 ) <(
  sort YOUR_ARGS FILE_LIST_N
 )

nicer than with named pipes (/sbin/mknod NAMED_PIPE_N p):

(
sort YOUR_ARGS -o NAMED_PIPE_1 FILE_LIST_1 &
sort YOUR_ARGS -o NAMED_PIPE_2 FILE_LIST_2 &
.
.
.
sort YOUR_ARGS -o NAMED_PIPE_N FILE_LIST_N &
sort -m YOUR_ARGS NAMED_PIPE_1 NAMED_PIPE_2 . . . NAMED_PIPE_N
)
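
For anyone who wants to try the named-pipe variant without filling in the placeholders, here is a small self-contained sketch. It assumes mkfifo and mktemp -d are available, and the file names (part1, part2, pipe1, pipe2) and the tiny numeric data set are made up for the demonstration:

```shell
# Scratch directory with two small unsorted inputs (hypothetical data).
workdir=$(mktemp -d)
printf '%s\n' 3 1 5 > "$workdir/part1"
printf '%s\n' 4 2 6 > "$workdir/part2"

# One named pipe per partial sort.
mkfifo "$workdir/pipe1" "$workdir/pipe2"

# Sort each part in the background, writing into its pipe.
sort -n "$workdir/part1" > "$workdir/pipe1" &
sort -n "$workdir/part2" > "$workdir/pipe2" &

# Merge the two already-sorted streams; -m merges without re-sorting.
merged=$(sort -m -n "$workdir/pipe1" "$workdir/pipe2")
wait
rm -rf "$workdir"

printf '%s\n' "$merged"
```

With inputs this small there is no speed benefit, of course; the point is only the plumbing. The same sort options (here -n) have to be given to the partial sorts and to the merge.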

    #4  
Old 10-03-2012
drl (Forum Advisor)
Hi.

Minor quibble:
Quote:
Originally Posted by DGPickett View Post
... The +1 -2 notation is obsolescent - LINUX does not have it any more, and 0-based! The -k notation is 1-based, not zero-based, which might be more normal human friendly. BTW, +1 says sort on column 2 and following ...

Code:
sort (GNU coreutils) 8.13
OS, ker|rel, machine: Linux, 3.0.0-1-amd64, x86_64
Distribution        : Debian GNU/Linux wheezy/sid

allows old form:

Code:
   On older systems, `sort' supports an obsolete origin-zero syntax
`+POS1 [-POS2]' for specifying sort keys.  The obsolete sequence `sort
+A.X -B.Y' is equivalent to `sort -k A+1.X+1,B' if Y is `0' or absent,
otherwise it is equivalent to `sort -k A+1.X+1,B+1.Y'.

   This obsolete behavior can be enabled or disabled with the
`_POSIX2_VERSION' environment variable (*note Standards conformance::);
it can also be enabled when `POSIXLY_CORRECT' is not set by using the
obsolete syntax with `-POS2' present.

excerpt from info sort

Best wishes ... cheers, drl
    #5  
Old 10-03-2012
Mike Smith
Gosh! I'll have a play with the -k option. I read the man page and didn't understand the k bit at all.

I need column 1 (Epoch time stamp) ignoring and the rest of the line taken into account for comparison. It's free text from interfaces so could be any number of words and characters.
    #6  
Old 10-04-2012
DGPickett (Forum Advisor)
With -k, some options now ride inside the -k, like reverse and numeric, so they can vary key by key without ambiguity. Unique -u is global to all keys.
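
A rough illustration of modifiers riding inside -k, using made-up data (the interface names and timestamps are assumptions): the first key is plain text on field 2, the second key is field 1 with n (numeric) and r (reverse) applied to that key only.

```shell
# Hypothetical sample: timestamp, interface, state.
data='1349250000 eth0 up
1349250120 eth0 up
1349250060 eth1 down'

# Key 1: field 2 only, ascending text order.
# Key 2: field 1 only, numeric and reversed, so newest first within an interface.
by_iface_newest_first=$(printf '%s\n' "$data" | LC_ALL=C sort -k 2,2 -k 1,1nr)
printf '%s\n' "$by_iface_newest_first"

# -u is not per key: it drops lines whose keys all compare equal.
# With the key being "field 2 to end of line", two distinct keys remain.
unique_count=$(printf '%s\n' "$data" | LC_ALL=C sort -u -k 2 | grep -c .)
printf '%s\n' "$unique_count"
```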

Do you want the whole list, or just the last day's hits or the like? You can write a low latency unique filter that does not sort, using a filtering collection. I posted one I wrote in C using a simple bisection search of an array of pointers: http://www.unix.com/shell-programmin...roup-unix.html
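
The same idea (a unique filter that does not sort) can be sketched in awk instead of C. This is not the linked C program: it uses awk's associative array where that one uses a bisection search, and the function name dedupe_ignoring_col1 is made up here. It assumes an awk that has sub(); on Solaris that probably means nawk or /usr/xpg4/bin/awk rather than the default awk.

```shell
# Keep the first line seen for each distinct "rest of line" (everything
# after column 1); later duplicates are dropped. Input order is preserved.
dedupe_ignoring_col1() {
  awk '{
    key = $0
    sub(/^[ \t]*[^ \t]+[ \t]*/, "", key)   # strip column 1 and the blanks around it
    if (!(key in seen)) { seen[key] = 1; print }
  }' "$@"
}
```

Usage would be something like cat new.txt old.txt | dedupe_ignoring_col1 > merged.txt, where new.txt and old.txt are placeholder names; whichever file is listed first wins when two lines differ only in the timestamp. Memory use grows with the number of distinct lines, since every key is held in the array.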
    #7  
Old 10-08-2012
Mike Smith
Column 1 needs to be kept but ignored by sort.

Basically there will often be two entries that differ only in the column 1 timestamp; I need to keep the top entry.

-k sounds like it could be what I need but the manual is gibberish to me.
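
For reference, the -k spelling of the original command would look roughly like the sketch below. The helper name and sample lines are made up, and this has not been checked against the Solaris 10 behaviour described in the first post. One caveat: POSIX only says that -u keeps one line from each set with equal keys, not which one, so if it must be the top entry that survives, an order-preserving filter (awk, for example) is the safer route.

```shell
# -k spelling of the old "sort -ur +1": key = field 2 to end of line.
# dedupe_by_rest_of_line is a hypothetical helper name.
dedupe_by_rest_of_line() {
  LC_ALL=C sort -u -r -k 2 "$@"
}

# Made-up sample: the two eth0 lines differ only in the column 1 timestamp.
result=$(printf '%s\n' \
  '1349250120 eth0 up' \
  '1349250060 eth1 down' \
  '1349250000 eth0 up' | dedupe_by_rest_of_line)
printf '%s\n' "$result"
```

Column 1 stays in the output; it is just left out of the comparison.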