UNIX for Advanced & Expert Users
#1
Alternative to sort -ur +1 required
I've got scripts trawling the network and dumping parsed text into files with an Epoch timestamp in column 1. I append the old data to the new data then just want to keep the top entry if there is an identical duplicate below (column 1 needs to be ignored).
sort -ur +1 works a treat on a Solaris 8 box, but on Solaris 10 the 'r' seems to break! Can some kind soul offer a fix / workaround? If you ask me a question please keep it dummy level as I'm not super Unix literate.
#2
What is that + syntax supposed to do? It's not even part of the manual page for my version of sort.
If you just want sorting on the first column, sort -ur -k 1,1 I think.
#3
Solaris had a sort bug once, I recall, but it was more subtle. Try using the -k method of specifying fields and sort direction and such. The +1 -2 notation is obsolescent and 0-based - LINUX does not have it any more! The -k notation is 1-based, not zero-based, which might be more normal human friendly. BTW, +1 says sort on column 2 and following. I suppose column 1 is the file name?

Some sort of persistent JAVA container could do the testing and storing without a sort, perhaps in a tree. You can put the data into a structure mapped to a flat file, for instance. One possible advantage is that you can prune the set on the fly, if you are not interested in the full set. Also, you can do controlled thread parallelism. It can be a lot faster than sort or an SQL RDBMS ETL approach.

Sort can also be sped up with parallelism in bash or ksh, on nicer systems with /dev/fd/[0-9]* (or using named pipes), using sort merge and pipes:

Code:
sort -m YOUR_ARGS <(
  sort YOUR_ARGS FILE_LIST_1
 ) <(
  sort YOUR_ARGS FILE_LIST_2
 ) . . . <(
  sort YOUR_ARGS FILE_LIST_N
 )

That is nicer than with named pipes (/sbin/mknod NAMED_PIPE_N p):

Code:
(
  sort YOUR_ARGS -oNAMED_PIPE_1 FILE_LIST_1 &
  sort YOUR_ARGS -oNAMED_PIPE_2 FILE_LIST_2 &
  . . .
  sort YOUR_ARGS -oNAMED_PIPE_N FILE_LIST_N &
  sort -m YOUR_ARGS NAMED_PIPE_1 NAMED_PIPE_2 . . . NAMED_PIPE_N
)
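For anyone who wants to try the named-pipe merge on something concrete, here is a minimal sketch with two tiny made-up input files. The file names, the sample lines and the use of mktemp and mkfifo are assumptions for illustration, and it writes to the pipes with plain redirection rather than -o, purely to keep the example simple.

```shell
# Sketch only: sort two parts in parallel, then merge them through named pipes.
tmpdir=$(mktemp -d)                       # scratch directory (hypothetical layout)
printf '3 c\n1 a\n' > "$tmpdir/part1"     # made-up sample data
printf '4 d\n2 b\n' > "$tmpdir/part2"
mkfifo "$tmpdir/pipe1" "$tmpdir/pipe2"

# Each background sort blocks until the merge opens its pipe for reading.
LC_ALL=C sort -k 2 "$tmpdir/part1" > "$tmpdir/pipe1" &
LC_ALL=C sort -k 2 "$tmpdir/part2" > "$tmpdir/pipe2" &

# -m merges inputs that are already sorted; it must use the same keys.
merged=$(LC_ALL=C sort -m -k 2 "$tmpdir/pipe1" "$tmpdir/pipe2")
wait
rm -rf "$tmpdir"
printf '%s\n' "$merged"
```

With inputs this small there is nothing to gain; the parallel sorts only pay off on large file lists.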
#4
Hi.

Minor quibble with the remark that Linux does not have the +POS notation any more:

Code:
sort (GNU coreutils) 8.13
OS, ker|rel, machine: Linux, 3.0.0-1-amd64, x86_64
Distribution        : Debian GNU/Linux wheezy/sid

allows old form:

Code:
On older systems, `sort' supports an obsolete origin-zero syntax
`+POS1 [-POS2]' for specifying sort keys. The obsolete sequence `sort
+A.X -B.Y' is equivalent to `sort -k A+1.X+1,B' if Y is `0' or absent,
otherwise it is equivalent to `sort -k A+1.X+1,B+1.Y'.

This obsolete behavior can be enabled or disabled with the
`_POSIX2_VERSION' environment variable (*note Standards conformance::);
it can also be enabled when `POSIXLY_CORRECT' is not set by using the
obsolete syntax with `-POS2' present.

excerpt from info sort

Best wishes ... cheers, drl
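To make the conversion rule from the info excerpt concrete, here is how a few old-style keys map to -k, followed by a small run. The three sample lines (an Epoch timestamp followed by interface text) are made up for illustration.

```shell
# Old zero-based form      Equivalent one-based -k form
#   sort +1                  sort -k 2      (field 2 through end of line)
#   sort +1 -2               sort -k 2,2    (field 2 only)
#   sort +0 -1               sort -k 1,1    (field 1 only)

# Sort on column 2 and following, so the timestamp in column 1 is not
# part of the key. LC_ALL=C keeps the ordering independent of the locale.
sorted=$(printf '%s\n' \
    '1700000300 eth0 up' \
    '1700000100 eth1 down' \
    '1700000200 eth0 down' | LC_ALL=C sort -k 2)
printf '%s\n' "$sorted"
```

The lines come out ordered by the text after the timestamp: "eth0 down", "eth0 up", then "eth1 down".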
#5
Gosh! I'll have a play with the -k option, I read the man page and didn't understand the k bit at all.
I need column 1 (Epoch time stamp) ignoring and the rest of the line taken into account for comparison. It's free text from interfaces so could be any number of words and characters.
#6
With -k, some options now ride inside the -k, like reverse and numeric, so they can vary key by key without ambiguity. Unique -u is global to all keys.
Do you want the whole list, or just the last day's hits or the like? You can write a low latency unique filter that does not sort, using a filtering collection. I posted one I wrote in C using a simple bisection search of an array of pointers: http://www.unix.com/shell-programmin...roup-unix.html
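As a small illustration of options riding inside -k, the sketch below sorts on the text from column 2 onward and, where that text ties, puts the newest timestamp first by attaching n (numeric) and r (reverse) to the second key only. The sample lines are made up.

```shell
# Key 1: field 2 through end of line, plain comparison.
# Key 2: field 1 only, numeric and reversed (newest Epoch time first).
ordered=$(printf '%s\n' \
    '1700000100 eth0 up' \
    '1700000300 eth0 up' \
    '1700000200 eth1 down' | LC_ALL=C sort -k 2 -k 1,1nr)
printf '%s\n' "$ordered"
```

The two "eth0 up" lines end up adjacent with the 1700000300 entry above the 1700000100 one, because the r modifier applies only to the timestamp key.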
#7
Column 1 needs to be kept but ignored by sort.
Basically there will often be two entries just with the column 1 timestamp being different, I need to keep the top entry. -k sounds like it could be what I need but the manual is gibberish to me.
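One way to sketch "keep the top entry, ignore column 1" without sorting at all is a small awk filter. This is not anything posted in the thread, just a possible approach: the function name and sample lines are made up, it assumes columns are separated by spaces or tabs, and it needs an awk that has sub() (on Solaris that probably means nawk or /usr/xpg4/bin/awk rather than the old /usr/bin/awk).

```shell
# Print a line only the first time its text (minus column 1) is seen.
dedupe_ignoring_first_column() {
    awk '{
        key = $0
        # Strip column 1 and the blanks after it; the rest is the comparison key.
        sub(/^[ \t]*[^ \t]+[ \t]*/, "", key)
        if (!(key in seen)) { seen[key] = 1; print }
    }' "$@"
}

# Newest data first, older data below: the top copy of each entry survives.
kept=$(printf '%s\n' \
    '1700000300 eth0 up' \
    '1700000200 eth1 down' \
    '1700000100 eth0 up' | dedupe_ignoring_first_column)
printf '%s\n' "$kept"
```

Because nothing is sorted, the surviving lines stay in their original order, and which duplicate is kept does not depend on how a particular sort implements -u. The filter holds every distinct entry in memory, so it suits files of moderate size.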