Alternative to sort -ur +1 required


 
# 1  
Old 10-03-2012
I've got scripts trawling the network and dumping parsed text into files with an Epoch timestamp in column 1. I append the old data to the new data then just want to keep the top entry if there is an identical duplicate below (column 1 needs to be ignored).

sort -ur +1 works a treat on a Solaris 8 box but on Solaris 10 the 'r' seems to break!

Can some kind soul offer a fix / workaround?

If you ask me a question please keep it dummy level as I'm not super Unix literate.
# 2  
Old 10-03-2012
What is that + syntax supposed to do? It's not even part of the manual page for my version of sort.

If you just want sorting on the first column, sort -ur -k 1,1 I think (-k 1,2 would span columns 1 through 2).
# 3  
Old 10-03-2012
Solaris had a sort bug once I recall, but it was more subtle. Try using the -k method of specifying fields and sort direction and such. The +1 -2 notation is obsolescent - LINUX does not have it any more, and 0-based! The -k notation is 1-based, not zero-based, which might be more normal human friendly. BTW, +1 says sort on column 2 and following. I suppose column 1 is the file name?
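To make the 0-based versus 1-based point concrete, here is a minimal sketch of how the old +1 might be rewritten with -k. The sample lines are made up, and LC_ALL=C is only there to keep the ordering predictable; none of this has been checked on a Solaris 10 box.

```shell
# Hypothetical sample: Epoch timestamp in field 1, free text after it.
sample='1349300000 eth0 link up
1349200000 eth0 link up
1349300000 eth1 link down'

# Old form:  sort -ur +1      (skip one field, so the key starts at field 2)
# -k form:   sort -u -r -k 2  (key runs from field 2 to the end of the line)
result=$(printf '%s\n' "$sample" | LC_ALL=C sort -u -r -k 2)
printf '%s\n' "$result"
```

Only one of the two "eth0 link up" lines survives. POSIX only promises that -u keeps one line per set of equal keys, not which one, so it is worth checking that the survivor is the one you expect on each system.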

Some sort of persistent Java container could do the testing and storing without a sort, perhaps in a tree. You can put the data into a structure mapped to a flat file, for instance. One possible advantage is that you can prune the set on the fly, if you are not interested in the full set. Also, you can do controlled thread parallelism. It can be a lot faster than sort or an SQL RDBMS ETL approach.

Sort can also be sped up with parallelism, using sort merge and pipes: in bash, or in ksh on nicer systems that have /dev/fd/[0-9]*, with process substitution, or otherwise with named pipes:
Code:
sort -m YOUR_ARGS <(
  sort YOUR_ARGS FILE_LIST_1
 ) <(
  sort YOUR_ARGS FILE_LIST_2
   .
   .
   .
 ) <( 
  sort YOUR_ARGS FILE_LIST_N
 )
 
That is nicer than with named pipes (made with /sbin/mknod NAMED_PIPE_N p):
 
(
sort YOUR_ARGS -oNAMED_PIPE_1 FILE_LIST_1 &
sort YOUR_ARGS -oNAMED_PIPE_2 FILE_LIST_2 &
.
.
.
sort YOUR_ARGS -oNAMED_PIPE_N FILE_LIST_N &
sort -m YOUR_ARGS NAMED_PIPE_1 NAMED_PIPE_2 . . . NAMED_PIPE_N
)
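
For anyone who wants to try the named-pipe variant end to end, a small self-contained sketch follows. The scratch directory, file names and pipe names are all invented for the demo, it uses mkfifo and shell redirection rather than mknod and -o, and it assumes mktemp -d is available.

```shell
# Work in a scratch directory; every file and pipe name here is made up.
dir=$(mktemp -d)
printf '%s\n' pear apple > "$dir/part1"
printf '%s\n' plum banana > "$dir/part2"
mkfifo "$dir/pipe1" "$dir/pipe2"

# Sort each part in the background, each writing to its own named pipe.
sort "$dir/part1" > "$dir/pipe1" &
sort "$dir/part2" > "$dir/pipe2" &

# Merge the already-sorted streams; -m merges without re-sorting.
merged=$(sort -m "$dir/pipe1" "$dir/pipe2")
wait
printf '%s\n' "$merged"
rm -rf "$dir"
```

The per-part sorts and the final merge have to use the same key options, otherwise the merge input is not really in the order the merge expects.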

# 4  
Old 10-03-2012
Hi.

Minor quibble:
Quote:
Originally Posted by DGPickett
... The +1 -2 notation is obsolescent - LINUX does not have it any more, and 0-based! The -k notation is 1-based, not zero-based, which might be more normal human friendly. BTW, +1 says sort on column 2 and following ...
Code:
sort (GNU coreutils) 8.13
OS, ker|rel, machine: Linux, 3.0.0-1-amd64, x86_64
Distribution        : Debian GNU/Linux wheezy/sid

allows old form:
Code:
   On older systems, `sort' supports an obsolete origin-zero syntax
`+POS1 [-POS2]' for specifying sort keys.  The obsolete sequence `sort
+A.X -B.Y' is equivalent to `sort -k A+1.X+1,B' if Y is `0' or absent,
otherwise it is equivalent to `sort -k A+1.X+1,B+1.Y'.

   This obsolete behavior can be enabled or disabled with the
`_POSIX2_VERSION' environment variable (*note Standards conformance::);
it can also be enabled when `POSIXLY_CORRECT' is not set by using the
obsolete syntax with `-POS2' present.

excerpt from info sort

Best wishes ... cheers, drl
# 5  
Old 10-03-2012
Gosh! I'll have a play with the -k option. I read the man page and didn't understand the k bit at all.

I need column 1 (Epoch time stamp) ignoring and the rest of the line taken into account for comparison. It's free text from interfaces so could be any number of words and characters.
# 6  
Old 10-04-2012
With -k, some options now ride inside the -k, like reverse and numeric, so they can vary key by key without ambiguity. Unique -u is global to all keys.
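
A small sketch of options riding inside -k, on made-up sample lines: the first key is the text from field 2 onward, and the second key is the timestamp alone with n (numeric) and r (reverse) attached to it. LC_ALL=C is an assumption to keep the text ordering predictable.

```shell
# Hypothetical sample: Epoch timestamp in field 1, free text after it.
sample='1349200000 eth0 up
1349250000 eth1 down
1349300000 eth0 up'

# Key 1: field 2 to end of line, plain text order.
# Key 2: field 1 only, numeric and reversed, for that key alone.
result=$(printf '%s\n' "$sample" | LC_ALL=C sort -k 2 -k 1,1nr)
printf '%s\n' "$result"
```

Lines with the same text end up next to each other with the newest timestamp first, and no -u is involved, so nothing is dropped here.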

Do you want the whole list, or just the last day's hits or the like? You can write a low latency unique filter that does not sort, using a filtering collection. I posted one I wrote in C using a simple bisection search of an array of pointers: https://www.unix.com/shell-programmin...roup-unix.html
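
This is not the C program linked above, just a minimal awk sketch of the same idea: a filter that does not sort, remembers every line it has seen with column 1 stripped off, and prints only the first occurrence. The sample lines are invented. On Solaris the plain /usr/bin/awk is, as far as I know, the old awk without sub(), so nawk or /usr/xpg4/bin/awk would probably be needed there.

```shell
# Hypothetical sample: new data on top, old data appended below.
sample='1349300000 eth0 link up
1349300000 eth1 link down
1349200000 eth0 link up'

result=$(printf '%s\n' "$sample" | awk '
{
    key = $0
    sub(/^[ \t]*[^ \t]+[ \t]+/, "", key)   # drop the timestamp field from the key
    if (!(key in seen)) {                  # first time this text has appeared
        seen[key] = 1
        print                              # print the whole line, timestamp included
    }
}')
printf '%s\n' "$result"
```

The input order is preserved and the top entry of each duplicate pair is the one kept, at the cost of holding one key per distinct line in memory.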
# 7  
Old 10-08-2012
Column 1 needs to be kept but ignored by sort.

Basically there will often be two entries that differ only in the column 1 timestamp, and I need to keep the top entry.

-k sounds like it could be what I need but the manual is gibberish to me.