The UNIX and Linux Forums  

  #1  
Old 04-29-2008
karthikn7974
Supporter
 

Join Date: Jul 2007
Location: Singapore
Posts: 53
awk error in sorting text file

Hi

I have a file like the one below:

file.txt
Code:
error  Server   Network   Name       Dept   Date            Time
===========================================================================================================================
0     ServerA  LAN1    AAA     IT01    04/30/2008  09:16:26
0     ServerB  LAN1    AAA     IT02    04/30/2008  09:16:26
0     ServerA  LAN1    AAA     IT01    04/30/2008  11:11:26
0     ServerB  LAN1    AAA     IT02    04/30/2008  11:11:26
0     ServerA  LAN1    AAA     IT01    04/29/2008  12:16:26
0     ServerB  LAN1    AAA     IT02    04/30/2008  12:16:26
I am using awk to sort the file, remove duplicates, and display the latest log line for each entry, but I get an error and the syntax is not very clear to me.

Can anyone help me?
Code:
nawk 'END { for (k in r) print r[k] }
/^[0-9]/ { split($6, d, "/")
if (d[3]d[1]d[2]OFS$2 > m[$NF] ) {
  m[$NF] = d[3]d[1]d[2]OFS$2; r[$NF] = $0
  }
next }1' FS="   *" file.txt
Thanks in advance
  #2  
Old 04-30-2008
era
Herder of Useless Cats
 

Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,650
If the code is not correct, then can you describe what it's supposed to do, in some detail?
  #3  
Old 05-05-2008
karthikn7974
Supporter
 

Join Date: Jul 2007
Location: Singapore
Posts: 53
I need to remove duplicate lines that satisfy the condition below:
if error, server, network, dept and date are all the same, keep the line with the latest time and remove the older duplicates.
  #4  
Old 05-06-2008
era
Herder of Useless Cats
 

Join Date: Mar 2008
Location: /there/is/only/bin/sh
Posts: 3,650
Your code collects a unique line per time stamp ($NF is the last field on the line, the time stamp), not per the criteria you listed.

I don't know what the FS="   *" part is supposed to do. The regular whitespace separation that awk uses by default should work, and the FS looks like it's more or less the same thing anyway (not sure if you have tabs in there or not).
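As a quick check of the point about field separators, here is a minimal sketch that splits one row from file.txt both ways. It assumes plain `awk` in place of `nawk`, and it spells the "two or more spaces" separator as `[ ][ ][ ]*` so the spaces stay visible; both choices are illustrative and not from the original post.

```shell
# One data row copied from file.txt; the columns are separated by runs of spaces.
line='0     ServerA  LAN1    AAA     IT01    04/30/2008  09:16:26'

# Default splitting: any run of blanks (spaces or tabs) separates fields.
default_split=$(printf '%s\n' "$line" | awk '{ print NF, $2, $7 }')

# Explicit separator: two or more spaces, written with bracket expressions.
explicit_split=$(printf '%s\n' "$line" | awk -F'[ ][ ][ ]*' '{ print NF, $2, $7 }')

echo "$default_split"
echo "$explicit_split"
```

For this row both commands report seven fields, with ServerA as $2 and the time as $7. The two would only differ if a column were separated by a single space or by a tab.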

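The keys you want to use are $1 (error), $2 (server), $3 (network), $5 (dept), and $6 (date); $4 is the name column, which is not in your list, and $7 is the time. You probably want to do the arithmetic normalization on the time field, not on the date.

Code:
nawk '/^[0-9]/ { split($7, z, ":")      # z[1], z[2], z[3] = hours, minutes, seconds
  k=$1 OFS $2 OFS $3 OFS $5 OFS $6;     # error, server, network, dept, date
  t=z[1]z[2]z[3];                       # time as HHMMSS
  if(t > m[k] ) {
    m[k] = t; r[k] = $0
  }
next }1
END { for (k in r) print r[k] }' file.txt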
I moved the END to the end (sic) purely for readability reasons; awk doesn't care much where in the script you put it.
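To illustrate the point about END placement, here is a minimal sketch with a hypothetical two-line input, again assuming plain `awk` rather than `nawk`. The same rules are written in both orders.

```shell
# END written first: it still runs only after all input has been read.
end_first=$(printf 'a\nb\n' | awk 'END { print n } { n++ }')

# END written last: same behavior.
end_last=$(printf 'a\nb\n' | awk '{ n++ } END { print n }')

echo "$end_first $end_last"
```

Both variants count the two input lines before the END block prints the total.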

So t contains the time stamp from $7 with the colons removed, and k is the combination of the fields you want to compare time stamps for (error, server, network, dept, date). If t is bigger than the old t you have for this k in m[k] (or it doesn't exist, meaning it's effectively zero), replace it, and remember the whole line in r[k]. Finally print all the lines in r.
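The t and k logic described above can be tried on the data rows from the first post. This is only a sketch under some assumptions: it uses plain `awk` instead of `nawk`, takes dept and date from $5 and $6 in the seven-column layout of file.txt, and pipes the result through `sort` because the order of `for (k in r)` is not defined.

```shell
# Data rows of file.txt from the first post (header lines omitted).
data='0     ServerA  LAN1    AAA     IT01    04/30/2008  09:16:26
0     ServerB  LAN1    AAA     IT02    04/30/2008  09:16:26
0     ServerA  LAN1    AAA     IT01    04/30/2008  11:11:26
0     ServerB  LAN1    AAA     IT02    04/30/2008  11:11:26
0     ServerA  LAN1    AAA     IT01    04/29/2008  12:16:26
0     ServerB  LAN1    AAA     IT02    04/30/2008  12:16:26'

latest=$(printf '%s\n' "$data" | awk '/^[0-9]/ {
  split($7, z, ":")
  k = $1 OFS $2 OFS $3 OFS $5 OFS $6   # error, server, network, dept, date
  t = z[1] z[2] z[3]                   # HHMMSS, compared as a string
  if (t > m[k]) { m[k] = t; r[k] = $0 }
}
END { for (k in r) print r[k] }' | sort)

printf '%s\n' "$latest"
```

With these rows, three lines should survive: ServerA on 04/29/2008 at 12:16:26, ServerA on 04/30/2008 at 11:11:26, and ServerB on 04/30/2008 at 12:16:26.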

Oh, the single number one after the closing brace is significant, too; it causes the header lines to be printed. If you don't want to print them, take it out. (It's a shorthand; it says "for any remaining line -- for which 1 is true, which by definition it is; this thus means all remaining lines, excluding any which were already handled earlier in the script -- do the default action, which is to print the line.")
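The effect of the trailing 1 can be seen in isolation with a small sketch. The three input lines are hypothetical stand-ins for a header, a separator, and a data row, and plain `awk` is assumed in place of `nawk`.

```shell
# Lines starting with a digit are consumed by the first rule and skipped with next;
# everything else falls through to the bare 1, whose default action is to print.
out=$(printf 'header\n=====\n0 ServerA\n' |
  awk '/^[0-9]/ { next }
       1')

echo "$out"
```

Only the header and separator lines are printed. Removing the 1 leaves the script with no rule that prints them, so the output would be empty.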

Last edited by era; 05-06-2008 at 01:26 AM. Reason: m[k] is effectively zero if it's not defined; single 1 prints header
All times are GMT -7.

