The UNIX and Linux Forums  
06-05-2008
era (Forum Advisor)
So if the tenth field is identical to one we have seen before, remove the whole line?

Code:
perl -ane 'print unless $seen{$F[9]}++' my_log.txt
(Perl arrays are indexed from zero, so $F[9] is the tenth field. The -a option, used together with -n, makes Perl split each input line on whitespace into the array @F, much as awk splits a record into fields.)

The input line is printed unless the hash %seen already has a nonzero entry for the tenth field. The post-increment (++) then adds one to that entry, which creates it (with the value one) if it didn't exist before. Thus, the %seen entry for the current tenth-field value will already be set the next time that value is encountered, and that later line is skipped.

For the sample file you posted, this reduces 1,274 lines to just 72 lines.