Need awk script for removing duplicate records


 
# 1  05-21-2011

I have a log file containing traffic lines:
Code:
2011-05-21 15:11:50.356599  TCP (6), length: 52) 10.10.10.1.3020 > 10.10.10.254.50404: 
2011-05-21 15:11:50.652739  TCP (6), length: 52) 10.10.10.254.50404 > 10.10.10.1.3020: 
2011-05-21 15:11:50.652558  TCP (6), length: 89) 10.10.10.1.3020 > 10.10.10.254.50404: 
2011-05-21 15:11:50.852325  TCP (6), length: 32) 10.10.10.1.3020 > 10.10.10.254.50404:

The idea is to collapse lines that are repeated more than once into a single record, report how many times each line is repeated, and sum the length fields. I also want to rearrange the fields to match the following layout:
Code:
2011-05-21 15:11:50.356599  TCP (6)  length 173  10.10.10.1    3020   >  10.10.10.254  50404  3
2011-05-21 15:11:50.652739  TCP (6)  length  52  10.10.10.254  50404  >  10.10.10.1    3020   1

I managed to get the following result, but it is not enough:
Code:
awk '{x[substr($0,28)]++; y[substr($0,28)]=$2} END {for (i in x) printf "%s %d\n", y[i] i, x[i]}' file.txt

Code:
15:11:50.356599  TCP (6),  length: 52 10.10.10.1.3020 > 10.10.10.254.50404   3
15:11:50.652739  TCP (6),  length: 52 10.10.10.254.50404 > 10.10.10.1.3020  1


# 2  05-26-2011
I can see the length changing with no obvious pattern, dots becoming spaces or tabs, records being counted, and the time field being saved or overwritten, but I do not have a clear statement of your requirements.
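Guessing at the requirement: key each record on its "src > dst" pair, keep the first timestamp seen for a pair, sum the length fields, count the repeats, and split host from port on the last dot. A rough, untested sketch along those lines, assuming the fixed whitespace-separated field positions of your sample ($1 date, $2 time, $3-$4 protocol, $6 the length with a stray ")", $7 source, $9 destination with a trailing ":"); note that awk's for (i in arr) traversal order is unspecified, so the output may not come out in input order.
Code:
#!/usr/bin/awk -f
# Sketch only -- field positions assumed as in the sample above.

# "10.10.10.1.3020" -> "10.10.10.1  3020": the last dot-separated piece is the port
function splitaddr(a,    n, p, host, i) {
    n = split(a, p, "\\.")
    host = p[1]
    for (i = 2; i < n; i++)
        host = host "." p[i]
    return host "  " p[n]
}

{
    len = $6; sub(/\)$/, "", len)          # "52)" -> "52"
    dst = $9; sub(/:$/, "", dst)           # drop the trailing colon
    key = $7 " > " dst
    if (!(key in count)) {                 # first occurrence: keep its timestamp
        stamp[key] = $1 " " $2
        proto[key] = $3 " " $4
        sub(/,$/, "", proto[key])          # "TCP (6)," -> "TCP (6)"
    }
    sum[key] += len                        # running total of the length field
    count[key]++                           # how many times this pair was seen
}

END {
    for (k in count) {
        split(k, pair, " > ")
        printf "%s  %s  length %3d  %s  >  %s  %d\n",
               stamp[k], proto[k], sum[k],
               splitaddr(pair[1]), splitaddr(pair[2]), count[k]
    }
}

Saved as, say, dedup.awk and run with awk -f dedup.awk file.txt, this prints one line per direction, e.g. length 173 (52+89+32) and a count of 3 for the three 10.10.10.1.3020 > 10.10.10.254.50404 records.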