deleting double entries in a log file


 
# 1  
Old 09-16-2002
deleting double entries in a log file

Hi Folks,

I have an Apache log file that contains double entries (though not every line appears twice).

How can I automatically delete the first line of each double entry?

Your help is greatly appreciated.

Thanks,

Klaus

Here is what the log file looks like

217.81.190.164 - - [28/Aug/2002:00:16:33 +0200] "GET /rmg/w4w/1000689.htm HTTP/1.1" 200 2409
217.81.190.164 - - [28/Aug/2002:00:16:33 +0200] "GET /rmg/w4w/1000689.htm HTTP/1.1" 200 2409 "http://www.opusforum.org/rmg/w4w/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
217.81.190.164 - - [28/Aug/2002:00:17:01 +0200] "GET /rmg/vec/ HTTP/1.1" 200 2631
217.81.190.164 - - [28/Aug/2002:00:17:01 +0200] "GET /rmg/vec/ HTTP/1.1" 200 2631 "http://www.opusforum.org/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
217.81.190.164 - - [28/Aug/2002:00:17:03 +0200] "GET /rmg/vec/1000868.htm HTTP/1.1" 200 2386
217.81.190.164 - - [28/Aug/2002:00:17:03 +0200] "GET /rmg/vec/1000868.htm HTTP/1.1" 200 2386 "http://www.opusforum.org/rmg/vec/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)"
213.23.52.237 - - [28/Aug/2002:00:17:10 +0200] "GET / HTTP/1.0" 200 16327
# 2  
Old 09-16-2002
How about:
uniq <inputfile >outputfile
# 3  
Old 09-16-2002
Re: deleting double entries in a log file

Quote:
Originally posted by opusforum
I have an Apache log file that has double entries (however not all lines appear twice).
In this situation, I like to let a Perl hash do the dirty work for me.

Something like this:

Code:
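#!/usr/bin/perl
use strict;
use warnings;

# Open the log for reading; report the reason if that fails.
open(my $log, '<', 'myLogFile') or die "Cannot open myLogFile: $!";

my %seen;    # lines that have already been printed

while (my $line = <$log>) {
    # Print a line only the first time it appears.
    next if $seen{$line}++;
    print $line;
}

close($log);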

That should remove the dupe entries. Just redirect the output to a new log.
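
For comparison, the same "remember what has been seen" idea can be sketched as an awk one-liner wrapped in a shell function. This is only an illustration: dedupe_exact is a made-up helper name, and like the Perl script it only catches lines that are exactly identical.

```shell
#!/bin/sh
# dedupe_exact: hypothetical helper, not from the thread.
# Drops exact duplicate lines, keeping the first occurrence of each.
dedupe_exact() {
    # seen[$0]++ evaluates to 0 the first time a line is read, so the
    # pattern !seen[$0]++ is true only then and awk prints the line.
    awk '!seen[$0]++' "$@"
}

# Demo on made-up input: the second "a" and the second "b" are dropped.
printf 'a\nb\na\nc\nb\n' | dedupe_exact
```

Unlike uniq, this does not need the duplicates to be adjacent, but it keeps every distinct line in memory.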
# 4  
Old 09-16-2002
Quote:
Originally posted by Perderabo
How about:
uniq <inputfile >outputfile
Oh sure, do it the eeaaasssy way!
# 5  
Old 09-16-2002
unfortunately doesn't work

Hi Folks,

Thanks a lot for your suggestions. Unfortunately, neither of them works.

The "uniq" solution needs "-w 50" in order to detect the double entries. However, it keeps the first line of each pair, whereas I need the second (the line with the additional information).

The Perl script doesn't give me the result I need because it compares whole lines, and the lines are not exact duplicates (only the first 50 characters or so match).

Any refinements that would make these solutions work? I am sure we are close.

Thanks

Klaus
# 6  
Old 09-17-2002
the answer

I made it!

Here is what worked for me:

perl -e 'print reverse <>' logfile|uniq -w 50|perl -e 'print reverse <>' >logfile.done

First, the log file is reversed (line by line), then the dupes are removed, and finally the result is reversed again.

uniq keeps the first line of each run of matching lines, so the reversal is needed to make it drop the first line of each duplicate pair instead of the second.
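
The -w option is a GNU extension to uniq, so it may be missing on other systems. As an alternative, the same "keep the last line of each run" behaviour can be sketched in plain awk without reversing the file. This is not the command from the post above: keep_last_of_prefix is a made-up helper name, and the demo lines are invented and much shorter than real log lines.

```shell
#!/bin/sh
# keep_last_of_prefix: hypothetical helper, not from the thread.
# For each run of adjacent lines sharing the same first N characters,
# print only the last line of the run.
# Usage: keep_last_of_prefix N [file ...]
keep_last_of_prefix() {
    width=$1
    shift
    awk -v n="$width" '
        # A new prefix starts: the previous line was the last of its run.
        NR > 1 && substr($0, 1, n) != key { print prev }
        # Remember the current line and its prefix.
        { prev = $0; key = substr($0, 1, n) }
        # The final line always ends a run (skip if the input was empty).
        END { if (NR > 0) print prev }
    ' "$@"
}

# Demo with invented lines and a width of 5; for the log in this thread
# the width would be 50.
printf '%s\n' 'AAAAA' 'AAAAA extra' 'BBBBB' 'CCCCC' 'CCCCC extra' |
    keep_last_of_prefix 5
```

For the log file discussed here, the call would look like keep_last_of_prefix 50 logfile >logfile.done, assuming duplicates always sit on adjacent lines.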

Thanks for your contributions, folks. They pointed me in the right direction.

Klaus
# 7  
Old 03-13-2009
uniq -c <file1 >file2 would write each distinct entry to the second file once, prefixed with the number of times it was repeated.


Regards,
uniesh
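
A small sketch of what uniq -c reports, using made-up input. count_adjacent is a hypothetical wrapper name. Note that uniq only compares adjacent lines, so a line that shows up again later is counted as a separate run.

```shell
#!/bin/sh
# count_adjacent: hypothetical wrapper around uniq -c, not from the thread.
# Prints each run of identical adjacent lines once, prefixed by its length.
count_adjacent() {
    uniq -c "$@"
}

# Demo: "a" appears three times in total, but as a run of 2 and a run of 1.
# The width of the padding in front of each count depends on the uniq
# implementation.
printf 'a\na\nb\na\n' | count_adjacent
```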
