How to search backwards in a log file by timestamp of entries?


 
# 1  
Old 09-23-2009

Hello. I'm not nearly good enough with awk/perl to create the logfile scraping script that my boss is insisting we need immediately. Here is a brief 3-line excerpt from the access.log file in question (actual URL domain changed to 'aaa.com'):

Code:
209.253.130.36 - - [23/Sep/2009:12:55:44 -0700] "GET /images/products/en_us/pc/detail/273595_dt.jpg HTTP/1.1" 200 28520 "http://www.aaa.com/product/holiday+parties/halloween+party+supplies.do?" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.0; FunWebProducts; .NET CLR 1.1.4322)" 22134 "__utma=8470452.136497171.1253643073.1253655989.1253731688.3; __utmb=8470452.4.10.1253731688; __utmz=8470452.1253643073.1.1.utmcsr=(direct)|utmccn=(direct)|utmcmd=(none); s_cc=true"
99.60.55.157 - - [23/Sep/2009:12:55:45 -0700] "GET /mod/productquickview/includes/themes/default.css HTTP/1.1" 200 767 "http://www.aaa.com/home.do?" "Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.14) Gecko/2009082707 Firefox/3.0.14 (.NET CLR 3.5.30729)" 14097 "customer=none; basket=none; __utma=8470452.1058319807.1252542208.1252547047.1252713609.3; __utmz=8470452.1252542208.1.1.utmcsr=yahoo|utmccn=(organic)|utmcmd=organic|utmctr=aaa; JSESSIONID=j0d7VJsXNBv6ztnpOp"
198.7.255.226 - - [23/Sep/2009:12:55:46 -0700] "GET /images/products/en_us/gateways/costumes_R_01_C_01.jpg HTTP/1.1" 200 30097 "http://www.aaa.com/category/costumes+%26+accessories.do" "Mozilla/5.0 (Windows; U; Windows NT 6.0; en-US; rv:1.9.0.14) Gecko/2009082707 Firefox/3.0.14 (.NET CLR 3.5.30729)" 12334 "s_cc=true"

So each line starts with an IP address, followed by the date and then the time. We only want to search the last 10 minutes of the file (say, if the current time is 11:40, we only want to look at lines going back to 11:30). I've got the code to convert the current time into a scalar, subtract 600 seconds, and store the resulting time as single-character variables (i.e. $a = 1, $b = 1, $c = 3, $d = 0).

But I need help with an awk (or other?) command that will parse each entry in the log file, skip over the IP and the date, and match against the timestamp only. What's more, we'd like it to start from the bottom of the file (i.e. with the most recent entry) and work backwards, and then ideally stop the search when it hits the first entry that does NOT fall within the past 10 minutes (because the log file is very, very large!).

Any and all help or suggestions would be monumentally appreciated.
# 2  
Old 09-27-2009
Use the File::ReadBackwards module, which reads a file line by line starting from the end.

see how-to: reading a file backwards | Perl HowTo

Code:
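#!/usr/bin/perl
use strict;
use warnings;
use File::ReadBackwards;
use Time::Local;

# month name => 0-based month number, as timelocal() expects
my %mon;
@mon{qw(Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec)} = (0 .. 11);
my ($max, @lines);

my $fh = File::ReadBackwards->new('access.log') or die "can't read file: $!\n";

while ( defined(my $line = $fh->readline) )
{
  # capture day, month, year, hour, minute, second from [23/Sep/2009:12:55:44 -0700]
  if ($line =~ m{\[(\d+)/(\w+)/(\d+):(\d+):(\d+):(\d+) })
  {
      my $time = timelocal($6, $5, $4, $1, $mon{$2}, $3);
      #get max time on first iteration (the last line of the file is the newest)
      $max = $time unless defined $max;
      #check time against max time; if within range, add to array
      #otherwise, exit the loop
      last if $time < $max - 600;
      push @lines, $line;
  }
}
#lines were collected newest first; reverse to restore file order
foreach my $line (reverse @lines)
{
   #process each line as needed
   print $line;
}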

# 3  
Old 09-27-2009
For the backwards part, you can use tac. I had some doubts about its efficiency for large files but I just did some tests and, to my great surprise, it is almost as efficient as cat.

Now the parse and time test part. Prerequisites:

- the log file has exactly the same format as the sample you provided. Otherwise you can adjust the field offsets by playing around with the field numbers ($5, $6, ...) passed to mktime()
- you have GNU awk at hand. That's for the systime() and mktime() functions. If not, see remark below.

parselog.awk
Code:
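BEGIN{
    # split on space, "/", ":" and "[" so the date and time parts become separate fields
    FS="[ /:[]"
    now=systime()
    str="Jan_Feb_Mar_Apr_May_Jun_Jul_Aug_Sep_Oct_Nov_Dec"
    split(str, m, "_")
    # map month name to month number
    for (i in m) mm[m[i]]=i
}
{
    # $5=day $6=month name $7=year $8=hour $9=minute $10=second
    timestamp=mktime(sprintf("%s %s %s %s %s %s", $7,mm[$6],$5,$8,$9,$10))
    if (timestamp < (now-600)){
        exit
    }
    print
}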

To run that snippet:
Code:
$ tac your.log | awk -f parselog.awk

The awk program exits as soon as it hits a line with a timestamp that is more than 10 minutes old. Because tac feeds it the newest lines first, that exit prevents awk from scanning the remaining lines, which we know will never satisfy the timestamp condition.
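
If your log format differs from the sample, a quick way to find the right field numbers is to print the fields of one line using the same separator as parselog.awk. This is only a sketch; show_fields is a made-up helper name, and the limit of 11 fields is simply enough to cover the timestamp in the sample format.

```shell
# Print the first 11 fields of the first line of a log file,
# split with the same separator as parselog.awk.
# show_fields is a hypothetical helper name, not part of any tool.
show_fields() {
    head -n 1 "$1" | awk -F'[ /:[]' '{
        for (i = 1; i <= 11; i++) printf "$%d = %s\n", i, $i
    }'
}
```

Run as `show_fields your.log`. For the sample lines above, the day, month name, year, hour, minute and second should show up as $5 through $10, which are the fields parselog.awk passes to mktime().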

If you don't have GNU awk, let us know. There is a workaround using awk's system() I/O function and the shell date command.
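
One possible shape of such a workaround, not necessarily the one meant above: instead of calling date from inside awk, let the shell compute the cutoff once as a sortable YYYYMMDDhhmmss string and have awk build the same kind of string from each log line, so no mktime() or systime() is needed. The sketch assumes the same field layout as parselog.awk and a two-digit day in the log; recent_lines is a made-up function name.

```shell
# Print, newest first, the log lines stamped at or after a cutoff.
# $1: cutoff as YYYYMMDDhhmmss, in the same time zone as the log
# $2: log file
# recent_lines is a hypothetical helper name.
recent_lines() {
    tac "$2" | awk -v cutoff="$1" '
    BEGIN {
        FS = "[ /:[]"
        n = split("Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec", m, " ")
        # month name => two-digit month number
        for (i = 1; i <= n; i++) mm[m[i]] = sprintf("%02d", i)
    }
    {
        # $5=day $6=month name $7=year $8=hour $9=minute $10=second
        stamp = $7 mm[$6] $5 $8 $9 $10
        # both sides are fixed-width digit strings, so string order is time order
        if ((stamp "") < (cutoff "")) exit
        print
    }'
}
```

The cutoff itself still has to come from somewhere. With GNU date (an assumption; the relative -d syntax is not portable) it could be called as `recent_lines "$(date -d '10 minutes ago' +%Y%m%d%H%M%S)" access.log`.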