Need Time Stamp Range On Log Files


 
# 29  
Old 07-06-2015
Using this code
Code:
$5 ~ "-0[45]00]" {

I get
Code:
./modified_gawk.sh "01 Mar 2015" 01:23:47 "02 Jul 2015" 01:55:58
Examining from Sun Mar  1 01:23:47 EST 2015 (1425191027)
            to Thu Jul  2 01:55:58 EDT 2015 (1435816558)

$1=3.1.20.15
$2=-
$3=-
$4=[01/Mar/2015:01:23:46
$5=-0500]
$6="Test
$7=in
$8=EST"
Processing /data/log/test2.log file
mtkime(2015 3 01 01 23 46): 1425191026
1425191026 not between 1425191027 and 1435816558
$1=3.1.20.15
$2=-
$3=-
$4=[01/Mar/2015:01:23:47
$5=-0500]
$6="Test
$7=in
$8=EST"
mtkime(2015 3 01 01 23 47): 1425191027
$1=54.86.148.217
$2=-
$3=-
$4=[02/Jul/2015:01:55:57
$5=-0400]
$6="HEAD
$7=/content/422-ahmunbelief
$8=HTTP/1.1"
$9=200
$10=-
$11="-"
$12="Sphider"
mtkime(2015 7 02 01 55 57): 1435816557
$1=184.98.149.48
$2=-
$3=-
$4=[02/Jul/2015:01:55:59
$5=-0400]
$6="GET
$7=/themes/warehouse/js/script.js
$8=HTTP/1.1"
$9=200
$10=1313
$11="https://www.google.com/"
$12="Mozilla/5.0
$13=(Linux;
$14=Android
$15=5.0;
$16=SM-N900T
$17=Build/LRX21V)
$18=AppleWebKit/537.36
$19=(KHTML,
$20=like
$21=Gecko)
$22=Chrome/43.0.2357.93
$23=Mobile
$24=Safari/537.36"
mtkime(2015 7 02 01 55 59): 1435816559
1435816559 not between 1425191027 and 1435816558
$1=184.98.149.48
$2=-
$3=-
$4=[02/Jul/2015:01:55:59
$5=-0400]
$6="GET
$7=/themes/warehouse/cache/50ca4d40aa6b13dfe15d7583bbe75eea.js
$8=HTTP/1.1"
$9=200
$10=69947
$11="https://www.google.com/"
$12="Mozilla/5.0
$13=(Linux;
$14=Android
$15=5.0;
$16=SM-N900T
$17=Build/LRX21V)
$18=AppleWebKit/537.36
$19=(KHTML,
$20=like
$21=Gecko)
$22=Chrome/43.0.2357.93
$23=Mobile
$24=Safari/537.36"
mtkime(2015 7 02 01 55 59): 1435816559
1435816559 not between 1425191027 and 1435816558
      1 3.1.20.15
      1 1.1.1.1

Using this code
Code:
$5 == "-0400]" || $5 == "-0500]" {

I get
Code:
./modified_gawk.sh "01 Mar 2015" 01:23:47 "02 Jul 2015" 01:55:58
Examining from Sun Mar  1 01:23:47 EST 2015 (1425191027)
            to Thu Jul  2 01:55:58 EDT 2015 (1435816558)

$1=3.1.20.15
$2=-
$3=-
$4=[01/Mar/2015:01:23:46
$5=-0500]
$6="Test
$7=in
$8=EST"
Processing /data/log/test2.log file
mtkime(2015 3 01 01 23 46): 1425191026
1425191026 not between 1425191027 and 1435816558
$1=3.1.20.15
$2=-
$3=-
$4=[01/Mar/2015:01:23:47
$5=-0500]
$6="Test
$7=in
$8=EST"
mtkime(2015 3 01 01 23 47): 1425191027
$1=1.1.1.1
$2=-
$3=-
$4=[02/Jul/2015:01:55:57
$5=-0400]
$6="HEAD
$7=/content/422-ahmunbelief
$8=HTTP/1.1"
$9=200
$10=-
$11="-"
$12="Sphider"
mtkime(2015 7 02 01 55 57): 1435816557
$1=2.2.2.2
$2=-
$3=-
$4=[02/Jul/2015:01:55:59
$5=-0400]
$6="GET
$7=/themes/warehouse/js/script.js
$8=HTTP/1.1"
$9=200
$10=1313
$11="https://www.google.com/"
$12="Mozilla/5.0
$13=(Linux;
$14=Android
$15=5.0;
$16=SM-N900T
$17=Build/LRX21V)
$18=AppleWebKit/537.36
$19=(KHTML,
$20=like
$21=Gecko)
$22=Chrome/43.0.2357.93
$23=Mobile
$24=Safari/537.36"
mtkime(2015 7 02 01 55 59): 1435816559
1435816559 not between 1425191027 and 1435816558
$1=2.2.2.2
$2=-
$3=-
$4=[02/Jul/2015:01:55:59
$5=-0400]
$6="GET
$7=/themes/warehouse/cache/50ca4d40aa6b13dfe15d7583bbe75eea.js
$8=HTTP/1.1"
$9=200
$10=69947
$11="https://www.google.com/"
$12="Mozilla/5.0
$13=(Linux;
$14=Android
$15=5.0;
$16=SM-N900T
$17=Build/LRX21V)
$18=AppleWebKit/537.36
$19=(KHTML,
$20=like
$21=Gecko)
$22=Chrome/43.0.2357.93
$23=Mobile
$24=Safari/537.36"
mtkime(2015 7 02 01 55 59): 1435816559
1435816559 not between 1425191027 and 1435816558
      1 3.1.20.15
      1 1.1.1.1

Thanks for sticking with me on this.
# 30  
Old 07-06-2015
Quote:
Originally Posted by sharingsunshine
Using this code
Code:
$5 ~ "-0[45]00]" {

I get
Code:
./modified_gawk.sh "01 Mar 2015" 01:23:47 "02 Jul 2015" 01:55:58
Examining from Sun Mar  1 01:23:47 EST 2015 (1425191027)
            to Thu Jul  2 01:55:58 EDT 2015 (1435816558)

... ... ...
      1 3.1.20.15
      1 1.1.1.1

Using this code
Code:
$5 == "-0400]" || $5 == "-0500]" {

I get
Code:
./modified_gawk.sh "01 Mar 2015" 01:23:47 "02 Jul 2015" 01:55:58
Examining from Sun Mar  1 01:23:47 EST 2015 (1425191027)
            to Thu Jul  2 01:55:58 EDT 2015 (1435816558)

... ... ...
      1 3.1.20.15
      1 1.1.1.1

Thanks for sticking with me on this.
You're welcome. And, it looks like we're getting exactly what you want now. So, turn off debugging, take out the line:
Code:
{for(i=1;i<=NF;i++) printf "$%d=%s\n", i, $i }

(or, preferably, change it to:
Code:
debug{for(i=1;i<=6;i++) printf "%d=%s\n", i, $i }

in case you ever need to turn debugging back on in the future), and change the last line of the awk script from:
Code:
END {for(ip in C) printf "%7d %s\n", C[ip], ip} ' /data/log/test.log

back to:
Code:
END {for(ip in C) printf "%7d %s\n", C[ip], ip} ' $FILES

and you should get what you want from your real data without the debugging info. (And, it should continue working when we shift back to standard time on November 1st.)
# 31  
Old 07-06-2015
This is great, thanks for your help. I also plan to find the top 5 entries in each log file during the range of time. So I'll use what you have given me and then figure out how to do that too.

Once again, thanks for all your help.

Hope it can help someone else too.
# 32  
Old 07-07-2015
Sorry, I had some real-life issues that have kept me away from this thread. I'm so glad and grateful that Don Cragun was able to assist with resolving this issue for you.

I'm a little red-faced about that off-by-one issue with the month decoding, but happy to see you have a working solution now. It's worth setting yourself a calendar reminder to check, when daylight saving next changes, that we don't end up with a 1-hour error.
# 33  
Old 07-08-2015
Hi Chubler,

I certainly understand when life situations change our schedules and priorities. Don was able to keep me going in grand fashion. I am just grateful you took the time to write such a complete set of code in the first place.

I will keep that in mind about checking for DST being on or off depending on the time of year.

I do wonder if you would mind pasting an explanation of your code on the thread. I need it to know where to modify the code to show only the top 5 entries based on the number of visits during the range I have specified. When I went to try and find the array that is creating the output I really couldn't understand the logic behind it.

Also, since this is such a complete thread, I am sure that having the code explained will greatly help others who want to use or piggyback on what has been accomplished.
# 34  
Old 07-08-2015
Code:
if (( $# < 3 || $# > 4 ))
then
   printf 'Usage: %s from_date from_time [to_date] to_time\n' "$0" >&2
   exit 2
fi

This just prints the usage string and terminates with exit code 2 if the number of passed arguments is not 3 or 4.

Code:
FDAY=$1
FTIME=$2

if (( $# == 3 ))
then
    TDAY=$FDAY
    TTIME=$3
else
    # With 4 arguments the order is: from_date from_time to_date to_time
    TDAY=$3
    TTIME=$4
fi

This sets FDAY, FTIME and TDAY, TTIME from the passed arguments. When only 3 arguments are passed, TDAY defaults to FDAY.

Code:
FROM=$(date -d "$FDAY $FTIME" +%s)
(($? != 0 )) && exit 3
TO=$(date -d "$TDAY $TTIME" +%s)
(($? != 0 )) && exit 4

Calculate FROM and TO as seconds since the epoch (midnight UTC, 1 January 1970). Any error message from date is allowed through to the terminal, and the script exits with 3 for an invalid from date/time or 4 for an invalid to date/time.
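As a quick check of this step, here is a minimal sketch that converts the start time used earlier in the thread to epoch seconds and back again. It assumes GNU date (for -d and @seconds), and it pins the time zone with a POSIX TZ string for US Eastern time. That TZ setting is my assumption, added only so the result does not depend on the machine's own zone.

```shell
# Assumption: GNU date, and US Eastern rules pinned via a POSIX TZ string.
export TZ='EST5EDT,M3.2.0,M11.1.0'

# Same conversion the script performs for FROM.
FROM=$(date -d "01 Mar 2015 01:23:47" +%s) || exit 3
echo "FROM=$FROM"

# Convert back to a readable date to confirm the round trip.
BACK=$(date -d "@$FROM" '+%Y-%m-%d %H:%M:%S %Z')
echo "BACK=$BACK"
```

This should report FROM=1425191027, the same value shown in the "Examining from" line of the debug output above.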

Code:
if (( $# == 3 && TO < FROM ))
then
   # FROM time is later than TO time, so add a day
   (( TO += 3600*24 ))
fi

Here, if there are 3 arguments and the TO time is earlier than the FROM time (eg 9pm to 1am), move the TO date to the next day. 3600 is the number of seconds in 1 hour; multiplying by 24 gives 1 day's worth of seconds. Remember these dates are seconds past the epoch.
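The wrap past midnight can be tried in isolation with stand-in numbers. This is only a sketch: NARGS is a hypothetical substitute for $#, the two times are seconds into an arbitrary day rather than real dates, and the test is written with [ ] so it also runs in a plain POSIX shell (it is meant to be equivalent to the (( )) form above).

```shell
# Stand-in values: 21:00 and 01:00 expressed as seconds into the same day.
FROM=$(( 21 * 3600 ))
TO=$(( 1 * 3600 ))
NARGS=3    # hypothetical stand-in for $# in the real script

if [ "$NARGS" -eq 3 ] && [ "$TO" -lt "$FROM" ]
then
    # TO is earlier in the day than FROM, so treat it as the following day.
    TO=$(( TO + 3600 * 24 ))
fi

SPAN=$(( TO - FROM ))
echo "FROM=$FROM TO=$TO span=${SPAN}s"
```

The span comes out as 14400 seconds, ie the 4 hours from 9pm to 1am. One thing to keep in mind: adding a fixed 86400 seconds will be an hour out if the clocks change inside that window.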

Code:
if (( TO < FROM ))
then
    echo "$0: FROM date must be before TO date" >&2
    exit 5
fi

Trap error where TO date is before FROM and exit with 5.

Code:
echo "Examining from $(date -d @$FROM) ($FROM)"
echo "            to $(date -d @$TO) ($TO)"
echo

Display confirmation that the calculated dates match what was requested. This is quite useful, as date can accept strings like "today" or "yesterday" and it's good to be specific about the range that is going to be checked.

Code:
gawk -v F=$FROM -v T=$TO -v debug=0 '

This uses GNU awk, which is needed because the time/date functions are not supported in standard awk.
Pass shell $FROM in as variable F and $TO as variable T. Variable debug is set to 0 for false (non-zero is true).

Code:
debug{for(i=1;i<=NF;i++) printf "$%d=%s\n", i, $i }

This debug line outputs each field awk has split from the input record.

Code:
FNR==1 {
    for(ip in C) printf "%7d %s\n", C[ip], ip
    delete C
    print "Processing " FILENAME " file"
}

When processing the first record of a file, output the contents of the C[] array from the previous file, then clear it. The %7d format right-justifies the count in a field 7 characters wide.
Note: This is also done in the END block to get the counts for the last file processed.

Code:
$5 ~ "-0[45]00]" {

Only process rows where field number 5 is "-0400]" or "-0500]" (the closing bracket is part of the field). This skips records from other timezones and non-valid log lines (eg headers or other record types).

Code:
split($4,v,"[[/: ]")

Split field 4 into array v using left square bracket, slash, colon or space as separators, so [02/Jul/2015:01:55:59 gives:
Code:
v[1]=
v[2]=02
v[3]=Jul
v[4]=2015
v[5]=01
v[6]=55
v[7]=59

Code:
mnum=index("xxJanFebMarAprMayJunJulAugSepOctNovDec", v[3])/3

Calculate the month number from the 3-character short month name. index() returns the position in the string at which v[3] starts, so Jul gives 21. Dividing by 3 then gives the correct month number (eg Sep=9, Dec=12).

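Code:
tm=mktime(v[4] " " mnum " " v[2] " " v[5] " " v[6] " " v[7])

mktime() requires a string in "YYYY MM DD HH MM SS" format and interprets it as local time. Note that if the string cannot be parsed, mktime() returns -1, which will not be between the F and T values, so nothing will be counted. Be aware, though, that the gawk manual says out-of-range values are normalized rather than rejected, so an mnum of 0 (unrecognized month name) or an impossible date (eg 30 Feb 2015) is shifted to a nearby valid date instead of producing -1.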

Code:
if (tm >= F && tm <= T) C[$1]++
else if(debug) print tm " not between " F " and " T

If tm is between FROM and TO, increment the C[] array. This is the crux of the counting of IP addresses.
The C[] array uses the IP address (field $1) as the index and the count as the value, so it ends up like this:

Code:
C[192.168.0.20]=208
C[203.22.200.1]=15
C[215.215.215.215]=1051

To get the top 5 by count, you could use the GNU awk ordered arrays feature and only print the first 5 records. But in this case, as it's only done after each file is processed, it is much easier and still fairly efficient to use the external unix sort and head commands like this:
Code:
for(ip in C) printf "%7d %s\n", C[ip], ip | "sort -k1,1rn | head -n 5"
close("sort -k1,1rn | head -n 5")

Sort on the first field (-k1,1) in reverse (r) numeric (n) order; head -n 5 keeps the top 5. close() ends the pipe so that sort emits its output and a fresh sort is started for the next file.
Note: these 2 lines need to be in both the END and FNR==1 blocks, as a replacement for the existing for(... line.

Last edited by Chubler_XL; 07-08-2015 at 12:57 PM.. Reason: Added close() for sort+head command plus some cleanup after a re-read
# 35  
Old 07-08-2015
This is great! Thanks so much for doing that. This will be a great help to me and I am sure it will be a future help to readers also.