Performance of log parsing shell script very slow
Hello,
I am an absolute newbie, and everything in the shell script below was built with generous help from googling and from this forum, so please forgive any schoolboy mistakes. Now to the question. My input file looks like this:

2009:04:03 08:21:41:513,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,Start - SvcName: OIS - EndUserName: - RequestId: null20090403082313
2009:04:03 08:21:41:775,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,End - SvcName: OIS - EndUserName: - RequestId: null20090403082313 - StatusCode: 0 - StatusText: Success
2009:04:03 08:21:45:660,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,Start - SvcName: VCC - EndUserName: - RequestId: 411111111111111120090403082318
2009:04:03 08:21:46:171,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,End - SvcName: VCC - EndUserName: - RequestId: 411111111111111120090403082318 - StatusCode: 0 - StatusText: Success
2009:04:03 08:21:49:583,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,Start - SvcName: CO - EndUserName: - RequestId: 20090403082321
2009:04:03 08:22:03:571,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,End - SvcName: CO - EndUserName: - RequestId: 20090403082321 - StatusCode: 0 - StatusText: Success

From this I have to find the response time (end time minus start time) for each request id for the SvcName CO. I have written the shell script below (not finished yet), but it is very slow: it takes about a minute to process a 100-line file. Can you please point me in the right direction to improve the performance?

Although it is not the exact subject of this post, any pointers on calculating the difference between start and end time would be very helpful.

Code:
#!/bin/bash
# Script to collect response timings for CO calls from a log file
#////////////////////////////////////
# Exactly one command line argument is required
if [ $# -ne 1 ]
then
    echo "Oops......Usage is wrong. $0 <tgtsrchfile>" 1>&2
    exit 2
fi
# Assign command line params to variables
srchFN=$1
# Remove resptime.log and the temp files if already present
checkfile="./resptime.log"
tempfile="./temp.log"
tempfile1="./temp1.log"
rm -f "$checkfile" "$tempfile" "$tempfile1"
# Stop if the target search file is not readable
if [ ! -r "$srchFN" ]; then
    echo "Target search file $srchFN not present"
    exit 2
fi
# grep for request id
grep 'Start.*CO' "$srchFN" | awk -F "RequestId: " '{print $2}' > "$tempfile"
# For each request id get start time and end time and print into temp file
while read -r line; do
    # Skip empty request ids
    if [ -n "$line" ]; then
        sttime=`grep "Start.*CO.*$line" "$srchFN" | awk -F "," '{print $1}' | awk -F " " '{print $2}'`
        endtime=`grep "End.*CO.*$line" "$srchFN" | awk -F "," '{print $1}' | awk -F " " '{print $2}'`
        if [ -n "$sttime" -o -n "$endtime" ]; then
            echo "$line,$sttime,$endtime" >> "$tempfile1"
        fi
    fi
done < "$tempfile"
#/////////////////////////////////////
Last edited by jim mcnamara; 04-08-2009 at 10:51 AM. Reason: code tags
My input file has one line for the start of a request and one for the end of the request. And there are hundreds of unique requests. So I am struggling to think of how else I can get the start and end time with one grep, for each req id...
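One possible way around a grep per request id is a single awk pass that remembers each Start time in an array keyed by request id and prints a row when the matching End line turns up. This is only a sketch: `co_times` is a made-up function name, and the field positions (timestamp in field 1, message in field 10) are assumed from the sample lines above.

```shell
#!/bin/bash
# co_times FILE (hypothetical helper): print "requestid,start,end" for each
# CO request, reading the log once instead of once per request id.
co_times() {
    awk -F, '
        # Field 10 holds the "Start - SvcName: ..." or "End - SvcName: ..." text.
        $10 ~ /SvcName: CO / {
            split($1, ts, " ")        # ts[2] is the time of day
            n = split($10, msg, " ")  # msg[1] is "Start" or "End"
            id = ""
            for (i = 1; i < n; i++)   # the id is the word after "RequestId:"
                if (msg[i] == "RequestId:") id = msg[i + 1]
            if (id == "") next
            if (msg[1] == "Start") {
                start[id] = ts[2]
            } else if (msg[1] == "End" && (id in start)) {
                print id "," start[id] "," ts[2]
                delete start[id]
            }
        }
    ' "$1"
}
```

Calling `co_times yourlogfile > temp1.log` would give the same kind of id,start,end rows the original loop writes, without starting new grep and awk processes for every request.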
something like this? substitute 'd' for your log file:
Code:
# Needs ksh: "command | read var" sets var in the current shell there,
# but not in bash by default.
cat d |
while read -r line ; do
    #----------------------------------------------------------------------#
    # Start time and service name.                                         #
    #----------------------------------------------------------------------#
    if [[ $line = *Service,Start* ]] ; then
        echo "$line" |
            sed -e 's/^.*SvcName: //' -e 's/ - .*$//' |
            read service_name
        echo "$line" |
            sed -e 's/,INFO.*$//' |
            read start_time
    fi
    #----------------------------------------------------------------------#
    # End time.                                                            #
    #----------------------------------------------------------------------#
    if [[ $line = *Service,End*SvcName*$service_name*EndUser* ]] ; then
        echo "$line" |
            sed -e 's/,INFO.*$//' |
            read end_time
        echo "start-time $start_time end-time $end_time service-name $service_name"
    fi
done
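If you are on bash rather than ksh, the `| read` trick above will leave the variables empty, and every `echo | sed` still starts extra processes per line. A rough alternative is to pull the fields out with parameter expansion only. In the sketch below, `parse_line` and `co_events` are hypothetical names, and the line layout is assumed to match the sample log.

```shell
#!/bin/bash
# parse_line LINE (hypothetical helper): set the globals stamp and
# service_name using only parameter expansion, so no sed is started.
parse_line() {
    local line=$1
    stamp=${line%%,*}                   # text before the first comma
    service_name=${line#*SvcName: }     # drop everything up to "SvcName: "
    service_name=${service_name%% - *}  # keep the text before the next " - "
}

# co_events FILE (hypothetical helper): print "Start|End service timestamp"
# for every CO line in the file.
co_events() {
    local line kind
    while IFS= read -r line; do
        [[ $line == *"SvcName: CO "* ]] || continue
        parse_line "$line"
        case $line in
            *Service,Start*) kind=Start ;;
            *Service,End*)   kind=End ;;
            *)               continue ;;
        esac
        printf '%s %s %s\n' "$kind" "$service_name" "$stamp"
    done < "$1"
}
```

The loop still reads the file line by line in the shell, so for large logs an awk or Perl pass is likely to be faster, but it avoids the per-line forks.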
I would use Perl to calculate the date and time differences (using an external module Date::Manip):
Code:
#!/usr/bin/perl
use warnings;
use strict;
use Date::Manip;

my $infile = 'file';
open my $fh, '<', $infile or die "$infile: $!";

my ( $sdate, $edate, $err );
while (<$fh>) {
    next unless /SvcName: CO/;
    # Take the timestamp before the first comma and turn
    # "2009:04:03" into "2009/04/03" so Date::Manip can parse it.
    /Start -/ and ( $sdate = ( split ',' )[0] ) =~ s|:(.*?):|/$1/|;
    /End -/   and ( $edate = ( split ',' )[0] ) =~ s|:(.*?):|/$1/|;
    if ( $sdate && $edate ) {
        print $sdate, " - ", $edate, "\n";
        print DateCalc( $sdate, $edate, \$err ), "\n";
        # Reset so each Start/End pair is reported only once.
        ( $sdate, $edate ) = ( undef, undef );
    }
}
close $fh;
Code:
% cat file
2009:04:03 08:21:41:513,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,Start - SvcName: OIS - EndUserName: - RequestId: null20090403082313
2009:04:03 08:21:41:775,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,End - SvcName: OIS - EndUserName: - RequestId: null20090403082313 - StatusCode: 0 - StatusText: Success
2009:04:03 08:21:45:660,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,Start - SvcName: VCC - EndUserName: - RequestId: 411111111111111120090403082318
2009:04:03 08:21:46:171,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,End - SvcName: VCC - EndUserName: - RequestId: 411111111111111120090403082318 - StatusCode: 0 - StatusText: Success
2009:04:03 08:21:49:583,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,Start - SvcName: CO - EndUserName: - RequestId: 20090403082321
2009:04:03 08:22:03:571,INFO ,servername,yyyy,undefined,,INFO,null,.....Out:xxService,End - SvcName: CO - EndUserName: - RequestId: 20090403082321 - StatusCode: 0 - StatusText: Success
zsh-4.3.9[t]% ./s
2009/04/03 08:21:49:583 - 2009/04/03 08:22:03:571
+0:0:0:0:0:0:14
Here is another way to solve your problem using ksh93 ....
Code:
#!/usr/bin/ksh93
LOGFILE=./logfile
SVCNAME="CO"
# awk prints: service name, request id, start date and time, end date and time.
# It assumes the End line directly follows its Start line.
awk -v str="$SVCNAME" -F, '$10 ~ "SvcName: "str {
    split($10, arr1, " ")
    date1=$1
    getline
    date2=$1
    print arr1[4], arr1[9], date1, date2
    next
}' "$LOGFILE" | while read svc id sdate stime fdate ftime
do
    # Convert each timestamp to epoch seconds with a millisecond fraction
    sdate1=$(printf '%(%s.%3N)T' "${sdate//:/-} ${stime:0:8}.${stime:9}")
    fdate1=$(printf '%(%s.%3N)T' "${fdate//:/-} ${ftime:0:8}.${ftime:9}")
    diff=$(( fdate1 - sdate1 ))
    idiff=$(( int(diff) ))
    # Fractional part of the difference, in milliseconds
    msecs=$(( 1000 * (diff - int(diff)) ))
    secs=$(( idiff % 60 ))
    mins=$(( idiff % (60 * 60) / 60 ))
    hours=$(( idiff / (60 * 60) ))
    printf "%s %s - %s %s +%02d:%02d:%02d.%03d\n" "${sdate//://}" "$stime" "${fdate//://}" "$ftime" $hours $mins $secs $msecs
done
exit 0
Code:
2009/04/03 08:21:49:583 - 2009/04/03 08:22:03:571 +00:00:13.988
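If ksh93 is not available, the time difference can also be worked out in plain bash arithmetic by converting each HH:MM:SS:mmm value to milliseconds. This is a minimal sketch with two big assumptions: both timestamps fall on the same day, and the time part always has the four colon-separated fields shown in the sample log. `ts_to_ms` and `elapsed_ms` are hypothetical names.

```shell
#!/bin/bash
# ts_to_ms HH:MM:SS:mmm (hypothetical helper): print milliseconds since midnight.
ts_to_ms() {
    local h m s ms
    IFS=: read -r h m s ms <<< "$1"
    # 10# forces base 10; without it bash rejects "08" and "09" as bad octal.
    echo $(( ((10#$h * 60 + 10#$m) * 60 + 10#$s) * 1000 + 10#$ms ))
}

# elapsed_ms START END (hypothetical helper): print END minus START in
# milliseconds. Assumes both times are on the same day.
elapsed_ms() {
    echo $(( $(ts_to_ms "$2") - $(ts_to_ms "$1") ))
}
```

For the CO request in the sample, `elapsed_ms 08:21:49:583 08:22:03:571` gives 13988, which agrees with the +00:00:13.988 reported above.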