Slow performance filtering file


 
# 1  
Old 02-10-2011

Please, I need help tuning my script. It works, but it's too slow.
The code reads an activity log file with 50,000 to 100,000 lines and filters error messages from it. The data in the actlog file looks similar to this:
Code:
02/08/2011 00:25:01,ANR2034E QUERY MOUNT: No match found using this criteria. (SESSION: 211433)
or
02/08/2011 01:05:50,"ANE4991I (Session: 211857, Node: xxx) blablabla (SESSION: 211857)"

I was trying to do the same with awk but it seems it's over my head. Any ideas will be appreciated. Thanks!

here is the code:
Code:
while read LINE
do
    A=$(echo ${LINE} | cut -c 21)
    if [[ ${A} == "\"" ]];then  
        MSGDATE=$(echo ${LINE} | cut -c 1-10)
        MSGTIME=$(echo ${LINE} | cut -c 12-19)
        PREFIX=$(echo ${LINE} | cut -c 22-24)
        MSGNUM=$(echo ${LINE} | cut -c 25-28)
        MSGTYPE=$(echo ${LINE} | cut -c 29)
        MESSAGE=$(echo ${LINE} | cut -c 30-)
        if [[ ${MSGTYPE} == "E" ]];then
            echo "${MSGDATE},${MSGTIME},${PREFIX}${MSGNUM}${MSGTYPE},${MESSAGE}" >> ${TMPFILE}
        fi
    else  
        MSGDATE=$(echo ${LINE} | cut -c 1-10)
        MSGTIME=$(echo ${LINE} | cut -c 12-19)
        PREFIX=$(echo ${LINE} | cut -c 21-23)
        MSGNUM=$(echo ${LINE} | cut -c 24-27)
        MSGTYPE=$(echo ${LINE} | cut -c 28)
        MESSAGE=$(echo ${LINE} | cut -c 29-)
        if [[ ${MSGTYPE} == "E" ]];then
            echo "${MSGDATE},${MSGTIME},${PREFIX}${MSGNUM}${MSGTYPE},${MESSAGE}" >> ${TMPFILE}
        fi
    fi
done <${ACTLOGFILE}

# 2  
Old 02-10-2011
You posted sample input, but no output. Please show us what you want the output to look like for those lines.
# 3  
Old 02-10-2011
The output is comma separated text and looks like this:
Code:
02/10/2011,06:00:02,ANR2034E, QUERY MOUNT: No match found using this criteria. (SESSION: 235950)

# 4  
Old 02-11-2011
use the right tool

I have to admit I'm pretty turned off by this kind of shell script: it spawns far too many trivial processes (seven echo | cut pipelines for every input line) to do minor processing. For instance, in Perl:

Code:
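#!/usr/bin/perl
use strict;
use warnings;
while (<>) {
    # Split into date, time, and everything after the first comma; skip malformed lines.
    my ($date,$time,$rest) = /^(.{10}) (.{8}),(.+)$/ or next;
    # Strip the surrounding double quotes, if present.
    $rest = $1 if ($rest =~ /^"(.+)"$/);
    # Message ID: 3-character prefix, 4-character number, 1-character type.
    my ($prefix,$num,$type,$message) = ($rest =~ /^(...)(....)(.) (.+)$/) or next;
    # Keep the ID in one field, as in the requested output format.
    print "$date,$time,$prefix$num$type, $message\n" if ($type eq 'E');
}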

This could be cut down to about 3 lines by using perl -n and folding the 'E' test into the pattern match.

To me, shell is OK for command-y stuff (not data processing). I turn to awk mainly for quick processing of simple, delimited tables on the command line. For anything more than trivial (real parsing, any math), I use Perl.
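If the filter has to stay in shell, most of the cost can probably be avoided by replacing the echo | cut pipelines with parameter expansion, which runs inside the shell and starts no extra processes. The following is a minimal sketch under that assumption, not a drop-in replacement for the original script: the function name filter_errors_sh is hypothetical, it takes the message ID to be the first word after the timestamp, and unlike the original loop it also drops the closing quote on quoted lines. awk or Perl would likely still be faster on large logs.

```shell
# Hypothetical helper: reads the activity log on stdin and writes the
# error ("E") messages to stdout, using only shell built-ins.
filter_errors_sh() {
    while IFS= read -r line; do
        stamp=${line%%,*}        # date and time, e.g. "02/08/2011 00:25:01"
        rest=${line#*,}          # everything after the first comma
        case $rest in
            \"*\") rest=${rest#\"}; rest=${rest%\"} ;;   # drop surrounding quotes
        esac
        id=${rest%% *}           # message ID, e.g. "ANR2034E"
        case $id in
            *E) # date, time, ID, then the message with its leading space
                printf '%s,%s,%s,%s\n' \
                    "${stamp%% *}" "${stamp#* }" "$id" "${rest#"$id"}" ;;
        esac
    done
}
```

With the variables from the original script, it would be used as filter_errors_sh < "${ACTLOGFILE}" >> "${TMPFILE}".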
# 5  
Old 02-11-2011
Code:
awk '$2~/E$/ {sub(/"/,"",$2); $2=", "$2", ";print}' ${ACTLOGFILE} >${TMPFILE}
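For readers who need exactly the comma-separated layout shown in post #3, here is a more explicit awk sketch. It is not rdcwayx's one-liner but a variant that splits each line on its first comma rather than on whitespace, so spacing inside the message is preserved and the surrounding quotes are dropped. The function name filter_errors_awk is hypothetical, and the sketch assumes the message ID is the first word after the timestamp.

```shell
# Hypothetical helper: reads log lines from the named files (or stdin)
# and prints only the error ("E") messages as comma-separated text.
filter_errors_awk() {
    awk '{
        comma = index($0, ",")              # first comma ends the timestamp
        if (comma == 0) next                # skip lines without one
        stamp = substr($0, 1, comma - 1)    # "02/08/2011 00:25:01"
        rest  = substr($0, comma + 1)       # message ID plus message text
        if (rest ~ /^".*"$/)                # drop surrounding quotes
            rest = substr(rest, 2, length(rest) - 2)
        sp = index(rest, " ")               # first space ends the message ID
        if (sp == 0) next
        id = substr(rest, 1, sp - 1)        # e.g. "ANR2034E"
        if (id !~ /E$/) next                # keep error messages only
        sub(/ /, ",", stamp)                # date,time
        print stamp "," id "," substr(rest, sp)
    }' "$@"
}
```

A call such as filter_errors_awk "${ACTLOGFILE}" > "${TMPFILE}" would then replace the whole while loop.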

# 6  
Old 02-11-2011
Quote:
Originally Posted by rdcwayx
Code:
awk '$2~/E$/ {sub(/"/,"",$2); $2=", "$2", ";print}' ${ACTLOGFILE} >${TMPFILE}

That worked great. Also the script's performance went up significantly. Thanks!