More complicated log parsing


 
# 8  
Old 06-06-2007
Code:
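awk '{
         # skip lines that do not contain an XML declaration
         b = match($0, "xml")
         if ( b == 0 ) next
         # back up two characters so the text starts at "<?xml"
         xml = substr($0, b-2)
         # the 5th space-separated field looks like |02-20070605-510669||>>
         split($0, line, " ")
         split(line[5], file, "|")
         if ( file[2] ~ /^[0-9]/ ) {
             filename = file[2]
             print xml > filename
         }
      }
' "file"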

# 9  
Old 06-06-2007
Thanks ghostdog,

It is interesting to see the different methods everyone has to tackle the same parse.

Ghostdog: I tried extending the if statement so that only lines containing '[HandleRequest] AService' are output (giving one XML file per 01/02-YYYYMMDD-XXXXXX), but I could not get it to work.
Could you please post this addition?


Jean-Pierre: Your awk script is very short and seems efficient, but once I add '&& /[HandleRequest] AService/' I get no output at all. I have tried on many different servers: your original script creates the output files, but whatever I try with the second one, none are created. Do you have any suggestions? It is strange that it works for you but not for me. Is there an awk log or hidden verbosity I could enable to trace the activity? Also, how are you getting both AService lines in the output? One is a request and one is a reply, so one of them should definitely not pass your filters.
# 10  
Old 06-07-2007
Quote:
Originally Posted by sjug
Ghostdog: I tried extending the if statement so that only lines containing '[HandleRequest] AService' are output (giving one XML file per 01/02-YYYYMMDD-XXXXXX), but I could not get it to work.
Could you please post this addition?
Code:
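awk '/HandleRequest.*AService/{
         # only AService request lines reach this block
         b = match($0, "xml")
         if ( b == 0 ) next
         xml = substr($0, b-2)       # start at "<?xml"
         print $0                    # echo the matching log line to stdout
         split($0, line, " ")
         split(line[5], file, "|")   # file[2] is the id, e.g. 02-20070605-510669
         if ( file[2] ~ /^[0-9]/ ) {
             filename = file[2]
             print xml > filename
         }
      }
' "file"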

# 11  
Old 06-07-2007
Code:
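#!/usr/bin/awk -f
# Awk script: extract.awk

BEGIN {
   FS = "|";
}
# Field 2 is the id (02-YYYYMMDD-XXXXXX); the brackets are escaped so they match literally
$2 ~ /^02-[0-9]+-[0-9]+$/ && /\[HandleRequest\] AService/ {
   if (file && file != $2) close(file);   # keep only one output file open at a time
   file = $2;
   sub(/^.*.Service/, "", $0);            # strip everything up to and including the service name
   print $0 >> file;
}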

Jean-Pierre.
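
A likely reason the unescaped filter from post #9 produced nothing: in a regular expression, [HandleRequest] is a bracket expression that matches one single character from the set H, a, n, d, l, e, R, q, u, s, t. So /[HandleRequest] AService/ looks for one of those characters, then a space, then AService. In these log lines the character before the space is "]", which is not in the set, so nothing matches. A small sketch using one line from the sample log (the variable names are only for illustration):

```shell
line='2007-06-05 14:11:22,997 INFO  External- |02-20070605-510669||>> [HandleRequest] AService<?xml version="1.0" encoding="utf-8" standalone="yes"?>'

# Unescaped: [HandleRequest] is a bracket expression (one character from the set)
unescaped=$(printf '%s\n' "$line" | awk '/[HandleRequest] AService/ { n++ } END { print n+0 }')

# Escaped: \[ and \] match the literal square brackets
escaped=$(printf '%s\n' "$line" | awk '/\[HandleRequest\] AService/ { n++ } END { print n+0 }')

echo "unescaped matches: $unescaped"
echo "escaped matches:   $escaped"
```

The escaped form also answers the request/reply question: the reply line contains [HandleResponse], so it does not pass this filter.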
# 12  
Old 06-07-2007
Quote:
Originally Posted by sjug
Thanks for the amazingly quick response.

As you said, untested, but so far good start.
Two main issues
First, it doesn't quite work. I neglected to mention that I only want to get what is after the HandleRequest for AService ([HandleRequest] AService).
It does make files in the current state, but selectively, and not the correct handle/service.

Second is the millions of other files that are created, named after random tags that sit on their own lines. All of these garbage files start with "<". What would the if statement in the brackets look like (if [ $filename != <* ] ?)?
A small modification:

Code:
echo "$line" | sed 's/^.*<?\(.*\)/<?\1/' >> "$filename"

Code:
>cat input
2007-06-05 14:11:09,570 INFO  External- |02-20070605-510669||>> [HandleRequest] LService<?xml version="1.0" encoding="utf-8" standalone="yes"?>
2007-06-05 14:11:12,752 INFO  External- |02-20070605-510669||<< [HandleResponse] LService<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
2007-06-05 14:11:22,997 INFO  External- |02-20070605-510669||>> [HandleRequest] AService<?xml version="1.0" encoding="utf-8" standalone="yes"?>
2007-06-05 14:11:38,191 INFO  External- |02-20070605-510669||<< [HandleResponse] AService<?xml version="1.0" encoding="UTF-8" standalone="yes"?>

Code:
>running the script
filename is 02-20070605-510669
filename is 02-20070605-510669
filename is 02-20070605-510669
filename is 02-20070605-510669


Code:
>cat 02-20070605-510669
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<?xml version="1.0" encoding="utf-8" standalone="yes"?>
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
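
On the quoted question about skipping file names that start with "<": the original shell loop is not shown in this thread, so the following is only a sketch. It assumes the name is taken as the second |-separated field (the cut call is that assumption) and uses a case pattern instead of [ ... ], since [ does not do wildcard matching. The input file and its second line are made up for the demonstration.

```shell
cd "$(mktemp -d)"   # scratch directory for the demo

# one real-looking log line plus one hypothetical tag-only line
printf '%s\n' \
  '2007-06-05 14:11:22,997 INFO  External- |02-20070605-510669||>> [HandleRequest] AService<?xml version="1.0"?>' \
  '<SomeTag>' > input

while IFS= read -r line; do
    # assumption: the file name is the second |-separated field;
    # cut returns the whole line when there is no "|" in it
    filename=$(printf '%s\n' "$line" | cut -d'|' -f2)
    case "$filename" in
        ''|'<'*) continue ;;   # skip empty names and names starting with "<"
    esac
    printf '%s\n' "$line" | sed 's/^.*<?\(.*\)/<?\1/' >> "$filename"
done < input
```

Only 02-20070605-510669 is created here; the tag-only line is skipped rather than turned into a file.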

# 13  
Old 06-07-2007
Thanks for all of your help ghostdog, Jean-Pierre, and matrixmadhan!

ghostdog: I have adjusted your second edit and made it into a script, and it works with fantastic results on my very large log files.
How would I incorporate an if ( ! -f filename) check, as well as filing by date (01-YYYYMMDD-XXXXX into the corresponding YYYYMMDD directory), into the awk script?


Jean-Pierre: Your latest edit also works very well on the most complicated sample log I have provided; however, my actual logs contain more garbage, which causes your script not to work. I have not been able to isolate what causes it to stop working. Also, when a line carries the full, very long XML after the service name, the output is strangely truncated. I would like to send you some additional logs if you would not mind continuing to help me.

matrixmadhan: Your latest edit does work with my initial simple log, but it also stops functioning on more difficult log files, such as the one I posted in response to Jean-Pierre's early script. Additionally, I only need to see the Request from AService.
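
One possible cause of the truncation with long XML (a guess, since the real logs are not shown): in sub(/^.*.Service/, "", $0) the .* is greedy, so it removes everything up to the last "Service" on the line. If the XML body itself contains that text, most of the XML is cut away. The sketch below uses a made-up XML body to show the effect, and cuts after the first occurrence of the tag with index() instead.

```shell
# Hypothetical line: the XML body repeats the text "AService"
line='2007-06-05 14:11:22,997 INFO  External- |02-20070605-510669||>> [HandleRequest] AService<?xml version="1.0"?><call name="AService"/>'

# Greedy: .* runs to the LAST "Service" on the line
greedy=$(printf '%s\n' "$line" | awk '{ sub(/^.*.Service/, ""); print }')

# First occurrence only: locate the tag with index() and keep what follows it
first=$(printf '%s\n' "$line" | awk '{
    tag = "[HandleRequest] AService"     # plain string, so no escaping needed
    p = index($0, tag)
    if (p) print substr($0, p + length(tag))
}')

echo "greedy: $greedy"
echo "first:  $first"
```

With the greedy pattern only the tail after the last "AService" survives, while the index() version keeps the whole XML.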
# 14  
Old 06-07-2007
Quote:
Originally Posted by sjug
How would I incorporate an if ( ! -f filename) check, as well as filing by date (01-YYYYMMDD-XXXXX into the corresponding YYYYMMDD directory), into the awk script?
Erm, sorry, I don't understand. Care to elaborate?
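
One possible reading of the question, sketched under assumptions: write each id into a directory named after its YYYYMMDD part, and leave alone any file that already existed before the run. awk has no -f test, so the hypothetical exists() helper below tries to read the file with getline, which returns -1 when the file cannot be opened. The directory layout, the helper, and the skip array are illustrative choices, not something confirmed in the thread.

```shell
cd "$(mktemp -d)"   # scratch directory for the demo

# one sample line from the thread
cat > input <<'EOF'
2007-06-05 14:11:22,997 INFO  External- |02-20070605-510669||>> [HandleRequest] AService<?xml version="1.0" encoding="utf-8" standalone="yes"?>
EOF

awk -F'|' '
# true if path can be opened for reading, i.e. it already exists
function exists(path,    tmp, rc) {
    rc = (getline tmp < path)
    close(path)
    return rc >= 0
}
$2 ~ /^0[12]-[0-9]+-[0-9]+$/ && /\[HandleRequest\] AService/ {
    split($2, part, "-")            # part[2] is YYYYMMDD
    dir  = part[2]
    path = dir "/" $2
    if (!(path in skip)) {          # decide once per output file
        skip[path] = exists(path)
        if (!skip[path]) system("mkdir -p " dir)
    }
    if (skip[path]) next            # file was already there before this run
    p = index($0, "<?xml")
    if (p) { print substr($0, p) >> path; close(path) }
}' input
```

Running it a second time on the same input leaves 20070605/02-20070605-510669 untouched, because the file now exists.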