The UNIX and Linux Forums  

  #1 (permalink)  
Old 02-03-2008
Registered User
 

Join Date: Feb 2008
Posts: 7
Shell Script syntax for XML processing

Hi All,

I am new to Shell scripting.

I have a log file containing XML messages. Each XML message is accompanied by a timestamp. I need to count the number of messages that get logged in a particular time interval. Is there any command or syntax to achieve this?

Any code/example is welcome.

Thanks
Vignesh.
  #2 (permalink)  
Old 02-03-2008
Technorati Master
 

Join Date: Mar 2005
Location: Large scale systems...
Posts: 2,547
Quote:
Any code/example is welcome.


I thought it should have been the other way.

To parse XML messages, I suggest using XML::Parser, a Perl module for parsing XML documents (search.cpan.org).

The module is quite easy, quick and efficient to use.
  #3 (permalink)  
Old 02-03-2008
Registered User
 

Join Date: Feb 2008
Posts: 7
Shell Scripting Syntax for XML Processing

Hi madhan,


Thanks for your reply. Actually, there is no need to parse my XML. Each XML message is accompanied by a timestamp value (it is not an XML field). I just need to count the number of messages within a particular time interval using shell scripting.


My log looks like this:

2008-01-27 00:05:00 (2008-01-27 00:05:00.055000000Z): message={Data="<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<Header>
....
....
</Header>
<Body>
....
....
</Body>
</Envelope>"
2008-01-27 00:05:12 (2008-01-27 00:05:12.055000000Z): message={Data="<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<Header>
....
....
</Header>
<Body>
....
....
</Body>
</Envelope>"

Thanks
Vignesh
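
For reference, each message in the sample above starts with a header line that carries a second-resolution timestamp followed by a higher-resolution one in parentheses. A minimal sketch in Python of splitting such a line, assuming every header line follows exactly that layout (the name `parse_header` and the regular expression are illustrative, not taken from the thread):

```python
import re
from datetime import datetime

# Assumed layout of a header line, based on the sample log:
#   YYYY-MM-DD HH:MM:SS (YYYY-MM-DD HH:MM:SS.fffffffffZ): message={...
HEADER_RE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"\((\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2})\.(\d+)Z\):"
)

def parse_header(line):
    """Return (leading, bracketed) datetimes, or None for non-header lines."""
    m = HEADER_RE.match(line)
    if m is None:
        return None
    fmt = "%Y-%m-%d %H:%M:%S"
    leading = datetime.strptime(m.group(1), fmt)
    # Keep at most microsecond precision; datetime cannot hold nanoseconds.
    micros = int(m.group(3)[:6].ljust(6, "0"))
    bracketed = datetime.strptime(m.group(2), fmt).replace(microsecond=micros)
    return leading, bracketed

sample = ('2008-01-27 00:05:12 (2008-01-27 00:05:12.055000000Z): '
          'message={Data="<?xml version="1.0" encoding="UTF-8"?>')
print(parse_header(sample))        # a pair of datetime objects
print(parse_header("<Envelope>"))  # None: not a header line
```

Only header lines match, so the XML body never has to be parsed to count messages.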
  #4 (permalink)  
Old 02-04-2008
rikxik
Registered User
 

Join Date: Dec 2007
Posts: 104
This can be done in Python (it works to a resolution of seconds, not milliseconds, and also works across days):

LOG
Code:
2008-01-27 00:05:00 (2008-01-27 00:05:00.055000000Z): message={Data="<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<Header>
....
....
</Header>
<Body>
....
....
</Body>
</Envelope>
2008-01-27 00:05:12 (2008-01-27 00:05:12.055000000Z): message={Data="<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<Header>
....
....
</Header>
<Body>
....
....
</Body>
</Envelope>
2008-01-27 00:05:12 (2008-01-27 00:06:10.055000000Z): message={Data="<?xml version="1.0" encoding="UTF-8"?>
<Envelope>
<Header>
....
....
</Header>
<Body>
....
....
</Body>
</Envelope>
Python script:
Code:
import re
import sys
import time

#---------------------------------------------------------
# Variables and defs
#---------------------------------------------------------

# A file name and both dates are required; otherwise show usage and stop.
if len(sys.argv) < 4:
    print("Usage : ", sys.argv[0], "<file_name>", "from_date(in_double_quotes)", "to_date(in_double_quotes)")
    sys.exit(1)

[file, frm, to] = sys.argv[1:4]
dtfmt = "%Y-%m-%d %H:%M:%S"
patstr = "^[0-9]{4}-[0-9]{2}-[0-9]{2} "

def mk_date(dstr, dpat):
    # Convert a timestamp string to seconds since the epoch (local time).
    return time.mktime(time.strptime(dstr, dpat))

fobj = mk_date(frm, dtfmt)
tobj = mk_date(to, dtfmt)
datepat = re.compile(patstr)

#---------------------------------------------------------
# main part
#---------------------------------------------------------
count = 0
with open(file, 'r') as fh:
    for line in fh:
        # Only message header lines start with a date.
        if datepat.match(line):
            # Take the timestamp inside the parentheses, truncated to seconds.
            dt = line.split('(')[1].split(')')[0][0:19]
            dobj = mk_date(dt, dtfmt)
            # Both bounds are exclusive.
            if tobj > dobj > fobj:
                count += 1
print("Total ocurrances between ", frm, " and ", to, ":", count)
Usage:
Code:
C:\>lr.py
Usage :  C:\lr.py <file_name> from_date(in_double_quotes) to_date(in_double_quotes)

C:\>lr.py log.xml "2008-01-27 00:04:59" "2008-01-27 00:05:13"
Total ocurrances between  2008-01-27 00:04:59  and  2008-01-27 00:05:13 : 2
HTH
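
If counts are needed for many intervals at once rather than for one range, the same idea extends to bucketing. The following is only a sketch under stated assumptions: it groups header lines by the minute of their leading timestamp (the script above reads the one in parentheses instead), the name `count_per_minute` is made up for illustration, and the log lines are hypothetical ones modelled on the sample.

```python
from collections import Counter
import re

# Captures the leading "YYYY-MM-DD HH:MM" of a header line (assumed layout).
MINUTE_RE = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}):\d{2} \(")

def count_per_minute(lines):
    """Count header lines per minute, keyed by 'YYYY-MM-DD HH:MM'."""
    counts = Counter()
    for line in lines:
        m = MINUTE_RE.match(line)
        if m:
            counts[m.group(1)] += 1
    return counts

# Hypothetical lines modelled on the sample log.
log_lines = [
    '2008-01-27 00:05:00 (2008-01-27 00:05:00.055000000Z): message={Data="...',
    "<Envelope>",
    "</Envelope>",
    '2008-01-27 00:05:12 (2008-01-27 00:05:12.055000000Z): message={Data="...',
    '2008-01-27 00:06:10 (2008-01-27 00:06:10.055000000Z): message={Data="...',
]
for minute, n in sorted(count_per_minute(log_lines).items()):
    print(minute, n)
```

Changing the captured prefix (for example, stopping at the hour) would change the bucket size.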
  #5 (permalink)  
Old 02-04-2008
Registered User
 

Join Date: Feb 2008
Posts: 7
Thanks for your help.
  #6 (permalink)  
Old 02-04-2008
Registered User
 

Join Date: Jan 2008
Posts: 306
A Perl solution:

Code:
#!/usr/bin/perl
use strict;
use warnings;
my ($start,$end) = @ARGV;
# Both arguments must be present and look like "YYYY-MM-DD HH:MM:SS".
unless (@ARGV == 2 &&
        $start =~ /^\d{4}-\d\d-\d\d\s+\d\d:\d\d:\d\d$/ &&
        $end   =~ /^\d{4}-\d\d-\d\d\s+\d\d:\d\d:\d\d$/) {
   die qq{Usage: perl path/to/count.pl "start" "end"
start and end = "YYYY-MM-DD HH:MM:SS" including quotes};
}
my $count = 0;
# Strip the separators so timestamps compare as fixed-width digit strings.
(my $s = $start) =~ tr/- ://d;
(my $e = $end)   =~ tr/- ://d;
open (my $in, '<', 'path/to/input_file') or die "$!";
while(<$in>){
    # Use the leading timestamp of each message header line.
    if (/^(\d\d\d\d)-(\d\d)-(\d\d)\s+(\d\d):(\d\d):(\d\d)/) {
        # Both bounds are inclusive.
        $count++ if ($e ge "$1$2$3$4$5$6" && "$1$2$3$4$5$6" ge $s);
    }
}
close $in;
print "Number of messages found between $start and $end: $count\n";
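
The Perl script above avoids date arithmetic: it strips the separators and compares the remaining digit strings with `ge`. That works because fixed-width, zero-padded `YYYY-MM-DD HH:MM:SS` timestamps sort the same way as strings and as points in time. A rough Python equivalent, assuming a single space between date and time and the same header layout (the name `count_between` and the sample lines are illustrative only):

```python
import re

# Leading timestamp of a header line, e.g. "2008-01-27 00:05:00".
STAMP_RE = re.compile(r"^\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}")

def count_between(lines, start, end):
    """Count lines whose leading timestamp lies in [start, end], inclusive.

    Plain string comparison is enough because every field is zero-padded
    and ordered from most to least significant.
    """
    count = 0
    for line in lines:
        m = STAMP_RE.match(line)
        if m and start <= m.group(0) <= end:
            count += 1
    return count

# Hypothetical lines modelled on the sample log.
lines = [
    '2008-01-27 00:05:00 (2008-01-27 00:05:00.055000000Z): message={Data="...',
    "<Body>",
    '2008-01-27 00:05:12 (2008-01-27 00:05:12.055000000Z): message={Data="...',
]
print(count_between(lines, "2008-01-27 00:05:00", "2008-01-27 00:05:12"))
```

Unlike the Python script earlier in the thread, both bounds are inclusive here, matching the Perl version.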