Parsing a mixed format (flatfile+xml) logfile
Post 302724971 by Corona688 on Thursday 1st of November 2012 01:52:25 PM
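First I should note that setting FS to a regex like this isn't supported by every awk. POSIX specifies it, and GNU awk, mawk, and the BSD awk all handle it, but some old versions (the original /usr/bin/awk on Solaris, for example) can't do that.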

For a really complicated line like this, you can change FS on the fly and re-split the line by assigning a new value to $0. You could do this with arrays and split() but it gets ugly to nest that too much.
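Here's a tiny sketch of that re-split trick on its own. The input string and the function name resplit_demo are made up for illustration, not taken from your logfile:

```shell
# Hypothetical demo: change FS mid-record, then assign to $0 to force a re-split.
resplit_demo() {
    printf '%s\n' 'a b(c<d)e f' | awk '
    BEGIN { FS = "[()]" }
    {
        print NF ": " $2          # split on parentheses, $2 is the middle part
        FS = "<"; $0 = $2         # assigning to $0 re-splits it with the new FS
        print NF ": " $1 " " $2
        FS = "[()]"               # restore FS before the next record is read
    }'
}
resplit_demo
# prints:
# 3: c<d
# 2: c d
```

The old fields are gone once $0 is reassigned, so anything you still need has to be saved in a variable first.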

First I split on () to extract the XML data, then I split on < to separate the tags from each other. I extract the string data in a loop and cram it into an array.
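To show just the tag-extraction step in isolation, here's a minimal sketch. The sample XML string is an assumption about what sits between the parentheses, and extract_tags is only a name for the demo:

```shell
# Hypothetical demo: split on "<" and keep only "tagname>data" pieces.
extract_tags() {
    printf '%s\n' '<event_n>blah</event_n><time>1347270053954</time>' | awk -F'<' '
    {
        for (N = 1; N <= NF; N++) {
            if ($N == "") continue                     # text before the first tag
            if (substr($N, 1, 1) == "/") continue      # ignore close-tags
            if (split($N, ARR, ">") == 2) XML[ARR[1]] = ARR[2]
        }
        print "event_n=" XML["event_n"]
        print "time=" XML["time"]
    }'
}
extract_tags
# prints:
# event_n=blah
# time=1347270053954
```

This only copes with flat, attribute-free tags like the ones above. Nested elements or attributes would need a real XML parser.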

Then I cram all the data that wasn't processed before back into $0 and split it on whitespace, dashes, and colons.
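A rough sketch of that last split, using a made-up sample line (I'm guessing at what the non-XML part of your log looks like) and a hypothetical function name:

```shell
# Hypothetical demo: runs of whitespace, colons, and dashes act as one separator.
split_rest() {
    printf '%s\n' '2012/09/10 12:18:18 - username@192.168.1.1' | awk '
    {
        FS = "[ \r\n\t:-]+"       # the dash goes last so it is taken literally
        $0 = $0                   # re-split the current line with the new FS
        print NF
        print $1, $2, $3, $4, $5
    }'
}
split_rest
# prints:
# 5
# 2012/09/10 12 18 18 username@192.168.1.1
```

Because of the + in the pattern, " - " counts as a single separator and no empty fields show up.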

Lastly I set FS back to [()] to get ready for the next line.

Not a complete solution since it's not clear where all your data is coming from, but should be enough for you to fill in the blanks:

Code:
BEGIN {         OLDFS=FS="[()]" }

{
        for(X in XML) delete XML[X];    # Clear tags left over from the previous line
        # Save some bits, and re-split line using <
        A=$1;   B=$3;   FS="<"; $0=$2
        for(N=1; N<=NF; N++)  # Process "tagname>data" strings only.
        {
                if($N == "")                    continue;
                if(substr($N,1,1) == "/")       continue; # Ignore close-tags
                if(split($N, ARR, ">") == 2)    XML[ARR[1]]=ARR[2];
        }

        # XML["event_n"] would be "blah" for example.
        for(X in XML) print X, XML[X];

        # Split on whitespace, dashes, and colons, and process the rest.
        FS="[ \r\n\t:-]+";      $0=A" "B
        # ...now available in $1 ... $NF.
        print $1, $2, $3, $4, $5, $6, $7, $8
        FS=OLDFS        # So the next line splits on  ()
}

Code:
$ awk -f xml.awk datafile

column username
new_val
old_val blabla
event_n blah
time 1347270053954
2012/09/10 12 18 18 username@192.168.1.1 OPERATION user succeeded

$
