Parsing a mixed format (flatfile+xml) logfile


 
# 8  
Old 11-04-2012
Quote:
Originally Posted by Corona688
Code:
BEGIN {         OLDFS=FS="[()]" }
 
{
        for(X in XML) delete XML[X]; # Clear tags left over from the previous line
        # Save some bits, and re-split line using <
        A=$1;   B=$3;   FS="<"; $0=$2
        for(N=1; N<=NF; N++)  # Process "tagname>data" strings only.
        {
                if($N == "")                    continue;
                if(substr($N,1,1) == "/")       continue; # Ignore close-tags
                if(split($N, ARR, ">") == 2)    XML[ARR[1]]=ARR[2];
        }
 
        # XML["event_n"] would be "blah" for example.
        for(X in XML) print X, XML[X];
 
        # Split on whitespace, dashes, and colons, and process the rest.
        FS="[ \r\n\t:-]+";      $0=A" "B
        # ...now available in $1 ... $N.
        print $1, $2, $3, $4, $5, $6, $7, $8
        FS=OLDFS        # So the next line splits on  ()
}

Code:
$ awk -f xml.awk datafile
 
$

Hi Corona,

I decided to give this a try, as it is much more elegant than what I am doing.

On executing the script, I am getting the error below:
Code:
bash-3.00$ awk -f vtest.awk datafile > 1234
awk: XML is not an array
 record number 1

The contents of vtest.awk are:

Code:
BEGIN {         OLDFS=FS="[()]" }
{
        for(X in XML) delete XML[X]; # Clear tags left over from the previous line
#        for(N=1; N<=4; N++) sub(/~/, "|");
        # Save some bits, and re-split line using <
        A=$1;   B=$3;   FS="<"; $0=$2
        for(N=1; N<=NF; N++)  # Process "tagname>data" strings only.
        {
                if($N == "")                    continue;
                if(substr($N,1,1) == "/")       continue; # Ignore close-tags
                if(split($N, ARR, ">") == 2)    XML[ARR[1]]=ARR[2];
        }
        for(X in XML) print X, XML[X];
        # Split on whitespace, dashes, and colons, and process the rest.
        FS="[ \r\n\t:-]+";      $0=A" "B
        # ...now available in $1 ... $N.
        print $1, $2, $3, $4, $5, $6, $7, $8
        FS=OLDFS        # So the next line splits on  ()
}
END{
print NR,"Records Processed";
}

Can you please help me see what I am doing wrong?
# 9  
Old 11-14-2012
Sorry, I've been away at a conference and hadn't had time to catch up on these things.

Please post some of the data you ran this with.
# 10  
Old 11-15-2012
Quote:
Originally Posted by Corona688
Sorry, I've been away at a conference and hadn't had time to catch up on these things.

Please post some of the data you ran this with.
Hi Corona,

No worries. It's great that you are helping us out here.

The issue was that I was using the wrong awk. When I used /usr/xpg4/bin/awk, I got the error below:

Code:
/usr/xpg4/bin/awk: line 22 (NR=2431): Record too long (LIMIT: 19999 bytes)

The issue is that my original file can get quite big in certain cases. The data is actually read from a DB, and in some cases a context is written into the file. This means that there are multiple XML tags (written as escaped entities such as &lt; and &gt;) nested within our standard XML tags, as below. This makes some lines massive, overflowing awk's record limit.

Code:
&lt;config&gt;&lt;timeout&gt;60&lt;/timeout&gt;&lt;enable_timeout&gt;true&lt;/enable_timeout&gt;&lt;overview&gt;false&lt;/overview&gt;&lt;
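If the nested context ever needs to be parsed rather than skipped, the escaped tags can be turned back into ordinary ones before splitting on "<". This is only a rough sketch: it assumes the context uses nothing beyond the &lt;, &gt; and &amp; entities, and both the decode_entities name and the sample line are made up for illustration.

```shell
# Hypothetical helper: decode the three basic XML entities on each input line.
decode_entities() {
        awk '{
                gsub(/&lt;/, "<")       # escaped open bracket
                gsub(/&gt;/, ">")       # escaped close bracket
                gsub(/&amp;/, "\\&")    # done last; "\\&" yields a literal ampersand
                print
        }'
}

# Invented sample line in the same style as the snippet above.
printf '%s\n' '&lt;config&gt;&lt;timeout&gt;60&lt;/timeout&gt;&lt;/config&gt;' | decode_entities
# prints: <config><timeout>60</timeout></config>
```

Decoding does not shorten the records by much, so this does nothing for the record length limit itself.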


Last edited by goddevil; 11-15-2012 at 09:20 PM..
# 11  
Old 11-15-2012
...and if you'd posted actual data when asked I could have even warned you about that.

You will need GNU awk to handle lines that huge.
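To see which records are over the limit before switching, something like the following might help. It is a sketch only: the long_records helper is hypothetical, the 19999 figure is taken from the error message above, and it assumes gawk is on the PATH (it falls back to plain awk otherwise, which may hit the same limit on huge lines).

```shell
# Pick gawk when it is installed, otherwise whatever "awk" resolves to.
AWK=${AWK:-$(command -v gawk || command -v awk)}

# Hypothetical helper: print "record-number length" for every record
# longer than LIMIT bytes.  usage: long_records LIMIT FILE
long_records() {
        LC_ALL=C "$AWK" -v limit="$1" 'length($0) > limit { print NR, length($0) }' "$2"
}

# Self-contained demo: one short line and one 25000-byte line.
tmp=$(mktemp)
printf 'short line\n' > "$tmp"
"$AWK" 'BEGIN { while (i++ < 25000) printf "x"; print "" }' >> "$tmp"
long_records 19999 "$tmp"       # prints: 2 25000
rm -f "$tmp"
```

On the real file that would be long_records 19999 datafile, and the script itself would then be run as gawk -f vtest.awk datafile.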
# 12  
Old 11-15-2012
Quote:
Originally Posted by Corona688
...and if you'd posted actual data when asked I could have even warned you about that.

You will need GNU awk to handle lines that huge.
Of course, Corona. I am afraid that I didn't have the liberty to post the actual data. That said, I should've taken the time to post a better example so as not to waste your time. I will make sure to do so in the future.