XML Parsing using awk


 
# 1  
Old 11-21-2012

Hi All,

I have a problem to solve. For the following XML file, I need to extract values based on tag name, and I would prefer to do this with awk. So far I have used sed to strip the tags (e.g. s/<SeqNo>//).

New tags may be introduced later, so the parsing has to be driven by tag name rather than by position. Any awk suggestions?
Code:
<Target>
    <SeqNo>43156489079</SeqNo>
    <AuthenticationToken><![CDATA[nY+sHZ2PrBmdj6wVnY]]></AuthenticationToken>
    <redcode>SKNEQGGEVHW</redcode>
    <GenError>Upload-Success</GenError>
</Target>
<Target>
    <SeqNo>43156489079</SeqNo>
    <AuthenticationToken><![CDATA[nY+sHZ2PrBmdj6wVnY]]></AuthenticationToken>
    <redcode>SKNEQGGEVHW</redcode>
    <GenError>Upload-Success</GenError>
</Target>


Last edited by Franklin52; 11-21-2012 at 05:49 AM.. Reason: Please use code tags for data and code samples
# 2  
Old 11-21-2012
What's the expected output?
And please wrap your code and data samples with code tags to preserve formatting.
# 3  
Old 11-21-2012
Something like this?

Code:
 
$ nawk -F"[<>]" -v pat="SeqNo" '$0~pat{print $3}' a.txt
43156489079
43156489079
$ nawk -F"[<>]" -v pat="redcode" '$0~pat{print $3}' a.txt
SKNEQGGEVHW
SKNEQGGEVHW
$ nawk -F"[<>]" -v pat="AuthenticationToken" '$0~pat{print $4}' a.txt
![CDATA[nY+sHZ2PrBmdj6wVnY]]
![CDATA[nY+sHZ2PrBmdj6wVnY]]
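
The one-liners above depend on the field position ($3 for plain elements, $4 for the CDATA element) and leave the CDATA wrapper in the output. Below is a minimal sketch that selects by tag name only and strips the wrapper. It assumes each element opens and closes on the same line and carries no attributes; `get_tag` is a hypothetical helper name, not something from the thread.

```shell
# get_tag TAG FILE: print the text of every <TAG>...</TAG> element in FILE.
# Assumption: each element sits on a single line and has no attributes.
get_tag() {
    awk -v tag="$1" '
    {
        otag = "<" tag ">"
        ctag = "</" tag ">"
        s = index($0, otag)
        if (s == 0) next                      # no opening tag on this line
        rest = substr($0, s + length(otag))   # text after the opening tag
        e = index(rest, ctag)
        if (e == 0) next                      # closing tag not on this line
        val = substr(rest, 1, e - 1)
        # Strip an optional <![CDATA[ ... ]]> wrapper (9 + 3 characters).
        if (substr(val, 1, 9) == "<![CDATA[" && substr(val, length(val) - 2) == "]]>")
            val = substr(val, 10, length(val) - 12)
        print val
    }' "$2"
}
```

With the sample from post #1 saved as a.txt, `get_tag AuthenticationToken a.txt` should print nY+sHZ2PrBmdj6wVnY once per record, and `get_tag SeqNo a.txt` the sequence numbers.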

# 4  
Old 11-21-2012
Not quite sure how you want your output. Like this?

Code:
awk -F"[<>]" '{print $5,$15,$19}' RS="</Target>\n" file
43156489079 SKNEQGGEVHW Upload-Success
43156489079 SKNEQGGEVHW Upload-Success

# 5  
Old 11-21-2012
Quote:
Originally Posted by Jotne
Not quite sure how you want your output. Like this?

Code:
awk -F"[<>]" '{print $5,$15,$19}' RS="</Target>\n" file
43156489079 SKNEQGGEVHW Upload-Success
43156489079 SKNEQGGEVHW Upload-Success

A regexp RS will not work with all awk implementations. It works in gawk and mawk, but POSIX leaves the behaviour of a multi-character RS unspecified.
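
One portable way around a regexp RS is to keep the default record separator, collect lines in a buffer, and split the buffer when the closing tag arrives. The sketch below does that. `targets_to_rows` is a hypothetical name. The field numbers 5, 15 and 19 are what the sample from post #1 yields when the AuthenticationToken line is present, and like any positional approach they shift as soon as a tag is added or reordered.

```shell
# targets_to_rows FILE: print one line per <Target> block using only POSIX awk.
targets_to_rows() {
    awk '
    { buf = buf $0 "\n" }             # append every line to the current block
    /<\/Target>/ {
        split(buf, f, /[<>]/)         # same field layout as -F"[<>]" on the block
        print f[5], f[15], f[19]      # SeqNo, redcode, GenError in the sample
        buf = ""                      # start a fresh block
    }' "$1"
}
```

Because the block is reassembled by hand, this should behave the same in nawk, mawk and gawk.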
# 6  
Old 11-21-2012
Awk XML parsing based on tags

Hi, I want to parse this file and write it out in a delimited format.
Source file:

<Target>
<SeqNo>43156489079</SeqNo>
<AuthenticationToken><![CDATA[nY+sHZ2PrBmdj6wVnY]]></AuthenticationToken>
<RedCode>SKNEQGGEVHW</RedCode>
<IncentiveGenError>Upload-Success</IncentiveGenError>
</Target>
<Target>
<SeqNo>43156489070</SeqNo>
<AuthenticationToken><![CDATA[nY+sHZ2PrBmdj6wVnY]]></AuthenticationToken>
<RedCode>SKNEQGGEVHW</RedCode>
<IncentiveGenError>Upload-Success</IncentiveGenError>
</Target>

Expected output:

43156489079 SKNEQGGEVHW Upload-Success
43156489070 SKNEQGGEVHW Upload-Success

The order of the tags can change and new tags can be introduced, so I want to parse this based on the tag name.

---------- Post updated at 03:01 PM ---------- Previous update was at 01:12 PM ----------

Thanks for your input. I used the following script:

Code:
nawk 'BEGIN{FS="[<|>]"}
/<SeqNo>/{SeqNo=$3}
/<RedCode>/{Redcd=$3}
{printf(" %s,%s\n",SeqNo,Redcd)}' newack.xml

The only problem I found is that it duplicates the results. Any idea why?

Thanks,
Tons
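
On the duplication: the last block of that script has no pattern, so awk runs it for every input line and prints the most recently seen values each time. A minimal sketch of one fix is to print only when the closing </Target> line is read and then reset the variables. `parse_targets` is a hypothetical wrapper name, and the sketch assumes one element per line as in the sample.

```shell
# parse_targets FILE: print "SeqNo,RedCode" once per <Target> block.
parse_targets() {
    awk -F'[<>]' '
    /<SeqNo>/    { seqno = $3 }
    /<RedCode>/  { redcd = $3 }
    /<\/Target>/ {                    # block finished: emit one row, then reset
        printf "%s,%s\n", seqno, redcd
        seqno = redcd = ""
    }' "$1"
}
```

Since each value is picked up by its tag name, the order of the elements inside a block does not matter, and unknown tags are simply ignored.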
# 7  
Old 11-21-2012
Not awk, but here is one solution using the XML::Twig parser in Perl:
Code:
$ cat xmlfile 
<root>
  <Target>
    <SeqNo>43156489079</SeqNo>
    <AuthenticationToken><![CDATA[nY+sHZ2PrBmdj6wVnY]]></AuthenticationToken>
    <RedCode>SKNEQGGEVHW</RedCode>
    <IncentiveGenError>Upload-Success</IncentiveGenError>
  </Target>
  <Target>
    <SeqNo>43156489070</SeqNo>
    <AuthenticationToken><![CDATA[nY+sHZ2PrBmdj6wVnY]]></AuthenticationToken>
    <RedCode>SKNEQGGEVHW</RedCode>
    <IncentiveGenError>Upload-Success</IncentiveGenError>
  </Target>
</root>
$ cat script.pl
#!/usr/bin/perl

use strict;
use warnings;
use XML::Twig;

{
        my $twig = XML::Twig->new(
                twig_handlers => {
                        'Target' => sub {
                                printf qq|%s\n|, 
                                        join q| |, 
                                        map { $_->trimmed_text } 
                                        grep { ! $_->is_cdata && $_->is_text } 
                                        $_->descendants
                        }
                },
        )->parsefile( shift );
}
$ perl-5.14.2 script.pl xmlfile 
43156489079 SKNEQGGEVHW Upload-Success
43156489070 SKNEQGGEVHW Upload-Success
