Parsing XML file


 
# 1  
Old 05-22-2014
Parsing XML file

I want to parse an XML file.
Sample file:
Code:
<name locale="en">my_name<>/name><lastChanged>somedate</lastChanged><some more code here>
<name locale="en">tablename1<>/name><lastChanged>somedate</lastChanged>
 <definition><dbquery><sources><sql type="cognos">select * from tablename1</sql><lastChanged>somedate</lastChanged><somemorecode here><name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><lastChanged>somedate</lastChanged><name locale="en">col3<>/name><lastChanged>somedate</lastChanged><abc>bbbgbssx</<abc>
<name locale="en">tablename2<>/name><lastChanged>somedate</lastChanged><definition><dbquery><sources><sql type="cognos">select * from tablename2</sql><lastChanged>somedate</lastChanged><somemorecode here>
<name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><lastChanged>somedate</lastChanged><name locale="en">col3<>/name><abc>bbbgbssx</<abc>
<name locale="en">tablename3<>/name><lastChanged>somedate</lastChanged><definition><dbquery><sources><sql type="cognos">select * from tablename1</sql><somemorecode here><name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><name locale="en">col3<>/name><abc>bbbgbssx</<abc><usage>attribute</usage><datatype>char</datatype><collectionSequenceName>en</collectionSequenceName><collectionSequenceLevel>1</collectionSequenceLevel><querySubject status="sometext"><name locale="en">This is important</name><lastChanged>somedate</lastChanged><definition><modelQuery><sql type=cognos">select 
 tablename1.col1 as colname,
 tablename1.col2 as colname1,
 tablename2.col1 as colname2,
coalesce(tablename3.col1,0) as colname3
 from 
tablename1 join
tablename2
join
tablename3</sql></modelQuery></definition><lastChanged>somedate</lastChanged><name locale="en">tablename4<>/name><lastChanged>somedate</lastChanged><definition><dbquery><sources><sql type="cognos">select * from tablename1</sql><lastChanged>somedate</lastChanged><somemorecode here><name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><lastChanged>somedate</lastChanged><name locale="en">col3<>/name><lastChanged>somedate</lastChanged><abc>bbbgbssx</<abc><name locale="en">tablename5<>/name><lastChanged>somedate</lastChanged><definition><dbquery><sources><sql type="cognos">select * from tablename2</sql><lastChanged>somedate</lastChanged><somemorecode here>
 <name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><lastChanged>somedate</lastChanged><name locale="en">col3<>/name><abc>bbbgbssx</<abc>
 <name locale="en">tablename6<>/name><lastChanged>somedate</lastChanged>
 <definition><dbquery><sources><sql type="cognos">select * from tablename4</sql><somemorecode here>
 <name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><name locale="en">col3<>/name><abc>bbbgbssx</<abc><usage>attribute</usage><datatype>char</datatype><collectionSequenceName>en</collectionSequenceName><collectionSequenceLevel>1</collectionSequenceLevel><querySubject status="sometext"><name locale="en">This is also important</name><lastChanged>somedate</lastChanged><definition><modelQuery><sql type=cognos">select 
 tablename4.col1 as colname,
 tablename4.col2 as colname1,
 tablename5.col1 as colname2,
 (tablename5.col1*10) as colname3
 from 
tablename4 join
tablename4
</sql></modelQuery></definition><lastChanged>somedate</lastChanged>
<some more here similar to this>
<some more here similar to this>
<some more here similar to this>
<some more here similar to this>
<some more here similar to this>



There are N similar blocks following the same pattern in a single XML file, along with other unnecessary text. So first I have to extract these blocks into a single file, and finally I need output like the following, redirected to a file:

Code:
 This is important,tablename1,col1,colname,
This is important,tablename1,col2,colname1,
This is important,tablename2,col1,colname2,
This is important,tablename3,col1,colname3,
This is also important,tablename4,col1,colname,
This is also important,tablename4,col2,colname1,
This is also important,tablename5,col1,colname2,
This is also important,tablename5,col1,colname3

...and so on, like this.

I hope I have explained the whole scenario. Please help.

Thanks in advance

Last edited by Don Cragun; 05-29-2014 at 04:37 PM.. Reason: Add CODE tags AGAIN!
# 2  
Old 05-22-2014
Welcome to the forums. Sorry, your post is not clear; please revise it.
# 3  
Old 05-22-2014
I agree with Akshay that your requirements are not clear.

It looks like you're saying that you have a file that contains blocks of text where the first line in a block is:
Code:
- - - - -

and the last two lines in a block are:
Code:
 - - - - - 
 - - - -

For output, you seem to want a list of comma-separated lines using the data from the x XML tag in that block and the lines where the 3rd field is "as" for each input block, and you want the output blocks separated by the input block terminator with the leading spaces removed.

These are both pretty strange formats for an XML input file and for a CSV output file.

Please give us a clear description of your input file format and your output file format. (And use CODE tags so we can tell where leading and trailing spaces are present and where multiple adjacent spaces or tabs appear in your input and output.)

If you show us code that you have written to try to solve your problem it will help us understand what you're trying to do.
# 4  
Old 05-22-2014
Assuming that the '-----' characters are simply there to indicate where other text might be, try this one-liner:

Code:
awk '/<sql>/{f++};/^from/{f && f--}f' xmlfile|awk -F"[<>]" '{a=$0};{if (a ~ "select") {c=$3}};{gsub(/[\.]/,",");gsub(/ as /,",");gsub(/,$/,"");{if ($0~",") print c","$0}}'

# 5  
Old 05-29-2014
code sample

Code:
<name locale="en">my_name<>/name><lastChanged>somedate</lastChanged><some more code here>
<name locale="en">tablename1<>/name><lastChanged>somedate</lastChanged>
<definition><dbquery><sources><sql type="cognos">select * from tablename1</sql><lastChanged>somedate</lastChanged><somemorecode here><name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><lastChanged>somedate</lastChanged><name locale="en">col3<>/name><lastChanged>somedate</lastChanged><abc>bbbgbssx</<abc>
<name locale="en">tablename2<>/name><lastChanged>somedate</lastChanged><definition><dbquery><sources><sql type="cognos">select * from tablename2</sql><lastChanged>somedate</lastChanged><somemorecode here>
<name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><lastChanged>somedate</lastChanged><name locale="en">col3<>/name><abc>bbbgbssx</<abc>
<name locale="en">tablename3<>/name><lastChanged>somedate</lastChanged><definition><dbquery><sources><sql type="cognos">select * from tablename1</sql><somemorecode here><name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><name locale="en">col3<>/name><abc>bbbgbssx</<abc><usage>attribute</usage><datatype>char</datatype><collectionSequenceName>en</collectionSequenceName><collectionSequenceLevel>1</collectionSequenceLevel><querySubject status="sometext"><name locale="en">This is important</name><lastChanged>somedate</lastChanged><definition><modelQuery><sql type=cognos">select 
tablename1.col1 as colname,
tablename1.col2 as colname1,
tablename2.col1 as colname2,
coalesce(tablename3.col1,0) as colname3
from 
tablename1 join
tablename2
join
tablename3</sql></modelQuery></definition><lastChanged>somedate</lastChanged><name locale="en">tablename4<>/name><lastChanged>somedate</lastChanged><definition><dbquery><sources><sql type="cognos">select * from tablename1</sql><lastChanged>somedate</lastChanged><somemorecode here><name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><lastChanged>somedate</lastChanged><name locale="en">col3<>/name><lastChanged>somedate</lastChanged><abc>bbbgbssx</<abc><name locale="en">tablename5<>/name><lastChanged>somedate</lastChanged><definition><dbquery><sources><sql type="cognos">select * from tablename2</sql><lastChanged>somedate</lastChanged><somemorecode here>
<name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><lastChanged>somedate</lastChanged><name locale="en">col3<>/name><abc>bbbgbssx</<abc>
<name locale="en">tablename6<>/name><lastChanged>somedate</lastChanged>
<definition><dbquery><sources><sql type="cognos">select * from tablename4</sql><somemorecode here>
<name locale="en">col1<>/name><lastChanged>somedate</lastChanged><abc>bbbbssx</<abc><name locale="en">col2<>/name><name locale="en">col3<>/name><abc>bbbgbssx</<abc><usage>attribute</usage><datatype>char</datatype><collectionSequenceName>en</collectionSequenceName><collectionSequenceLevel>1</collectionSequenceLevel><querySubject status="sometext"><name locale="en">This is also important</name><lastChanged>somedate</lastChanged><definition><modelQuery><sql type=cognos">select 
tablename4.col1 as colname,
tablename4.col2 as colname1,
tablename5.col1 as colname2,
(tablename5.col1*10) as colname3
from 
tablename4 join
tablename4
</sql></modelQuery></definition><lastChanged>somedate</lastChanged>
<some more here similar to this>
<some more here similar to this>
<some more here similar to this>
<some more here similar to this>
<some more here similar to this>

Moderator's Comments:
CODE tags use [ and ] delimiters, not < and >.

Last edited by Don Cragun; 05-29-2014 at 04:32 PM.. Reason: Fix tags.
# 6  
Old 06-01-2014
@ms2001: If you just post sample data, what are we supposed to think? Don't assume that others will guess and answer you. From your last two samples, it looks like you did not read Don Cragun's answer. Without a clear description we can't help.
# 7  
Old 06-01-2014
You changed your sample data after I posted a potential solution. Try the following; it should work on the new sample data:

Code:
awk '/>select[[:space:]]*$/{f++};/^from/{f && f--}f' xmlfile|awk -F"[<>]" '{a=$0};{if (a ~ "select") {c=$(NF-12)","}};{gsub(/[\.]/,",");gsub(/ as /,",");gsub(/,$/,"");{gsub(/.*\(/,"");gsub(/[[:punct:]][0-9]*\)/,"")};{if ($0~",") print c$0}}'
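
For readability, the pipeline above can also be unrolled into a single commented awk script. This is only a sketch under the same assumptions as the one-liner: each block of interest starts on a line ending in ">select", the column list ends at a line starting with "from", and the descriptive name sits 12 fields before the end of the ">select" line when splitting on < and >. The function name extract_columns and the shortened sample are made up for illustration, and the last substitution uses an explicit character list where the one-liner uses [[:punct:]].

```shell
#!/bin/sh
# extract_columns FILE  (hypothetical helper name, not from the thread)
# Prints one "label,table,column,alias" line per column of each modelQuery.
extract_columns() {
    awk -F'[<>]' '
        # A line ending in ">select" opens a column list.
        />select[ \t]*$/ {
            label = $(NF-12)   # text of the preceding name element (position assumed from the sample)
            inblock = 1
            next
        }
        # A line starting with "from" closes the column list.
        /^from/ { inblock = 0 }
        inblock {
            line = $0
            gsub(/\./, ",", line)                       # table.col      -> table,col
            gsub(/ as /, ",", line)                     # expr as alias  -> expr,alias
            sub(/,[ \t]*$/, "", line)                   # drop the trailing comma
            sub(/.*\(/, "", line)                       # drop a leading "func(" or "("
            sub(/[^A-Za-z0-9 \t][0-9]*\)/, "", line)    # drop a tail such as ",0)" or "*10)"
            if (line ~ /,/) print label "," line
        }
    ' "$1"
}

# Small demonstration on a shortened, made-up sample in the same shape.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<querySubject status="sometext"><name locale="en">This is important</name><lastChanged>somedate</lastChanged><definition><modelQuery><sql type=cognos">select
tablename1.col1 as colname,
coalesce(tablename3.col1,0) as colname3
from
tablename1 join
tablename3</sql></modelQuery></definition><lastChanged>somedate</lastChanged>
EOF
extract_columns "$sample"
rm -f "$sample"
```

On that sample the function prints "This is important,tablename1,col1,colname" and "This is important,tablename3,col1,colname3". Like the one-liner, it depends on the exact line breaks and tag order of the sample data, so a real XML parser would be a safer choice if the layout of the file varies.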
