Need help in creating a Unix Script to parse xml file


 
# 1  
Old 04-10-2008

Hi All,

My requirement is to create a Unix script that parses an XML file and displays the values between the element tags on the console. For example, I would like to fetch the value of errorCode from the XML below, which is 'U007', and display it. Can sed be used for this? I tried the following command, but it does not work:
Code:
sed -n -e "s/<errorCode>\([a-z]*[0-9]*\)<\/errorCode>/\1/p" /x01/hub/data/incoming/Txn200802251031080012-093624998419.xml

Code:
<gDSNError>
  <errorCode>U007</errorCode> 
  <errorDescription>00093624998419|BEST_BUY_LONG_DESCRIPTION|PDQ Import - The value coded against this attribute exceeds the maximum field length. Please amend & resend.</errorDescription> 
  <errorDateTime>2008-02-26T09:04:11.728-00:00</errorDateTime> 
  </gDSNError>

Can anyone please help me create a Unix script that parses the file and displays the values between the tags? This is an urgent requirement, and I would be very thankful for any help.

Last edited by Yogesh Sawant; 04-10-2008 at 08:26 AM.. Reason: added code tags
# 2  
Old 04-10-2008
Please provide sample input and the expected output.
# 3  
Old 04-10-2008
Code:
awk '/<errorCode>/ {
 gsub(/<errorCode>|<\/errorCode>/,"")
 print $0
}' file

Use a dedicated XML parser for more complex operations.
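A note on the sed attempt in post #1: it most likely prints nothing because `[a-z]*` matches only lowercase letters in the usual C locale, so the uppercase `U` in `U007` can never fit `[a-z]*[0-9]*` between the two tags. The sketch below is one possible fix, not the only one: it accepts any text up to the next `<`, and the `.*` on either side drops the indentation and trailing space. The function name `tag_value` and the temporary sample file are made up for illustration.

```shell
# Sample input copied from post #1, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<gDSNError>
  <errorCode>U007</errorCode>
  <errorDescription>00093624998419|BEST_BUY_LONG_DESCRIPTION|PDQ Import - The value coded against this attribute exceeds the maximum field length. Please amend & resend.</errorDescription>
  <errorDateTime>2008-02-26T09:04:11.728-00:00</errorDateTime>
  </gDSNError>
EOF

# Hypothetical helper: $1 = tag name, $2 = XML file.
# Prints the text between <tag> and </tag> for every line that holds both.
# Assumes the opening and closing tag sit on the same line and the
# value itself contains no '<'.
tag_value() {
    sed -n "s/.*<$1>\([^<]*\)<\/$1>.*/\1/p" "$2"
}

tag_value errorCode "$sample"
```

Like the awk answer above, this only works while each element stays on one line; anything nested or wrapped across lines needs a real XML parser.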
# 4  
Old 04-10-2008
The sample XML input file is as below:

<gDSNError>
<errorCode>U007</errorCode>
<errorDescription>00093624998419|BEST_BUY_LONG_DESCRIPTION|PDQ Import - The value coded against this attribute exceeds the maximum field length. Please amend & resend.</errorDescription>
<errorDateTime>2008-02-26T09:04:11.728-00:00</errorDateTime>
</gDSNError>

I tried getting the values between the tags using the following code, and it works:

# to get the error description and the error code
# (hub_in_dir is assumed to hold the path of the XML file)

Error_Desc=$(grep "<errorDescription>.*<.errorDescription>" "$hub_in_dir" | sed -e "s/^.*<errorDescription/<errorDescription/" | cut -f2 -d">" | cut -f1 -d"<")
Error_Code=$(grep "<errorCode>.*<.errorCode>" "$hub_in_dir" | sed -e "s/^.*<errorCode/<errorCode/" | cut -f2 -d">" | cut -f1 -d"<")

In addition, my requirement is to write the values to a file, comma-separated. The output should be on a single line, something like this:

U007, 00093624998419|BEST_BUY_LONG_DESCRIPTION|PDQ Import - The value coded against this attribute exceeds the maximum field length. Please amend & resend, 2008-02-26T09:04:11.728-00:00
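One way the fetched values could be joined into a single line is sketched below, assuming an sh or ksh that supports `$( )`. It reuses the grep, sed and cut pipeline from this post inside a helper, then writes the three values with `printf`. The names `first_value` and `out_file` are made up for illustration, and the temporary files stand in for the real input and output paths.

```shell
# Sample input copied from this thread, written to a temporary file.
hub_in_dir=$(mktemp)
cat > "$hub_in_dir" <<'EOF'
<gDSNError>
<errorCode>U007</errorCode>
<errorDescription>00093624998419|BEST_BUY_LONG_DESCRIPTION|PDQ Import - The value coded against this attribute exceeds the maximum field length. Please amend & resend.</errorDescription>
<errorDateTime>2008-02-26T09:04:11.728-00:00</errorDateTime>
</gDSNError>
EOF

# Hypothetical helper wrapping the pipeline from the post.
# $1 = tag name, $2 = XML file; prints the value from the first matching line.
first_value() {
    grep "<$1>.*<.$1>" "$2" | sed -e "s/^.*<$1/<$1/" | cut -f2 -d">" | cut -f1 -d"<" | head -n 1
}

Error_Code=$(first_value errorCode "$hub_in_dir")
Error_Desc=$(first_value errorDescription "$hub_in_dir")
Error_Time=$(first_value errorDateTime "$hub_in_dir")

# Hypothetical output file; >> appends one line per processed XML file.
out_file=$(mktemp)
printf '%s, %s, %s\n' "$Error_Code" "$Error_Desc" "$Error_Time" >> "$out_file"
cat "$out_file"
```

Note that the description is copied as-is, so the final full stop after "resend" is kept. If a value can itself contain a comma, the output needs quoting or a different separator.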
# 5  
Old 04-10-2008
Of course you can use a complicated awk script to do this (and to handle the cases where the data arrives in a different order from the one shown above), but I would suggest importing the XML file into a database (for example, MS Access or MS Excel) and then running an SQL query to extract the data the way you want. I think Access can also create normalised tables for you.
# 6  
Old 04-10-2008
In our case the XML files are very large and their content varies every time, so importing the data into a database could become rather complex. As I only need 2 or 3 tag values from the whole XML, I don't see the need to import all the data into a database. I can fetch those values with sed or awk scripts.

Please suggest how I can write the fetched data to a file on one line, comma-separated, using a Unix script (ksh or sh).

Sample input XML file:

<gDSNError>
<errorCode>U007</errorCode>
<errorDescription>00093624998419|BEST_BUY_LONG_DESCRIPTION|PDQ Import - The value coded against this attribute exceeds the maximum field length. Please amend & resend.</errorDescription>
<errorDateTime>2008-02-26T09:04:11.728-00:00</errorDateTime>
</gDSNError>

The output should look something like this:

U007, 00093624998419|BEST_BUY_LONG_DESCRIPTION|PDQ Import - The value coded against this attribute exceeds the maximum field length. Please amend & resend, 2008-02-26T09:04:11.728-00:00
# 7  
Old 04-10-2008
Try this:

Code:
awk 'BEGIN{FS="<|>"}
NF==5&&!f{printf("%s",$3);f=1;next}
NF==5&&f{printf(",%s",$3)}
END{print ""}
' file

Regards

Last edited by Franklin52; 04-10-2008 at 02:39 PM..
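The awk script above prints the value of every line that splits into exactly five fields, which covers every single-line element in the file. Since only 2 or 3 tags are wanted from a large file, a variant that picks tags by name may be a better fit. The following is a minimal sketch under some assumptions: a POSIX awk (nawk on older Solaris), each wanted element on a single line, and only the first occurrence of each tag kept. The function name `xml_to_csv` and the temporary sample file are made up for illustration.

```shell
# Sample input copied from this thread, written to a temporary file.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<gDSNError>
<errorCode>U007</errorCode>
<errorDescription>00093624998419|BEST_BUY_LONG_DESCRIPTION|PDQ Import - The value coded against this attribute exceeds the maximum field length. Please amend & resend.</errorDescription>
<errorDateTime>2008-02-26T09:04:11.728-00:00</errorDateTime>
</gDSNError>
EOF

# Hypothetical helper: $1 = XML file, $2 = comma-separated tag names.
# Reads the file once and prints the first value of each named tag,
# in the order the tags were requested, joined by ", ".
xml_to_csv() {
    awk -v tags="$2" '
        BEGIN { n = split(tags, want, ",") }
        {
            for (i = 1; i <= n; i++) {
                otag = "<" want[i] ">"
                ctag = "</" want[i] ">"
                s = index($0, otag)      # position of the opening tag, 0 if absent
                e = index($0, ctag)      # position of the closing tag, 0 if absent
                if (s && e > s && !(want[i] in val))
                    val[want[i]] = substr($0, s + length(otag), e - s - length(otag))
            }
        }
        END {
            for (i = 1; i <= n; i++)
                printf "%s%s", (i > 1 ? ", " : ""), val[want[i]]
            print ""
        }
    ' "$1"
}

xml_to_csv "$sample" errorCode,errorDescription,errorDateTime
```

Redirecting the call with `>> errors.csv` (a hypothetical file name) appends one line per input file. A tag that is missing from the input leaves an empty field rather than shifting the columns.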