Help parsing a XML file ....
Post 302398879 by fpmurphy on Thursday 25th of February 2010 09:49:03 PM
While you can use awk and other command line utilities to parse your file and extract the data you require, a better way is to use tools which are designed to work with XML documents.

For the purposes of this example, I have simplified the structure of your document down to the following elements (demo.xml):
Code:
<FavouriteLocations>
   <FavouriteLocations class="FavouriteList">
       <Item class="Favourite">
          <Object class="Location">
              <lat>LAT1</lat>
              <long>LONG1</long>
              <entryName>NAME1</entryName>
          </Object>
       </Item>
       <Item class="Favourite">
          <Object class="Location">
              <lat>LAT2</lat>
              <long>LONG2</long>
              <entryName>NAME2</entryName>
          </Object>
       </Item>
       <Item class="Favourite">
          <Object class="Location">
              <lat>LAT3</lat>
              <long>LONG3</long>
              <entryName>NAME3</entryName>
          </Object>
       </Item>
   </FavouriteLocations>
</FavouriteLocations>

The best tool to extract the data you want from this XML document is an XSLT (stylesheet transformation) processor such as xsltproc or Saxon. Here is a simple stylesheet (demo.xsl) which extracts the values of the lat, long and entryName elements from each Object element and prints them on one line.
Code:
<?xml version="1.0"?>
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<xsl:template match="/">
   <xsl:apply-templates select="//Object" />
</xsl:template>

<xsl:template match="Object">
    <xsl:value-of select="./lat"/>
    <xsl:text>   </xsl:text>
    <xsl:value-of select="./long"/>
    <xsl:text>   </xsl:text>
    <xsl:value-of select="./entryName"/>
    <xsl:text>
</xsl:text>
</xsl:template>

</xsl:stylesheet>

Here is the output from transforming the document using xsltproc:
Code:
$ xsltproc demo.xsl demo.xml
LAT1   LONG1   NAME1
LAT2   LONG2   NAME2
LAT3   LONG3   NAME3
$
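
If xsltproc is not available, roughly the same extraction can be sketched with another XML-aware tool. The following is only an illustration using Python's standard xml.etree.ElementTree module, not part of the original answer; the names DEMO_XML and extract_locations are made up for this example, and the sample data is a trimmed copy of demo.xml.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of demo.xml from the post (two items are enough to show the idea).
DEMO_XML = """<FavouriteLocations>
   <FavouriteLocations class="FavouriteList">
       <Item class="Favourite">
          <Object class="Location">
              <lat>LAT1</lat>
              <long>LONG1</long>
              <entryName>NAME1</entryName>
          </Object>
       </Item>
       <Item class="Favourite">
          <Object class="Location">
              <lat>LAT2</lat>
              <long>LONG2</long>
              <entryName>NAME2</entryName>
          </Object>
       </Item>
   </FavouriteLocations>
</FavouriteLocations>"""


def extract_locations(xml_text):
    """Return a list of (lat, long, entryName) tuples, one per Object element."""
    root = ET.fromstring(xml_text)
    rows = []
    # iter("Object") walks the whole tree, much like //Object in the stylesheet.
    for obj in root.iter("Object"):
        rows.append((
            obj.findtext("lat", default=""),
            obj.findtext("long", default=""),
            obj.findtext("entryName", default=""),
        ))
    return rows


if __name__ == "__main__":
    # Print the fields separated by three spaces, as the stylesheet does.
    for lat, lon, name in extract_locations(DEMO_XML):
        print(f"{lat}   {lon}   {name}")
```

To run it against a real file instead of the embedded sample, read the file's contents and pass them to extract_locations. Like the stylesheet, this uses a real XML parser rather than pattern matching on lines of text.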

 

xsltproc(1)                                                   General Commands Manual                                                  xsltproc(1)

NAME
       xsltproc - command line xslt processor

SYNOPSIS
       xsltproc [-V | -v | -o file | --timing | --repeat | --debug | --novalid | --noout | --maxdepth val | --html | --docbook |
                --param name value | --stringparam name value | --nonet | --warnnet | --catalogs | --xinclude | --profile]
                [stylesheet] [ file1 ] [ file2 ] [ .... ]

INTRODUCTION
       xsltproc is a command line tool for applying XSLT stylesheets to XML documents. It is part of libxslt, the XSLT C library
       for GNOME. While it was developed as part of the GNOME project, it can operate independently of the GNOME desktop.

       xsltproc is invoked from the command line with the name of the stylesheet to be used followed by the name of the file or
       files to which the stylesheet is to be applied.

       If a stylesheet is included in an XML document with a Stylesheet Processing Instruction, no stylesheet need be named at
       the command line. xsltproc will automatically detect the included stylesheet and use it.

       By default, output is to stdout. You can specify a file for output using the -o option.

COMMAND LINE OPTIONS
       -V or --version
              Show the version of libxml and libxslt used.

       -v or --verbose
              Output each step taken by xsltproc in processing the stylesheet and the document.

       -o or --output file
              Direct output to the file named file. For multiple outputs, also known as "chunking", -o directory/ directs the
              output files to a specified directory. The directory must already exist.

       --timing
              Display the time used for parsing the stylesheet, parsing the document and applying the stylesheet and saving the
              result. Displayed in milliseconds.

       --repeat
              Run the transformation 20 times. Used for timing tests.

       --debug
              Output an XML tree of the transformed document for debugging purposes.

       --novalid
              Skip loading the document's DTD.

       --noout
              Do not output the result.

       --maxdepth value
              Adjust the maximum depth of the template stack before libxslt concludes it is in an infinite loop. The default
              is 500.

       --html The input document is an HTML file.

       --docbook
              The input document is DocBook SGML.

       --param name value
              Pass a parameter of name name and value value to the stylesheet. You may pass multiple name/value pairs up to a
              maximum of 32. If the value being passed is a string rather than a node identifier, use --stringparam instead.

       --stringparam name value
              Pass a parameter of name name and value value where value is a string rather than a node identifier.

       --nonet
              Do not use the Internet to fetch DTD's or entities.

       --warnnet
              Output notification when DTD's or entities are fetched over the Internet.

       --catalogs
              Use catalogs to resolve the location of external entities. This speeds DTD resolution. By having a catalog file
              point to a local version of the DTD, xsltproc does not have to use the Internet to fetch the DTD. xsltproc uses
              the catalog identified by the environmental variable SGML_CATALOG_FILES.

       --xinclude
              Process the input document using the Xinclude specification. More details on this can be found in the Xinclude
              specification: http://www.w3.org/TR/xinclude/.

       --profile or --norman
              Output profiling information detailing the amount of time spent in each part of the stylesheet. This is useful
              in optimizing stylesheet performance.

RETURN VALUES
       xsltproc's return codes provide information that can be used when calling it from scripts.

       0: normal
       1: no argument
       2: too many parameters
       3: unknown option
       4: failed to parse the stylesheet
       5: error in the stylesheet
       6: error in one of the documents
       7: unsupported xsl:output method
       8: string parameter contains both quote and double-quotes

MORE INFORMATION
       libxml web page: http://www.xmlsoft.org/
       W3C XSLT page: http://www.w3.org/TR/xslt

AUTHOR
       Copyright 2001 by John Fleck <jfleck@inkstain.net>. This is release 0.2 of the xsltproc Manual.

NOTES
       Source for libxslt is available in the SUNWlxslS package. Documentation for libxslt is available on-line at
       http://xmlsoft.org/XSLT/

                                                                    2002 Jun 27                                                        xsltproc(1)
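
The RETURN VALUES section of the manual page above says the exit codes are meant to be used from scripts. As a minimal sketch of that idea, the wrapper below maps each documented code to its description. The function names describe_exit and run_xsltproc are invented for this example, and the table only covers the codes listed in this release of the manual; other libxslt releases may define additional ones.

```python
import subprocess

# Exit codes exactly as listed in the RETURN VALUES section of the manual page above.
XSLTPROC_EXIT_CODES = {
    0: "normal",
    1: "no argument",
    2: "too many parameters",
    3: "unknown option",
    4: "failed to parse the stylesheet",
    5: "error in the stylesheet",
    6: "error in one of the documents",
    7: "unsupported xsl:output method",
    8: "string parameter contains both quote and double-quotes",
}


def describe_exit(code):
    """Translate an xsltproc exit code into the manual's description."""
    return XSLTPROC_EXIT_CODES.get(code, f"undocumented exit code {code}")


def run_xsltproc(stylesheet, document):
    """Run xsltproc and return (exit_code, description, stdout).

    Raises FileNotFoundError if xsltproc is not installed or not on PATH.
    """
    result = subprocess.run(
        ["xsltproc", stylesheet, document],
        capture_output=True,  # collect stdout and stderr instead of printing them
        text=True,            # decode the output as text rather than bytes
    )
    return result.returncode, describe_exit(result.returncode), result.stdout
```

A calling script could then use something like code, meaning, output = run_xsltproc("demo.xsl", "demo.xml") and stop with the message in meaning whenever code is not 0.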