Getting VALUE from Big XML File -- That's All


 
# 1  
Old 01-15-2016

We got data that was supposed to be CSV, but was sent in a huge XML file.

I've downloaded xmlstarlet, but I'm darned if I can get its "sel" command to look down a path and return any sort of value. I see pieces of what should be paths, but they seem to have extraneous characters, and I don't know how to use the various <...> fields to make a decent query. For example,
I want to get: <es:mixedModeRadio>false</es:mixedModeRadio> from the small piece of the XML file below. How?

Code:
xmlstarlet sel "/<configData dnPrefix="Undefined">/<xn:SubNetwork id="ONRM_ROOT_MO_R">/<xn:SubNetwork     id="MyTown">/<xn:MeContext id="LL12345">/<xn:VsDataContainer id="LL12345">"

Is there an easier way? Is there some intermediate step I'm missing?

Here's a very tiny part of a very large file:

Code:
<?xml version="1.0" encoding="UTF-8"?>
<bulkCmConfigDataFile xmlns:un="utranNrm.xsd"
    xmlns:es="Edward.15.25.xsd"
    xmlns:xn="genericNrm.xsd" xmlns:gn="geranNrm.xsd" xmlns="configData.xsd">
    <fileHeader fileFormatVersion="32.615 V4.5" vendorName="Edward"/>
    <configData dnPrefix="Undefined">
        <xn:SubNetwork id="ONRM_ROOT_MO_R">
            <xn:SubNetwork id="MyTown">
                <xn:attributes>
                    <xn:userDefinedNetworkType>MY_SERVERS</xn:userDefinedNetworkType>
                    <xn:userLabel>MyTown</xn:userLabel>
                </xn:attributes>
                <xn:MeContext id="LL12345">
                    <xn:VsDataContainer id="LL12345">
                        <xn:attributes>
                            <xn:vsDataType>vsDataMeContext</xn:vsDataType>
                            <xn:vsDataFormatVersion>EdwardSpecificAttributes.15.25</xn:vsDataFormatVersion>
                            <es:vsDataMeContext>
                                <es:userLabel>LL12345</es:userLabel>
                                <es:ipAddress>11.164.0.116</es:ipAddress>
                                <es:neMIMversion>vF.1.107</es:neMIMversion>
                                <es:lostSynchronisation>SYNCHRONISED</es:lostSynchronisation>
                                <es:bcrLastChange>1452424403156</es:bcrLastChange>
                                <es:bctLastChange>1452160614628</es:bctLastChange>
                                <es:multiStandardRbs6k>true</es:multiStandardRbs6k>
                                <es:mixedModeRadio>false</es:mixedModeRadio>
                                <es:mirrorMIBversion>F.1.100.S.1.6</es:mirrorMIBversion>
                                <es:stnNodes></es:stnNodes>
                            </es:vsDataMeContext>
                        </xn:attributes>
                    </xn:VsDataContainer>
                    <xn:ManagedElement id="1">
                        <xn:attributes>
                            <xn:locationName></xn:locationName>
                            <xn:userDefinedState></xn:userDefinedState>
                            <xn:vendorName>Edward</xn:vendorName>
                            <xn:userLabel>LL12345</xn:userLabel>
                            <xn:managedElementType>ERBS</xn:managedElementType>
                            <xn:swVersion>108991/23_R0DX</xn:swVersion>
                            <xn:managedBy>SubNetwork=ONRM_ROOT_MO_R,ManagementNode=ONRM</xn:managedBy>


Last edited by Don Cragun; 01-15-2016 at 06:04 PM.. Reason: Add CODE and ICODE tags.
# 2  
Old 01-16-2016
Not sure I understand your question. Do you need help with xmlstarlet or just need to extract that line?
# 3  
Old 01-16-2016
As a complete newbie to XML, I need help with xmlstarlet. That particular line is just an example of one of the values I need to extract. I don't understand the syntax of how to use xmlstarlet to do that sort of thing. An example would help.

Last edited by gmark99; 01-16-2016 at 10:55 AM..
# 4  
Old 01-19-2016
The sel (select) command takes an XPath expression.

What is XPath? You can pick up the basics from this tutorial:
XPath Tutorial

Code:
bash-2.03$ xml
XMLStarlet Toolkit: Command line utilities for XML
Usage: xml [<options>] <command> [<cmd-options>]
where <command> is one of:
   ed    (or edit)      - Edit/Update XML document(s)
   sel   (or select)    - Select data or query XML document(s) (XPATH, etc)
   tr    (or transform) - Transform XML document(s) using XSLT
   val   (or validate)  - Validate XML document(s) (well-formed/DTD/XSD/RelaxNG)
   fo    (or format)    - Format XML document(s)
   el    (or elements)  - Display element structure of XML document
   c14n  (or canonic)   - XML canonicalization
   ls    (or list)      - List directory as XML
   esc   (or escape)    - Escape special XML characters
   unesc (or unescape)  - Unescape special XML characters
   pyx   (or xmln)      - Convert XML into PYX format (based on ESIS - ISO 8879)
   p2x   (or depyx)     - Convert PYX into XML
<options> are:
   --version            - show version
   --help               - show help
Wherever file name mentioned in command help it is assumed
that URL can be used instead as well.

Type: xml <command> --help <ENTER> for command help

Most distros ship xmllint. It is my favorite.
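To make the XPath idea concrete, here is a minimal sketch in Python rather than xmlstarlet, using only the standard library's xml.etree.ElementTree (which supports a limited XPath subset). SAMPLE is a cut-down, well-formed stand-in for the fragment in post #1, and find_values is a helper name made up for this illustration. The point it tries to show: a prefix such as es: has to be bound to its namespace URI before it can appear in a path.

```python
import xml.etree.ElementTree as ET

# Cut-down, well-formed stand-in for the real file (assumption: the real
# file uses the same namespace URIs as the fragment in post #1).
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<bulkCmConfigDataFile xmlns:es="Edward.15.25.xsd"
    xmlns:xn="genericNrm.xsd" xmlns="configData.xsd">
  <configData dnPrefix="Undefined">
    <xn:SubNetwork id="ONRM_ROOT_MO_R">
      <xn:MeContext id="LL12345">
        <es:vsDataMeContext>
          <es:mixedModeRadio>false</es:mixedModeRadio>
        </es:vsDataMeContext>
      </xn:MeContext>
    </xn:SubNetwork>
  </configData>
</bulkCmConfigDataFile>
"""

# Bind the prefix used in the path to the URI from the xmlns:es declaration.
NS = {"es": "Edward.15.25.xsd"}


def find_values(root, path):
    """Return the text of every element matching a namespaced path (hypothetical helper)."""
    return [el.text for el in root.iterfind(path, NS)]


root = ET.fromstring(SAMPLE)
# ".//" means "at any depth below the root element".
print(find_values(root, ".//es:mixedModeRadio"))
```

In xmlstarlet the same binding is done with the -N option of sel, along the lines of: xmlstarlet sel -N es="Edward.15.25.xsd" -t -v "//es:mixedModeRadio" -n file.xml (not run against the full file here, so treat it as a starting point).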
This User Gave Thanks to chakrapani For This Post:
# 5  
Old 01-19-2016
One of my biggest concerns is the size of the file we're looking at: 3.5 GB. I've been told that libxml2 has problems with files that are around a few hundred lines. Is xmlstarlet fairly stable with large files?
# 6  
Old 01-20-2016
I have used xmllint with huge files:
Code:
xmllint --format hugefile.xml > hugefile_formatted.xml

to get formatted output, or use the --shell option to get your values using XPath.
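If loading a multi-gigabyte document into memory in one go is the worry, a streaming parse is one alternative. The sketch below is Python (standard library only), not xmllint; it uses ElementTree's iterparse to handle one xn:MeContext record at a time. The function name iter_mixed_mode, the SAMPLE data and the second id LL67890 are invented for illustration, and it assumes each xn:MeContext holds at most one es:mixedModeRadio.

```python
import io
import xml.etree.ElementTree as ET

# Namespace URIs copied from the xmlns declarations in post #1,
# written in ElementTree's "{uri}localname" form.
XN = "{genericNrm.xsd}"
ES = "{Edward.15.25.xsd}"


def iter_mixed_mode(source):
    """Yield (MeContext id, mixedModeRadio text) pairs from a path or file object."""
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag == XN + "MeContext":
            # The whole MeContext subtree is complete at its "end" event.
            value = elem.findtext(".//" + ES + "mixedModeRadio")
            yield elem.get("id"), value
            # Discard the finished subtree so memory use stays roughly per-record.
            elem.clear()


# Tiny stand-in for the real file; LL67890 is an invented second record.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<bulkCmConfigDataFile xmlns:es="Edward.15.25.xsd"
    xmlns:xn="genericNrm.xsd" xmlns="configData.xsd">
  <configData dnPrefix="Undefined">
    <xn:SubNetwork id="ONRM_ROOT_MO_R">
      <xn:MeContext id="LL12345">
        <es:vsDataMeContext>
          <es:mixedModeRadio>false</es:mixedModeRadio>
        </es:vsDataMeContext>
      </xn:MeContext>
      <xn:MeContext id="LL67890">
        <es:vsDataMeContext>
          <es:mixedModeRadio>true</es:mixedModeRadio>
        </es:vsDataMeContext>
      </xn:MeContext>
    </xn:SubNetwork>
  </configData>
</bulkCmConfigDataFile>
"""

# Print one CSV-style line per record.
for me_id, value in iter_mixed_mode(io.BytesIO(SAMPLE)):
    print(f"{me_id},{value}")
```

For a real file, pass its path instead of the BytesIO object. Note that the emptied xn:MeContext shells stay attached to their parent, so memory still grows slowly with the number of records; whether that matters at 3.5 GB would need testing.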
# 7  
Old 01-25-2016
Wonderful! Now let me ask one more question: the "XPATH" examples I find don't quite correspond to my data. They show paths like "/first/second/third" for a body of XML that looks like:
Code:
<first>
    <second>
        <third>

... in my case is more like:
Code:
<?xml version="1.0" encoding="UTF-8"?>
<first xmlns:un="Floo.xsd"
    xmlns:es="MoreFloo.xsd"
    xmlns:xn="MoreEvenFloo.xsd" xmlns:gn="geranNrm.xsd" xmlns="configData.xsd">
 <second fileFormatVersion="32.615 V4.5" vendorName="Edward"/>
    <third dnPrefix="Undefined">
        <xn:SubNetwork id="ONRM_ROOT_MO_R">

So my question is: how many of these additional strings are used in the path?
Code:
/<first xmlns:un="Floo.xsd"
    xmlns:es="MoreFloo.xsd"
    xmlns:xn="MoreEvenFloo.xsd" xmlns:gn="geranNrm.xsd" xmlns="configData.xsd">/ <second fileFormatVersion="32.615 V4.5" vendorName="Edward"/>/<third dnPrefix="Undefined">/

or... the same minus the "<" and ">" ???

In the meantime, I'll keep looking over the documentation and hope to encounter this.

Thanks!!
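In XPath, none of the angle brackets or xmlns declarations go into the path. A step is just the element name (with its prefix), and an attribute is only mentioned when you want to filter on it, as a predicate like [@id='MyTown']. One catch: because the root declares xmlns="configData.xsd", the unprefixed elements are in that namespace too, so the path needs a prefix for them. A minimal Python sketch with the standard library's ElementTree follows; the prefix name cfg is an arbitrary choice made here, and SAMPLE is a shortened, well-formed version of the fragment from post #1.

```python
import xml.etree.ElementTree as ET

# Shortened, well-formed version of the fragment from post #1.
SAMPLE = b"""<?xml version="1.0" encoding="UTF-8"?>
<bulkCmConfigDataFile xmlns:es="Edward.15.25.xsd"
    xmlns:xn="genericNrm.xsd" xmlns="configData.xsd">
  <fileHeader fileFormatVersion="32.615 V4.5" vendorName="Edward"/>
  <configData dnPrefix="Undefined">
    <xn:SubNetwork id="ONRM_ROOT_MO_R">
      <xn:SubNetwork id="MyTown">
        <xn:MeContext id="LL12345">
          <xn:VsDataContainer id="LL12345">
            <xn:attributes>
              <es:vsDataMeContext>
                <es:mixedModeRadio>false</es:mixedModeRadio>
              </es:vsDataMeContext>
            </xn:attributes>
          </xn:VsDataContainer>
        </xn:MeContext>
      </xn:SubNetwork>
    </xn:SubNetwork>
  </configData>
</bulkCmConfigDataFile>
"""

# Prefix -> URI bindings. "cfg" is an arbitrary name chosen here for the
# document's default namespace (xmlns="configData.xsd").
NS = {
    "cfg": "configData.xsd",
    "xn": "genericNrm.xsd",
    "es": "Edward.15.25.xsd",
}

# One step per element, relative to the root element; attributes appear
# only as [@name='value'] filters.
PATH = (
    "cfg:configData"
    "/xn:SubNetwork[@id='ONRM_ROOT_MO_R']"
    "/xn:SubNetwork[@id='MyTown']"
    "/xn:MeContext[@id='LL12345']"
    "/xn:VsDataContainer"
    "/xn:attributes"
    "/es:vsDataMeContext"
    "/es:mixedModeRadio"
)

root = ET.fromstring(SAMPLE)
node = root.find(PATH, NS)
print(node.text)

# Without a prefix the default-namespace element is not matched.
print(root.find("configData"))
```

The predicates are optional: dropping [@id='...'] from a step simply matches every element of that name at that level.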