Extract value from XML


 
# 1  
Old 01-05-2012

I have a file like the one below:
Code:
<soap:Envelope xmlns:soap="http://schemas.xmlsoap.org/soap/envelope/"><soap:Body><ns2:executeMDXResponse xmlns:ns2="http://webservices.quartetfs.com"><aggregates><axes><axis><name>ROWS</name><positions><position><members><member><depth>0</depth><dimensionName>AsOfDate</dimensionName><displayName>AllMember</displayName><levelName>ALL</levelName><path><items><item>AllMember</item></items></path></member></members></position><position><members><member><depth>1</depth><dimensionName>AsOfDate</dimensionName><displayName>04-01-2012</displayName><levelName>AsOfDate</levelName><path><items><item>AllMember</item><item>04-01-2012</item></items></path></member></members></position><position><members><member><depth>1</depth><dimensionName>AsOfDate</dimensionName><displayName>20-12-2011</displayName><levelName>AsOfDate</levelName><path><items><item>AllMember</item><item>20-12-2011</item></items></path></member></members></position><position><members><member><depth>1</depth><dimensionName>AsOfDate</dimensionName><displayName>12-12-2011</displayName><levelName>AsOfDate</levelName><path><items><item>AllMember</item><item>12-12-2011</item></items></path></member></members></position><position><members><member><depth>1</depth><dimensionName>AsOfDate</dimensionName><displayName>09-12-2011</displayName><levelName>AsOfDate</levelName><path><items><item>AllMember</item><item>09-12-2011</item></items></path></member></members></position></positions></axis></axes><cells><cell><formattedValue>3840769</formattedValue><ordinal>0</ordinal><value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">3840769</value></cell><cell><formattedValue>444930</formattedValue><ordinal>1</ordinal><value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">444930</value></cell><cell><formattedValue>1136654</formattedValue><ordinal>2</ordinal><value 
xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">1136654</value></cell><cell><formattedValue>1081680</formattedValue><ordinal>3</ordinal><value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">1081680</value></cell><cell><formattedValue>1177505</formattedValue><ordinal>4</ordinal><value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">1177505</value></cell></cells><slicerAxis><name>SlicerAxis</name><positions><position><members><member><depth>0</depth><dimensionName>Measures</dimensionName><displayName>contributors.COUNT</displayName><levelName>Measures</levelName><path><items><item>contributors.COUNT</item></items></path></member></members></position></positions></slicerAxis></aggregates></ns2:executeMDXResponse></soap:Body></soap:Envelope>

It is not indented, and everything is on one line. So if I use grep or sed to search for a particular tag and the value inside it, I get the whole file back.
Can anyone tell me how to search it?

I need to extract the dates between the displayName tags.
# 2  
Old 01-05-2012
Code:
$ awk -F"</?displayName>" '{for(i=1;++i<=NF;) if(length($i)==10) print $i}' yourfile.xml
04-01-2012
20-12-2011
12-12-2011
09-12-2011
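
grep and sed work line by line, so on a one-line file any match prints the whole line. The command above avoids that by making the displayName tags themselves the field separator. Here is a minimal sketch of the same idea on a shortened, made-up sample; the `sample` string and the `extract_dates` name are placeholders, not part of the original file. It tests for a dd-mm-yyyy pattern instead of `length($i)==10`, on the assumption that other ten-character display names could otherwise slip through.

```shell
# Hypothetical sample: one line of XML, shaped like the file in post #1.
sample='<a><displayName>AllMember</displayName><displayName>04-01-2012</displayName><x>1</x><displayName>20-12-2011</displayName></a>'

# Split the line on <displayName> or </displayName>. Opening and closing tags
# alternate, so the text between a pair lands in the even-numbered fields.
extract_dates() {
  awk -F'</?displayName>' '{
    for (i = 2; i <= NF; i += 2)
      if ($i ~ /^[0-9][0-9]-[0-9][0-9]-[0-9][0-9][0-9][0-9]$/) print $i
  }'
}

printf '%s\n' "$sample" | extract_dates
```

On this sample it prints the two dates and leaves out AllMember.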

# 3  
Old 01-05-2012
Code:
awk '/displayName/ && $2~/^[0-9][0-9]-/{print $2}' FS="[><]" RS='><' xmlFile
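
Setting RS to `><` makes every element its own record, and FS then splits the tag name from its text. One caveat: POSIX leaves a multi-character RS unspecified, so this relies on an awk such as gawk or mawk that treats RS as a regular expression. A possible portable variant, sketched below on invented sample data, uses the single character `<` as RS and `>` as FS. The `dates_portable` name is a placeholder, and the sketch assumes no literal `<` or `>` occurs inside text values.

```shell
# Hypothetical sample, shortened from the structure in post #1.
sample='<m><displayName>AllMember</displayName><depth>1</depth><displayName>04-01-2012</displayName></m>'

# RS="<" starts a new record at every tag; FS=">" separates the tag name
# from the text after it, so "displayName>04-01-2012" gives
# $1 = "displayName" and $2 = "04-01-2012".
dates_portable() {
  awk -v RS='<' -v FS='>' '$1 == "displayName" && $2 ~ /^[0-9][0-9]-/ { print $2 }'
}

printf '%s\n' "$sample" | dates_portable
```

Comparing `$1` exactly, rather than matching `/displayName/` anywhere in the record, also keeps closing tags and similarly named elements out of the result.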

# 4  
Old 01-05-2012
Hello there,
How could I get the value between the formattedValue tags, using the logic below,
which is based on the ordinal tag?

Take this part of the XML I posted at the top:
Code:
<formattedValue>1177505</formattedValue> 
  <ordinal>4</ordinal> 
  <value xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:type="xs:long">1177505</value>

The ordinal value is 4, so it should look up the 4th displayName date
(skipping ordinal value 0).

The output would be:
Code:
09-12-2011  1177505

since each ordinal tag corresponds to a displayName tag, in sequence.

So the final output should be:

Code:
04-01-2012    444930     as ordinal 1 and its formattedValue
20-12-2011    1136654    as ordinal 2 and its formattedValue
12-12-2011    1081680    as ordinal 3 and its formattedValue
09-12-2011    1177505    as ordinal 4 and its formattedValue

# 5  
Old 01-06-2012
Try this:
Code:
awk '
/displayName/ && $2~/^[0-9][0-9]-/{dt[++cnt]=$2}
/^formattedValue>/{fv=$2; getline; print dt[$2],fv,$2 }
' FS="[><]" RS='><' xmlFile

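This assumes that the ordinal tag comes immediately after the formattedValue tag. It also prints the ordinal as a third column, and the row for ordinal 0 comes out with an empty date because dt[0] is never set. If the tag order is not always the same, you could try this slightly more general approach: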
Code:
awk '
/displayName/ && $2~/^[0-9][0-9]-/{dt[c1++]=$2}
/^formattedValue>/{fv[c2++]=$2}
/^ordinal>/{o[c3++]=$2}
END{
  for(i=0; i<c1; i++) 
     print dt[o[i]],fv[o[i]+1]
}' FS="[><]" RS='><' xmlFile
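
The END loop above lines the arrays up correctly as long as the cells appear in ordinal order. If that cannot be relied on, one option is to store each formattedValue under its own ordinal and pair the n-th date with ordinal n, which skips ordinal 0 automatically. The sketch below is only an illustration: the sample data (with cells deliberately out of order), the values 100, 30 and 70, and the `pair_by_ordinal` name are all invented, and it uses the single-character RS/FS split so that it does not depend on a multi-character RS.

```shell
# Hypothetical sample: two dates, three cells, ordinals 2 and 1 swapped.
sample='<r><displayName>AllMember</displayName><displayName>04-01-2012</displayName><displayName>20-12-2011</displayName><cells><cell><formattedValue>100</formattedValue><ordinal>0</ordinal></cell><cell><formattedValue>30</formattedValue><ordinal>2</ordinal></cell><cell><formattedValue>70</formattedValue><ordinal>1</ordinal></cell></cells></r>'

pair_by_ordinal() {
  awk -v RS='<' -v FS='>' '
    # Collect the dates in document order: dt[1] is the first date.
    $1 == "displayName" && $2 ~ /^[0-9][0-9]-/ { dt[++n] = $2 }
    # Inside a cell, remember both values in whatever order they appear.
    $1 == "formattedValue" { fv = $2 }
    $1 == "ordinal"        { ord = $2 }
    # At the closing cell tag, file the value under its ordinal.
    $1 == "/cell"          { val[ord] = fv }
    END {
      for (i = 1; i <= n; i++)   # starts at 1, so ordinal 0 is never printed
        print dt[i], val[i]
    }'
}

printf '%s\n' "$sample" | pair_by_ordinal
```

Keying on the closing cell tag means it no longer matters whether ordinal comes before or after formattedValue within a cell.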

# 6  
Old 01-06-2012
This is amazing, mi. This is exactly what I was looking for.
Thanks again. Can you please point out where exactly I'm going wrong when I add a little formatting to the awk below?

Code:
awk '
/displayName/ && $2~/^[0-9][0-9]-/{dt[c1++]=$2}
/^formattedValue>/{fv[c2++]=$2}
/^ordinal>/{o[c3++]=$2}
END{
  for(i=0; i<c1; i++){
    cnt=split(dt[o[i]],a,"-")
    for (j=cnt,j<=1;j--){ date=a[j] }
    print date,fv[o[i]+1]
  }
  date=""
}' FS="[><]" RS='><' file.txt

I want the output to be:
Code:
20111209 1177505


Last edited by manas_ranjan; 01-06-2012 at 05:57 AM..
# 7  
Old 01-06-2012
Well, you do have a bunch of syntax errors in this line:
Quote:
Code:
for (j=cnt,j<=1;j--); print a[j],

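The for header is broken: its three parts must be separated by semicolons, not a comma, and when counting down from cnt (which is 3 here) the condition j<=1 is false from the start, so it needs to be j>=1. The semicolon right after the closing parenthesis leaves the loop with an empty body, and the print ends in a dangling comma.
Since the date always has three parts, you don't need the loop at all: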
Code:
awk '
/displayName/ && $2~/^[0-9][0-9]-/{dt[c1++]=$2}
/^formattedValue>/{fv[c2++]=$2}
/^ordinal>/{o[c3++]=$2}
END{
  for(i=0; i<c1; i++) {
    split(dt[o[i]],a,"-"); 
    print a[3] a[2] a[1],fv[o[i]+1] 
  }
}' FS="[><]" RS='><' xmlFile
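
If you would rather keep a loop, as in post #6, a corrected form could look like the sketch below. It is a standalone illustration that reads dates from standard input, one per line; the `reverse_date` name is made up for the example.

```shell
# Reverse the dash-separated parts of each input line: 09-12-2011 -> 20111209.
reverse_date() {
  awk '{
    n = split($1, a, "-")
    out = ""                     # reset for every line
    for (j = n; j >= 1; j--)     # semicolons between the three parts, counting down to 1
      out = out a[j]             # append each part instead of overwriting
    print out
  }'
}

printf '%s\n' 09-12-2011 | reverse_date
```

The loop version copes with any number of dash-separated parts, while `a[3] a[2] a[1]` is shorter when the format is fixed.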
