The UNIX and Linux Forums  

#1  04-12-2007
Parse XML file

Hi,

I need to parse the XML data enclosed in the <a> </a> tags below using a shell script.

<X>
.....
</X>
<a>
<b>
<c>data1</c>
<c>data2</c>
</b>
<d>
<c>data3</c>
</d>
</a>

<XX>
...
</XX>

I then need to display the data in the following format:

b
data1
data2
-----
d
data3

Could anybody suggest a way to extract the data inside the <a> </a> tags?

TIA,
Viki

#2  04-12-2007  Shell_Life
Viki,
See if this would solve your problem:
sed -e 's!</.>!!' -e 's!<.>!!' xml_file

#3  04-12-2007
Code:
sed -n "/<a>/,/<\/a>/{/<\/*a>/d;s/^<\([^>]*\)>\([^<]*\)<\/\1>/\2/;s/^<\/.*$/--------------/;s/<\(.*\)>/\1/;p;}" file
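
The one-liner above is dense, so here is a sketch of the same sed commands spread over several lines with comments. It uses single quotes so the shell does not touch the `$` or the backslashes. The function name `extract_a_block` is only illustrative and is not part of the original post.

```shell
# Hypothetical wrapper; the only argument is the path of the XML file.
extract_a_block() {
  sed -n '
    /<a>/,/<\/a>/ {
      # Drop the <a> and </a> lines themselves.
      /<\/*a>/d
      # <c>data1</c> becomes data1 (opening and closing tag names must match).
      s/^<\([^>]*\)>\([^<]*\)<\/\1>/\2/
      # A line holding only a closing tag, such as </b>, becomes a separator.
      s/^<\/.*$/--------------/
      # A remaining opening tag, such as <b>, is reduced to its name.
      s/<\(.*\)>/\1/
      p
    }
  ' "$1"
}
```

Usage: `extract_a_block file`. Note that the patterns are anchored with `^`, so this sketch assumes every tag starts in the first column of its line.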

#4  04-12-2007
Hi anbu23,

Thanks for the quick reply. I am getting the following output with the suggested sed command:

> sed -n "/<a>/,/<\/a>/{/<\/*a>/d;s/^<\([^>]*\)>\([^<]*\)<\/\1>/\2/;s/^<\/.*$/--------------/;s/<\(.*\)>/\1/;p;}" c.xml
b
c>data1</c
c>data2</c
/b
d
c>data3</c
/d

where c.xml contains the following data:

> cat c.xml
<X>
.....
</X>
<a>
<b>
<c>data1</c>
<c>data2</c>
</b>
<d>
<c>data3</c>
</d>
</a>

<XX>
...
</XX>

What I need is to extract the names of the parent tags, i.e. "b" and "d", and then read the <c> tags inside each of them.
The data should then be stored in a text file in the following format:

b:data1 data2
d:data3

Could you please help me out?

TIA,
Viki

#5  04-13-2007
Code:
$ cat file
<X>
.....
</X>
<a>
<b>
<c>data1</c>
<c>data2</c>
</b>
<d>
<c>data3</c>
</d>
</a>

<XX>
...
</XX>
$ sed -n "/<a>/,/<\/a>/{/<\/*a>/d;s/^<\([^>]*\)>\([^<]*\)<\/\1>/\2/;s/^<\/.*$/--------------/;s/<\(.*\)>/\1/;p;}" file
b
data1
data2
--------------
d
data3
--------------
With the sample file, this gives the output you asked for.


Code:
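$ awk -F"[<>]" ' /<a>/,/<\/a>/ {
  if ( $0 !~ /<\/*a>/ ) {                             # skip the <a> and </a> lines
      if ( $0 == "</" tag ">" ) { print str }         # closing parent tag: print the collected line
      else if ( NF == 3 ) { str = $2 ":" ; tag=$2 }   # opening parent tag: start a new line
      else { str = str " " $3 }                       # <c>value</c>: append the value
  }
} ' file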
b: data1 data2
d: data3
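
The awk output above has a space after the colon, and the closing-tag comparison relies on tags starting in the first column. Below is a sketch of a variant that prints exactly `b:data1 data2` and trims surrounding whitespace first. The function name `group_by_parent` is hypothetical, and the tolerance for indentation and carriage returns is an assumption about real input rather than something the thread establishes.

```shell
# Hypothetical wrapper; the only argument is the path of the XML file.
group_by_parent() {
  awk -F'[<>]' '
    /<a>/,/<\/a>/ {
      # Trim indentation and trailing blanks or carriage returns; $0 is re-split afterwards.
      gsub(/^[ \t]+|[ \t\r]+$/, "")
      # Skip the <a> and </a> lines and any empty lines.
      if ($0 ~ /^<\/?a>$/ || $0 == "") next
      if ($0 == "</" tag ">") {
        # Closing parent tag: emit the collected values.
        print tag ":" str
      } else if (NF == 3) {
        # Lone opening tag such as <b>: start a new group.
        tag = $2; str = ""
      } else if (NF == 5) {
        # <c>value</c>: append the value, space-separated.
        str = (str == "" ? $3 : str " " $3)
      }
    }
  ' "$1"
}
```

Usage: `group_by_parent file > output.txt`. Like the original, this is line-oriented and expects one tag or one `<c>value</c>` pair per line; it is not a general XML parser.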

#6  04-13-2007
Hi anbu23,

It works.

Thanks,
Viki