print xml data without the tags.


 
# 1  
Old 07-15-2012

Hi All,

I'm trying to extract the data from an XML file without the tags. I've managed to do it, but I was wondering if there's a better way.
Sample data:
Code:
$ cat xmlfile
<code>
<to>tove</to>
<from>jani</from>
<heading>reminder</heading>
<body>dont forget me</body>
</code>


Code:
$ awk -F'>' '{print $2}' xmlfile | cut -d'<' -f1

tove
jani
reminder
dont forget me

# 2  
Old 07-15-2012
If you are using GNU awk, you can do it all in a single awk pass.
Code:
awk -F'[<>]' '{print $3}' xmlfile

Note that this assumes there is at most one value per line.
Code:
awk -F'[<>]' '{print $3,$7,$11}' xmlfile

works for up to 3 per line.
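The fixed field list can be generalized to any number of values per line with a loop. This is only a sketch: it assumes flat, non-nested elements, so that with `[<>]` as the separator the values sit in every fourth field starting at `$3`. The sample file and its contents are made up for illustration.

```shell
# Hypothetical sample input, written to a temporary file.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
<code>
<to>tove</to><from>jani</from>
<heading>reminder</heading>
</code>
EOF

# Values are in fields 3, 7, 11, ...; skip the empty ones
# produced by lines that hold only a single tag.
values=$(awk -F'[<>]' '{ for (i = 3; i <= NF; i += 4) if ($i != "") print $i }' "$tmp")
printf '%s\n' "$values"
rm -f "$tmp"
```

Nested elements on one line would shift the field positions, so this breaks down there.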
To avoid printing empty lines for lines that contain just one tag, such as:
Code:
<code>

you could test whether $3 is empty:
Code:
awk -F'[<>]' '$3{print $3}' xmlfile

This will also skip lines with an empty value, such as:
Code:
<tag></tag>
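One caveat with the bare `$3` test: awk treats a field that looks like the number 0 as false, so a line such as `<count>0</count>` is skipped too. A minimal sketch of the difference, comparing against the empty string instead; the sample data is made up for illustration:

```shell
# Hypothetical sample input with a zero value and an empty value.
tmp=$(mktemp)
printf '%s\n' '<code>' '<count>0</count>' '<tag></tag>' '<to>tove</to>' '</code>' > "$tmp"

# Truthiness test: drops empty values, but also the value 0.
truthy=$(awk -F'[<>]' '$3{print $3}' "$tmp")

# String comparison: drops only the truly empty values.
nonempty=$(awk -F'[<>]' '$3 != "" {print $3}' "$tmp")

printf 'truthy:\n%s\nnonempty:\n%s\n' "$truthy" "$nonempty"
rm -f "$tmp"
```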

For any more sophisticated XML parsing, you'll probably want to use perl or some other tool that has xml modules.

Last edited by mirni; 07-15-2012 at 10:55 PM..
# 3  
Old 07-15-2012
Thanks, mirni, for the reply. Just one question: could you explain the delimiter used here?
Code:
 [<>]

Does that represent an XML tag by default?
# 4  
Old 07-15-2012
awk doesn't know anything about xml.
The [<>] is a bracket expression (a character class): awk will split on either < or >. Writing [><] would do exactly the same.
With GNU awk you can use a regular expression for delimiter.
If it was [0-9], it would split on any digit.
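A quick way to see both points, as a small sketch with made-up input strings:

```shell
# Split on any digit: the fields of "a1b2c" are "a", "b" and "c".
on_digits=$(printf 'a1b2c\n' | awk -F'[0-9]' '{print $1, $2, $3}')

# The order inside the brackets does not matter: both forms split on < or >.
first=$(printf '<to>tove</to>\n' | awk -F'[<>]' '{print $3}')
second=$(printf '<to>tove</to>\n' | awk -F'[><]' '{print $3}')

printf '%s\n' "$on_digits" "$first" "$second"
```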
# 5  
Old 07-16-2012
Quote:
Originally Posted by mirni
With GNU awk you can use a regular expression for delimiter.
As far as I know, every major AWK implementation treats FS as a regular expression when it consists of more than one character (it's required by POSIX).
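For illustration, a minimal sketch of multi-character separators being used as regular expressions. It assumes a POSIX-conforming awk, and the input strings are made up:

```shell
# A two-character separator, "::", is used as a regular expression.
pair=$(printf 'key::value\n' | awk -F'::' '{print $2}')

# ", *" matches a comma followed by any number of spaces.
item=$(printf 'a,   b,c\n' | awk -F', *' '{print $2 "-" $3}')

printf '%s\n' "$pair" "$item"
```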

Regards,
Alister