Remove unwanted XML Tags


 
# 1  
Old 12-18-2007
Remove unwanted XML Tags

I have a set of sources and their respective resolutions. Please advise how to achieve these using Unix shell scripting.

Source 1:
=======
<ext:ContactInfo xmlns:ext="urn:AOL.FLOWS.Extensions">
<ext:InternetEmailAddress>AOL@AOL.COM</ext:InternetEmailAddress>
</ext:ContactInfo>

Resolution 1:
=========
<InternetEmailAddress>AOL@AOL.COM</InternetEmailAddress>

Note 1
=====
The ContactInfo tags should not appear in the target, and the ext: prefix should be removed throughout the source.

Similarly

Source 2:
=======

<e
xt:InternetEmailAddress>

(Or)
<ext:Conta
ctInfo xmlns:ext="urn:AOL.FLOWS.Extensions">

Resolution 2
=========
<InternetEmailAddress> - the ext: prefix should be removed even when the tag is split across multiple lines; the tag can be split across any number of lines.

Source 3
=======

<IFeed>
<Organization>
<Participant>...
<ParticipantName>..</ParticipantName>
<ParticipantId>...</ParticipantId>
</ Participant>
</ Organization>
</IndicFeed>
<IFeed>
<Organization>
<Participant>...
<ParticipantName>..</ParticipantName>
<ParticipantId>...</ParticipantId>
</ Participant>
</ Organization>
</IFeed>

Resolution 3
=========
<IFeed>
<Organization>
<Participant>...
<ParticipantName>..</ParticipantName>
<ParticipantId>...</ParticipantId>
</ Participant>
<Participant>...
<ParticipantName>..</ParticipantName>
<ParticipantId>...</ParticipantId>
</ Participant>
</ Organization>
</IFeed>
# 2  
Old 12-22-2007
Here are some "quick fixes" for two of your problems:

Prob1:
++++++
$ cat source1.xml
<ext:ContactInfo xmlns:ext="urn:AOL.FLOWS.Extensions">
<ext:InternetEmailAddress>AOL@AOL.COM</ext:InternetEmailAddress>
</ext:ContactInfo>

$ sed '/ext:ContactInfo/d' source1.xml
<ext:InternetEmailAddress>AOL@AOL.COM</ext:InternetEmailAddress>
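
The command above drops the ContactInfo lines but leaves the ext: prefix in place. Here is a minimal sketch that also strips the prefix. It assumes each tag sits on a single line and that ext: only needs removing where it directly follows < or </. The function name strip_ext is made up for this example.

```shell
#!/bin/sh

# strip_ext (hypothetical helper): reads the named files, or stdin if none.
#   1st expression: delete the ContactInfo open/close lines.
#   2nd expression: remove "ext:" right after "<" or "</" in the other tags.
strip_ext() {
    sed -e '/ext:ContactInfo/d' -e 's|<\(/\{0,1\}\)ext:|<\1|g' "$@"
}

# Sample input taken from Source 1 in the question.
strip_ext <<'EOF'
<ext:ContactInfo xmlns:ext="urn:AOL.FLOWS.Extensions">
<ext:InternetEmailAddress>AOL@AOL.COM</ext:InternetEmailAddress>
</ext:ContactInfo>
EOF
# prints: <InternetEmailAddress>AOL@AOL.COM</InternetEmailAddress>
```

Anchoring the substitution to < keeps it from touching an "ext:" that happens to appear in the element text.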

Prob3:
+++++++
$ cat source3.xml
<IFeed>
<Organization>
<Participant>.
<ParticipantName>..</ParticipantName>
<ParticipantId>.</ParticipantId>
</ Participant>
</ Organization>
</IndicFeed>
<IFeed>
<Organization>
<Participant>.
<ParticipantName>..</ParticipantName>
<ParticipantId>.</ParticipantId>
</ Participant>
</ Organization>
</IFeed>


The idea is to remove all "Organization" tags except the first and the last one, and likewise for "IFeed". Here we go:

----
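#!/bin/sh

# Remove every line holding the given tag (opening or closing form),
# except the first and the last such line. Edits source3.xml in place.
f_remove() {
    count=0
    RMVPARAM=$1
    # Line numbers of all matching lines, minus the first and the last one.
    grep -n "$RMVPARAM>" source3.xml | awk -F ":" '{print $1}' | sed -e '1d' -e '$d' | while read -r lnum
    do
        # Each earlier deletion shifts the remaining lines up by one.
        num=$((lnum - count))
        echo "$num"
        sed "${num}d" source3.xml > source3.xml.tmp
        mv source3.xml.tmp source3.xml
        count=$((count + 1))
    done
}

f_remove Organization
f_remove IFeed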
---

Now

$ cat source3.xml
<IFeed>
<Organization>
<Participant>.
<ParticipantName>..</ParticipantName>
<ParticipantId>.</ParticipantId>
</ Participant>
</IndicFeed>
<Participant>.
<ParticipantName>..</ParticipantName>
<ParticipantId>.</ParticipantId>
</ Participant>
</ Organization>
</IFeed>
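
As an alternative sketch, the same "keep only the first and last occurrence" idea can be done without rewriting the file once per deleted line. It reads the file twice in a single awk call: the first pass records the first and last line number for each tag, and the second pass prints everything else. The function name keep_outer and its comma-separated tag argument are invented for this example. Like the script above, it assumes each tag sits on its own line.

```shell
#!/bin/sh

# keep_outer FILE TAGS (hypothetical helper)
#   FILE: the XML file to filter (it is read twice, so it cannot be a pipe)
#   TAGS: comma-separated tag names, e.g. "Organization,IFeed"
# Prints FILE to stdout, dropping every line that contains "TAG>" except
# the first and the last such line for each tag.
keep_outer() {
    awk -v tags="$2" '
    BEGIN { n = split(tags, t, ",") }
    # Pass 1: remember the first and last line number for each tag.
    NR == FNR {
        for (i = 1; i <= n; i++)
            if (index($0, t[i] ">")) {
                if (!(i in first)) first[i] = FNR
                last[i] = FNR
            }
        next
    }
    # Pass 2: skip the inner occurrences, print everything else.
    {
        for (i = 1; i <= n; i++)
            if (index($0, t[i] ">") && FNR != first[i] && FNR != last[i]) next
        print
    }' "$1" "$1"
}
```

It would be called as: keep_outer source3.xml "Organization,IFeed" > source3.new. The result goes to stdout, so the original file is left untouched until you choose to replace it.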

The leftover </IndicFeed> line can be removed directly if it is not required :-)
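
The second case in the question (a tag split across several lines) is not covered by the fixes above. One possible sketch is to glue physical lines back together until every "<" seen so far has its matching ">", and only then strip the prefix. This assumes the element text never contains a literal < or >, so no comments or CDATA sections, and that the break falls inside a name, so lines are joined without adding a space. The function name join_tags is made up for this example.

```shell
#!/bin/sh

# join_tags (hypothetical helper): reads the named files, or stdin if none.
# Accumulates lines until the number of "<" equals the number of ">",
# then prints the accumulated text as a single line.
join_tags() {
    awk '
    {
        buf = buf $0
        tmp = buf
        opens  = gsub(/</, "", tmp)   # gsub returns the number of matches
        closes = gsub(/>/, "", tmp)
        if (opens > closes) next      # a tag is still open: keep reading
        print buf
        buf = ""
    }
    END { if (buf != "") print buf }' "$@"
}

# Source 2 from the question: the prefix itself is broken across two lines.
printf '<e\nxt:InternetEmailAddress>\n' | join_tags | sed 's|<\(/\{0,1\}\)ext:|<\1|g'
# prints: <InternetEmailAddress>
```

Once the lines are joined, the ContactInfo lines can be deleted with the same sed expression used for Prob1.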

//Jadu