The UNIX and Linux Forums  
  #1 (permalink)  
12-18-2007
ambals123 (Registered User)
Join Date: Dec 2007
Posts: 1
Remove unwanted XML Tags

I have a set of sources and their respective resolutions. Please advise how to get from each source to its resolution using Unix shell scripting.

Source 1:
=======
<ext:ContactInfo xmlns:ext="urn:AOL.FLOWS.Extensions">
<ext:InternetEmailAddress>AOL@AOL.COM</ext:InternetEmailAddress>
</ext:ContactInfo>

Resolution 1:
=========
<InternetEmailAddress>AOL@AOL.COM</InternetEmailAddress>

Note 1
=====
Tags containing ContactInfo should not appear in the target, and the ext: prefix should be removed throughout the source.

Similarly

Source 2:
=======

<e
xt:InternetEmailAddress>

(Or)
<ext:Conta
ctInfo xmlns:ext="urn:AOL.FLOWS.Extensions">

Resolution 2
=========
<InternetEmailAddress> - the ext: prefix should be removed even when the tag is split across multiple lines; a tag can be split across any number of lines.

Source 3
=======

<IFeed>
<Organization>
<Participant>…
<ParticipantName>..</ParticipantName>
<ParticipantId>…</ParticipantId>
</ Participant>
</ Organization>
</IndicFeed>
<IFeed>
<Organization>
<Participant>…
<ParticipantName>..</ParticipantName>
<ParticipantId>…</ParticipantId>
</ Participant>
</ Organization>
</IFeed>

Resolution 3
=========
<IFeed>
<Organization>
<Participant>…
<ParticipantName>..</ParticipantName>
<ParticipantId>…</ParticipantId>
</ Participant>
<Participant>…
<ParticipantName>..</ParticipantName>
<ParticipantId>…</ParticipantId>
</ Participant>
</ Organization>
</IFeed>
  #2 (permalink)  
12-22-2007
jaduks (Registered User)
Join Date: Aug 2007
Location: Assam, India
Posts: 166
Here are some "quick fixes" for two of your problems:

Prob1:
++++++
$ cat source1.xml
<ext:ContactInfo xmlns:ext="urn:AOL.FLOWS.Extensions">
<ext:InternetEmailAddress>AOL@AOL.COM</ext:InternetEmailAddress>
</ext:ContactInfo>

$ sed '/ext:ContactInfo/d' source1.xml
<ext:InternetEmailAddress>AOL@AOL.COM</ext:InternetEmailAddress>
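
The command above drops the ContactInfo lines but leaves the ext: prefix in place, and Source 2 (tags split across lines) is not covered. Below is a minimal sketch for both, assuming the line breaks fall in the middle of a tag name as in the examples, and that the text between tags contains no stray < or > characters. The function names join_split_tags and strip_ext are made up for illustration:

```shell
#!/bin/sh

# join_split_tags (hypothetical helper): reads stdin and glues lines together
# while a "<" is still waiting for its closing ">".
# Assumption: "<" and ">" only ever appear as tag delimiters.
join_split_tags() {
    awk '
    {
        buf = buf $0
        tmp = buf
        opens  = gsub(/</, "", tmp)   # gsub returns the number of matches
        closes = gsub(/>/, "", tmp)
        if (opens > closes) next      # tag still open: keep joining
        print buf
        buf = ""
    }
    END { if (buf != "") print buf }  # flush an unterminated tag as-is
    '
}

# strip_ext (hypothetical helper): deletes every line mentioning
# ext:ContactInfo, then removes the ext: prefix directly after "<" or "</".
strip_ext() {
    sed -e '/ext:ContactInfo/d' \
        -e 's|<\(/\{0,1\}\)ext:|<\1|g'
}
```

Usage would be something like `join_split_tags < source2.xml | strip_ext`. The prefix is only stripped right after `<` or `</`, so ordinary text that happens to contain "ext:" is left alone. This is line-oriented text munging, not real XML parsing, so treat it as a quick fix in the same spirit as the sed one-liner.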

Prob3:
+++++++
$ cat source3.xml
<IFeed>
<Organization>
<Participant>.
<ParticipantName>..</ParticipantName>
<ParticipantId>.</ParticipantId>
</ Participant>
</ Organization>
</IndicFeed>
<IFeed>
<Organization>
<Participant>.
<ParticipantName>..</ParticipantName>
<ParticipantId>.</ParticipantId>
</ Participant>
</ Organization>
</IFeed>


The idea is to remove all "Organization" tag lines except the first and the last one, and then do the same for "IFeed". Here we go:

----
#!/bin/sh

# Remove every line containing the given tag (opening or closing)
# from source3.xml, except the first and the last such line.
f_remove() {
    count=0
    RMVPARAM=$1
    grep -n "$RMVPARAM>" source3.xml | awk -F ":" '{print $1}' | sed -e '1d' -e '$d' | while read -r lnum
    do
        # Each earlier deletion shifts the remaining lines up by one,
        # so adjust the original line number by the deletions so far.
        num=$((lnum - count))
        sed "${num}d" source3.xml > source3.xml.tmp
        mv source3.xml.tmp source3.xml
        count=$((count + 1))
    done
}

f_remove Organization
f_remove IFeed
---

Now

$ cat source3.xml
<IFeed>
<Organization>
<Participant>.
<ParticipantName>..</ParticipantName>
<ParticipantId>.</ParticipantId>
</ Participant>
</IndicFeed>
<Participant>.
<ParticipantName>..</ParticipantName>
<ParticipantId>.</ParticipantId>
</ Participant>
</ Organization>
</IFeed>

The leftover </IndicFeed> line can be deleted directly if it is not required :-)
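
The script above rewrites source3.xml once per deleted line. As an alternative, here is a rough sketch that does the same job by reading the file twice with awk and writing the result to stdout. The function name keep_outer is hypothetical, and it assumes the tag name is plain text (it is matched as a fixed substring followed by ">") and that the file is not empty:

```shell
#!/bin/sh

# keep_outer TAG FILE (hypothetical helper): print FILE, dropping every
# line that contains "TAG>" except the first and the last such line.
keep_outer() {
    awk -v pat="$1>" '
        # First pass over the file: remember the first and last match.
        NR == FNR {
            if (index($0, pat)) {
                if (!first) first = FNR
                last = FNR
            }
            next
        }
        # Second pass: skip matching lines strictly between those two.
        index($0, pat) && FNR != first && FNR != last { next }
        { print }
    ' "$2" "$2"
}
```

It could be chained through a temporary file, for example `keep_outer Organization source3.xml > step1.xml` followed by `keep_outer IFeed step1.xml`. Like the original script, it matches both the opening and the closing form of the tag, and it leaves the </IndicFeed> line untouched.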

//Jadu