How can I remove some xml tag lines using shell script?


 
# 1  
Old 05-10-2013

Hi All,

My name is Prathyu and I am working as an ETL developer. I have a requirement to create an XML file based on a provided XSD file. Per the DataStage standards, a key (repeatable) field must not contain any null values, so I am inserting a dummy tag line into the XML file.

Now I want to remove that dummy tag line from the file with a Unix script. Can anyone please help me write the script?

Below is how the existing XML file looks and how I want to change it.

Existing file:
Code:
<PerformanceScheduleFeed xmlns="http://AMC.PerformanceSchedule.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://AMC.PerformanceSchedule.1.0">
<Originator>http://AMC.PerformanceSchedule.1.0</Originator>
<SystemName>Interface 212</SystemName>
<FeedIdentifier>C47FE513-88AC-A363-E040-8742600AAC67</FeedIdentifier>
<DateTimeCreated>2012-07-10T15:51:23</DateTimeCreated>
<PerformanceSchedulesFromDateTime>2013-04-05T00:00:00</PerformanceSchedulesFromDateTime>
<PerformanceSchedulesToDateTime>2013-04-11T00:00:00</PerformanceSchedulesToDateTime>
<InternalReleaseList>
<InternalRelease>
<TitleName>Croods, The</TitleName>
<TitleID>16341</TitleID>
<InternalReleaseID>38351</InternalReleaseID>
<RentrakID>56324</RentrakID>
<ReleaseIndicatorList>
<ReleaseIndicator>Prathyu</ReleaseIndicator>
<ReleaseIndicator>AMC Select</ReleaseIndicator>
<ReleaseIndicator>Animated</ReleaseIndicator>
</ReleaseIndicatorList>
<MediaFormatName>DIGITAL</MediaFormatName>
<MPAARatingCode>PG</MPAARatingCode>
<PerformanceScheduleList>
<PerformanceSchedule>
<TheatreNumber>6</TheatreNumber>
<AuditoriumID>1</AuditoriumID>
<ShowDateTime>2013-04-05T12:40:00.000</ShowDateTime>
<RadiantPerformanceID>62422</RadiantPerformanceID>
</PerformanceSchedule>
</PerformanceScheduleList>
</InternalRelease>
</InternalReleaseList>
</PerformanceScheduleFeed> 


I want to remove the tag line that was highlighted in red (the one that contains "Prathyu" as the value for ReleaseIndicator).

New file:
Code:
<PerformanceScheduleFeed xmlns="http://AMC.PerformanceSchedule.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://AMC.PerformanceSchedule.1.0">
<Originator>http://AMC.PerformanceSchedule.1.0</Originator>
<SystemName>Interface 212</SystemName>
<FeedIdentifier>C47FE513-88AC-A363-E040-8742600AAC67</FeedIdentifier>
<DateTimeCreated>2012-07-10T15:51:23</DateTimeCreated>
<PerformanceSchedulesFromDateTime>2013-04-05T00:00:00</PerformanceSchedulesFromDateTime>
<PerformanceSchedulesToDateTime>2013-04-11T00:00:00</PerformanceSchedulesToDateTime>
<InternalReleaseList>
<InternalRelease>
<TitleName>Croods, The</TitleName>
<TitleID>16341</TitleID>
<InternalReleaseID>38351</InternalReleaseID>
<RentrakID>56324</RentrakID>
<ReleaseIndicatorList>
<ReleaseIndicator>AMC Select</ReleaseIndicator>
<ReleaseIndicator>Animated</ReleaseIndicator>
</ReleaseIndicatorList>
<MediaFormatName>DIGITAL</MediaFormatName>
<MPAARatingCode>PG</MPAARatingCode>
<PerformanceScheduleList>
<PerformanceSchedule>
<TheatreNumber>6</TheatreNumber>
<AuditoriumID>1</AuditoriumID>
<ShowDateTime>2013-04-05T12:40:00.000</ShowDateTime>
<RadiantPerformanceID>62422</RadiantPerformanceID>
</PerformanceSchedule>
</PerformanceScheduleList>
</InternalRelease>
</InternalReleaseList>
</PerformanceScheduleFeed>


Thanks in advance,
Prathyu

Last edited by Scott; 05-10-2013 at 01:10 PM.. Reason: Please use code tags and LESS formatting
# 2  
Old 05-10-2013
If the XML is really what you posted here, then
Code:
grep -iv "prathyu" input.xml > output.xml

If it doesn't resemble the XML you posted due to lack of linebreaks or whatever, this may not work.
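
To illustrate the caveat about line breaks, here is a minimal sketch using made-up sample data (the temporary files and variable names are assumptions for the demo, not the poster's real file). It shows that `grep -v` drops whole lines, so if the entire document sits on one line, the whole document is dropped.

```shell
# Hypothetical sample data: the same elements, once with line breaks and once without.
tmpdir=$(mktemp -d)
printf '%s\n' '<ReleaseIndicatorList>' \
    '<ReleaseIndicator>Prathyu</ReleaseIndicator>' \
    '<ReleaseIndicator>Animated</ReleaseIndicator>' \
    '</ReleaseIndicatorList>' > "$tmpdir/multiline.xml"
printf '%s\n' '<ReleaseIndicatorList><ReleaseIndicator>Prathyu</ReleaseIndicator><ReleaseIndicator>Animated</ReleaseIndicator></ReleaseIndicatorList>' > "$tmpdir/oneline.xml"

# grep -c -v counts the lines that do NOT match, i.e. the lines grep -v would keep.
# grep exits non-zero when it selects no lines, hence the "|| true".
multiline_kept=$(grep -c -v "Prathyu" "$tmpdir/multiline.xml" || true)
oneline_kept=$(grep -c -v "Prathyu" "$tmpdir/oneline.xml" || true)

echo "lines kept (multi-line input): $multiline_kept"
echo "lines kept (single-line input): $oneline_kept"
rm -rf "$tmpdir"
```

With the multi-line input three of the four lines survive; with the single-line input none do, which is what an empty output file would look like.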
# 3  
Old 05-10-2013
Thanks for the quick reply.
I tried your command:
grep -iv "Prathyu" AMC_Exhibitions_Test_2005.xml > /opt/IBMProjects/EDW/Outputs/DCIP/AMC_Exhibitions_Test_2006.xml

But it did not write any records to the output XML file. The output file is just empty.

My requirement is to remove the
Code:
<ReleaseIndicator>Prathyu</ReleaseIndicator>
 
line from my XML file and generate a new XML file without it.


Last edited by Scott; 05-10-2013 at 06:45 PM.. Reason: Code tags and LESS FORMATTING
# 4  
Old 05-10-2013
I suspect the "line" you want to remove is not actually a line. Please post your actual XML data, raw, not prettied up. (Obscure anything confidential, of course.)
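
One way to check that suspicion without posting the data is to count the line terminators in the file. The sketch below is only an illustration: `count_line_endings` is a hypothetical helper name, and the sample file is invented so the snippet runs on its own.

```shell
# count_line_endings is a hypothetical helper, not something from this thread.
count_line_endings() {
    # tr -cd deletes every byte except the one listed, so wc -c counts that byte.
    lf=$(tr -cd '\n' < "$1" | wc -c | tr -d ' ')
    cr=$(tr -cd '\r' < "$1" | wc -c | tr -d ' ')
    echo "LF=$lf CR=$cr"
}

# Hypothetical sample: two elements separated by a bare carriage return, no newline.
sample=$(mktemp)
printf '<a>1</a>\r<b>2</b>' > "$sample"
result=$(count_line_endings "$sample")
echo "$result"
rm -f "$sample"
```

Run against the real file (for example `count_line_endings AMC_Exhibitions_Test_2005.xml`), an LF count of 0 or 1 would mean grep sees the whole document as a single line, which would be consistent with the empty output reported above.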
# 5  
Old 05-10-2013
This XML is generated through DataStage, and below is how the XML structure looks.

Code:
<?xml version="1.0" encoding="UTF-8"?>
<PerformanceScheduleFeed xmlns="http://AMC.PerformanceSchedule.1.0" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xsi:schemaLocation="http://AMC.PerformanceSchedule.1.0">
<Originator>http://AMC.PerformanceSchedule.1.0</Originator>
<SystemName>Interface 212</SystemName>
<FeedIdentifier>C47FE513-88AC-A363-E040-8742600AAC67</FeedIdentifier>
<DateTimeCreated>2012-07-10T15:51:23</DateTimeCreated>
<PerformanceSchedulesFromDateTime>2013-04-05T00:00:00</PerformanceSchedulesFromDateTime>
<PerformanceSchedulesToDateTime>2013-04-11T00:00:00</PerformanceSchedulesToDateTime>
<InternalReleaseList>
<InternalRelease>
<TitleName>Croods, The</TitleName>
<TitleID>16341</TitleID>
<InternalReleaseID>38351</InternalReleaseID>
<RentrakID>56324</RentrakID>
<ReleaseIndicatorList>
<ReleaseIndicator>Prathyu</ReleaseIndicator>
<ReleaseIndicator>AMC Select</ReleaseIndicator>
<ReleaseIndicator>Animated</ReleaseIndicator>
</ReleaseIndicatorList>
<MediaFormatName>DIGITAL</MediaFormatName>
<MPAARatingCode>PG</MPAARatingCode>
<PerformanceScheduleList>
<PerformanceSchedule>
<TheatreNumber>6</TheatreNumber>
<AuditoriumID>1</AuditoriumID>
<ShowDateTime>2013-04-05T12:40:00.000</ShowDateTime>
<RadiantPerformanceID>62422</RadiantPerformanceID>
</PerformanceSchedule>
</PerformanceScheduleList>
</InternalRelease>

Here, under /PerformanceScheduleFeed/InternalReleaseList/InternalRelease/ReleaseIndicatorList/ReleaseIndicator

a new hierarchy opens with the name ReleaseIndicator. It is a repeatable field, so it can have multiple values for a single record. While generating the XML, I default one indicator value to "Prathyu" if the record is null, but I do not want that indicator value to appear in the output file. So I want to remove the entire element, from the starting tag <ReleaseIndicator> to the ending tag </ReleaseIndicator>, when it contains "Prathyu" as its value.

I am also attaching the complete XML file that I generated.

Last edited by Scott; 05-10-2013 at 06:48 PM.. Reason: Code tags; removed formatting
# 6  
Old 05-10-2013
I see no reason my code wouldn't have worked, and especially no reason that, if it didn't work, it would have generated an empty output file.

So run this carefully, following exactly:
Code:
grep -v "Prathyu" < inputfile > outputfile

...and if it doesn't work, tell me
1) exactly what you did, word for word, letter for letter, keystroke for keystroke, and
2) any resulting error messages, or the lack of them.
# 7  
Old 05-10-2013
I used the same code you gave me. Please see the commands below:

Code:
[smikkineni@svd0dsgd02 ~]$ cd /opt/IBMProjects/EDW/Outputs/DCIP
[smikkineni@svd0dsgd02 DCIP]$ grep -v "Prathyu" < AMC_Exhibitions_Test_2005.xml > AMC_Exhibitions_Test_2006.xml
[smikkineni@svd0dsgd02 DCIP]$

And it created an output file with 0 records.

If I use the command

Code:
grep -i "Prathyu" AMC_Exhibitions_Test_2005.xml

Unix shows the entire file instead of only the one line that contains "Prathyu".

Last edited by Scott; 05-10-2013 at 06:46 PM.. Reason: Code tags
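
The two symptoms in the last post (empty output from `grep -v`, the whole file from `grep -i`) are what one would expect if the file has no newline between elements. Assuming that is the cause, a substitution that deletes just the dummy element does not depend on line breaks at all. This is a sketch on invented sample data, not a confirmed fix for the real feed; the temporary file names are placeholders.

```shell
# Hypothetical stand-in for the real feed: all elements on a single line.
input=$(mktemp)
output=$(mktemp)
printf '%s\n' '<ReleaseIndicatorList><ReleaseIndicator>Prathyu</ReleaseIndicator><ReleaseIndicator>Animated</ReleaseIndicator></ReleaseIndicatorList>' > "$input"

# Delete only the dummy element, wherever it occurs on a line.
# '|' is used as the delimiter so the '/' in the closing tag needs no escaping.
sed 's|<ReleaseIndicator>Prathyu</ReleaseIndicator>||g' "$input" > "$output"

cleaned=$(cat "$output")
echo "$cleaned"
rm -f "$input" "$output"
```

On a file that does have one element per line, the same command leaves an empty line where the element was, so the `grep -v` approach is tidier there. For anything beyond a fixed dummy value like this one, an XML-aware tool would be a safer choice than text substitution.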