Delete a record in an XML file using shell scripting
Find a pattern, then delete the line with the pattern plus the 3 lines above and the 8 lines below it. The pattern is "isup". The entire record containing the pattern, from the starting tag <record> to the ending tag </record>, is to be deleted and the rest retained.
Last edited by pludi; 06-16-2012 at 06:07 PM..
Reason: code tags
First of all, what have you tried, and where are you stuck?
Second, that's XML, so it's probably not guaranteed to have the tag containing isup at the same line position every time, so a simple "remove 3 lines before, and 8 lines after" might not yield the desired result every time.
The script is working fine, but when the first record itself contains isup, the initial tags are also deleted. How can I overcome this? The script should also work for a batch of XML files in a folder.
Kindly help. Thanks in advance
---------- Post updated at 11:30 AM ---------- Previous update was at 11:02 AM ----------
The script is running fine, but when the isup pattern appears in the first record itself, the initial tags also get deleted. The script also has to run for a batch of XML files. Kindly help.
The initial tags are like below
Actually, sed is the perfect tool for this sort of thing, but its workings are a little tricky to understand. Let's start with that first and write the program later:
sed works line-oriented: it reads the first line of the input and then applies one command after the other to it, until it reaches the end of its script. Then the next line of input is read and this process starts over, until the last line of input, upon which it stops.
You can see from this explanation that "delete the x lines before" is difficult to do, because by the time sed can decide whether a line is to be deleted, that line isn't in its scope any more. The solution is to somehow make the part we want to examine "one line".
When I said sed "reads a line" I was not completely correct: there is a data structure called the "pattern space", which actually holds this line. Every change sed makes is made to this pattern space. If sed is called without the "-n" command line option, the resulting content of the pattern space is printed automatically at the end of the script. If the pattern space is deleted with the "d" command, the rest of the script is skipped and the process starts over with the next line of input. There is a special command ("N") to add the next line of input to the contents of the pattern space, separated by a linefeed character ("\n").
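To make the "N" command a bit more tangible, here is a small sketch. The input letters and the " + " separator are made up for the demonstration; the only point is that after "N" the pattern space holds two input lines with a "\n" between them, which the "s" command can then see and replace.

```shell
# Feed four lines to sed. For each cycle, N appends the next input line
# to the pattern space (separated by "\n"), then s replaces that newline.
joined=$(printf 'a\nb\nc\nd\n' | sed 'N; s/\n/ + /')

# The pattern space is printed automatically at the end of each cycle
# because -n was not given.
printf '%s\n' "$joined"
# prints:
# a + b
# c + d
```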
With this device it is simple to construct your filter: upon finding a line with "<record>" we keep adding the following lines to the pattern space until we encounter a line with "</record>". Because we only ever add to the pattern space, the condition "pattern space contains <record>" stays true from then on. (Here the line-oriented nature of sed is showing.) This gives us the whole XML "record" in our pattern space.
When we find the "</record>" in the pattern space we know we have read the whole record. Now we search the record for the search term: if we find it, we do NOT print this record, otherwise it gets printed (the "!" is a logical NOT). Afterwards the pattern space is deleted and the cycle starts over again. Because we haven't switched off the default "print" action with the "-n" command line switch, all the pattern spaces (~lines) containing neither a "<record>" nor a "</record>" (that is: all the lines outside of <record>..</record> structures) are printed automatically.
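A filter along the lines described above could look roughly like this. It is a sketch reconstructed from the description, not necessarily the exact script that was posted; the function name filter_records and the sample input are made up for illustration, and it assumes "<record>" and "</record>" are spelled exactly like that in the file.

```shell
# filter_records (hypothetical name) reads XML on stdin and writes it to
# stdout without the <record>..</record> blocks that contain "isup".
filter_records() {
    sed '/<record>/{
:collect
/<\/record>/!{
N
b collect
}
/isup/!p
d
}'
}

# Made-up sample: the first record contains "isup" and should vanish,
# while the header lines before it are kept.
printf '%s\n' \
    '<?xml version="1.0"?>' \
    '<records>' \
    '<record>' \
    '<type>isup</type>' \
    '</record>' \
    '<record>' \
    '<type>sip</type>' \
    '</record>' \
    '</records>' | filter_records
```

The ":collect" loop is the "add lines until </record>" part, "/isup/!p" prints the collected record only if it does not contain the search term, and "d" empties the pattern space so the record is not printed a second time by the automatic print.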
Regarding the "batch of input files": construct a loop over a list of filenames, or use "find" if there is some file mask you can apply:
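A loop over a list of filenames might be sketched like this. The helper name delete_isup_records and the temporary file naming are assumptions for the example; the output goes to a temporary file first because sed's in-place option differs between implementations.

```shell
# delete_isup_records (hypothetical name) rewrites every file given as
# an argument, dropping the records that contain "isup".
delete_isup_records() {
    for file in "$@"; do
        # Skip anything that is not a regular file, e.g. an unmatched glob.
        [ -f "$file" ] || continue
        tmp="$file.tmp.$$"
        if sed '/<record>/{
:collect
/<\/record>/!{
N
b collect
}
/isup/!p
d
}' "$file" > "$tmp"; then
            # Copy the result back so the original file keeps its permissions.
            cat "$tmp" > "$file"
        fi
        rm -f "$tmp"
    done
}
```

It could then be called as, for instance, delete_isup_records /path/to/folder/*.xml (the path is a placeholder).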
If you can construct a file mask (like "*.file", etc.) to find the files you can use "find" to do the work. Save the sed-script in a file "script.sed" and:
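One way this could look, as a sketch only: the script content below is reconstructed from the explanation further up, and the helper name filter_tree, its two parameters and the ".new" suffix are assumptions for the example.

```shell
# Save the filter as script.sed in the current directory.
cat > script.sed <<'EOF'
/<record>/{
:collect
/<\/record>/!{
N
b collect
}
/isup/!p
d
}
EOF

# filter_tree (hypothetical name): $1 = start directory, $2 = file mask.
# find hands the matching files to a small sh loop, which runs the sed
# script on each one and replaces the file only if sed succeeded.
filter_tree() {
    find "$1" -type f -name "$2" -exec sh -c '
        script=$1
        shift
        for file in "$@"; do
            sed -f "$script" "$file" > "$file.new" && mv "$file.new" "$file"
        done
    ' sh "$PWD/script.sed" {} +
}
```

A call such as filter_tree /path/to/folder '*.xml' would then process every matching file below that directory; quote the mask so the shell passes it to find unexpanded.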