| UNIX for Dummies Questions & Answers If you're not sure where to post a UNIX or Linux question, post it here. All UNIX and Linux newbies welcome !! |
#1
Delete a record in a xml file using shell scripting
I need to find a pattern and delete the line containing it, together with the 3 lines above and the 8 lines below it. The pattern is "isup". The entire record containing the pattern, from the starting tag <record> to the ending tag </record>, is to be deleted and the rest retained. Code:
<record>
<signallingstandard>ITU-N</signallingstandard>
<linkid>16068</linkid>
<si>sccp</si>
<mtp>
<opc>3020</opc>
<dpc>8034</dpc>
</mtp>
<sccp>
</sccp>
<map>
<opcode>36</opcode>
</map>
<msucount>1</msucount>
<octcount>83</octcount>
</record>
<record>
<signallingstandard>ITU-N</signallingstandard>
<linkid>37</linkid>
<si>isup</si>
<mtp>
<opc>8469</opc>
<dpc>10336</dpc>
</mtp>
<msucount>168</msucount>
<octcount>3069</octcount>
</record>
<record>
<signallingstandard>ITU-N</signallingstandard>
<linkid>46</linkid>
<si>sccp</si>
<mtp>
<opc>287</opc>
<dpc>24</dpc>
</mtp>
<sccp>
<cgpadigits>966540142007</cgpadigits>
<cdpadigits>919434099997</cdpadigits>
</sccp>
<msucount>1</msucount>
<octcount>53</octcount>
</record>
Last edited by pludi; 06-16-2012 at 05:07 PM. Reason: code tags
#2
First of all, what have you tried, and where are you stuck?
Second, that's XML, so it's probably not guaranteed to have the tag containing isup at the same line position every time, so a simple "remove 3 lines before, and 8 lines after" might not yield the desired result every time.
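To illustrate the second point, here is a minimal sketch that reports, for each record, how many lines sit before and after the line containing the pattern. The function name record_offsets and the sample data are made up for the illustration, and it assumes every tag sits on its own line.

```shell
#!/bin/sh
# Hypothetical helper: for each <record>...</record> block that contains the
# pattern, report how many lines precede and follow the matching line.
record_offsets() {
    # $1 = pattern (fixed string), $2 = input file
    awk -v pat="$1" '
        /<record>/             { n = 0; hit = 0 }   # restart counting at the opening tag
                               { n++ }              # n = line number inside the record
        index($0, pat) && !hit { hit = n }          # remember the first matching line
        /<\/record>/           { if (hit) printf "before=%d after=%d\n", hit - 1, n - hit }
    ' "$2"
}

# Made-up sample: two records of different shape, both containing "isup".
sample=$(mktemp)
cat > "$sample" <<'EOF'
<record>
<linkid>37</linkid>
<si>isup</si>
<msucount>168</msucount>
</record>
<record>
<si>isup</si>
<mtp>
<opc>8469</opc>
</mtp>
<msucount>2</msucount>
</record>
EOF

record_offsets isup "$sample"
# before=2 after=2
# before=1 after=5
rm -f "$sample"
```

As soon as the two counts differ between records, a fixed "3 before, 8 after" rule would cut in the wrong places, which is why matching on the <record> tags is the safer route.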
#3
I would suggest the following approach: Code:
sed 's:</record>:&*:g' file1 | awk 'index($0,"isup")==0{print $0}' RS='*'
This would not work with nested tags and the like. You may want to surround isup with > and < if you do not want it to match other tag contents or a substring of some other value.
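The suggestion about > and < could be sketched as follows. The function name drop_records is hypothetical, and the sketch assumes the input contains no literal "*", since that character is used as the record separator. ORS is emptied so that awk does not add a newline of its own after each record.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed | awk pipeline: drop every record whose
# text contains ">VALUE<", i.e. VALUE as the complete content of an element.
drop_records() {
    # $1 = element text to match exactly, $2 = input file
    sed 's:</record>:&*:g' "$2" |
        awk -v pat=">$1<" -v RS='*' -v ORS='' 'index($0, pat) == 0 { print $0 }'
}

# Usage: drop_records isup file1
```

With this, a record holding <si>isup</si> is dropped, while a record that merely contains the letters "isup" inside a longer value is kept.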
The Following User Says Thank You to jawsnnn For This Useful Post: sdesstp (06-18-2012)
#4
The script is working fine, but when the first record itself contains isup, the initial tags are also deleted. How can I overcome this? The script should also work for a batch of XML files in a folder. Kindly help. Thanks in advance. The initial tags are like below: Code:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE ecapreport SYSTEM "ecapreport.dtd">
<ecapreport>
<stp>bsnlstpekm</stp>
<collector>ekmecap1A</collector>
<startdate>07082011</startdate>
<starttime>235500</starttime>
<enddate>08082011</enddate>
<endtime>000000</endtime>
<record>
<signallingstandard>ITU-I</signallingstandard>
<linkid>1225</linkid>
<si>isup</si>
<mtp>
<opc>004-009-004</opc>
<dpc>004-048-000</dpc>
</mtp>
<msucount>2</msucount>
<octcount>72</octcount>
</record>
<record>
<signallingstandard>ITU-I</signallingstandard>
<linkid>1225</linkid>
<si>isup</si>
<mtp>
<opc>004-009-004</opc>
<dpc>002-056-000</dpc>
</mtp>
<msucount>56</msucount>
<octcount>1009</octcount>
</record>
Last edited by pludi; 06-21-2012 at 04:32 AM.
#5
Actually, sed is the perfect tool for this sort of thing, but its workings are a little tricky to understand. Let's start with those and write the program later.
sed is line-oriented: it reads the first line of the input and then applies one command after the other to it until it reaches the end of its script. Then the next line of input is read and the process starts over, until the last line of input has been processed, upon which it stops. You can see from this explanation that "delete the x lines before" is difficult to do, because by the time sed gets to decide whether a line is to be deleted, that line isn't in its scope any more. The solution is to make the part we want to examine "one line" somehow.
When I said sed "reads a line" I was not completely correct: there is a data structure called the "pattern space", which actually holds this line. Every change sed makes is made to this pattern space. If sed is called without the "-n" command line option, the resulting content of the pattern space is printed automatically at the end of the script. If the pattern space is deleted with the "d" command, nothing is printed, the rest of the script is skipped and the process starts over with the next line of input. There is a special command ("N") to append the next line of input to the contents of the pattern space, separated by a linefeed character ("\n").
With this device it is simple to construct your filter: upon finding a line with "<record>" we add all the following lines to the pattern space until we encounter a line with "</record>". Because we only ever add to the pattern space, the condition "pattern space contains <record>" stays true from then on, so the script keeps appending lines. (Here the line-oriented nature of sed is showing.) This gives us the whole XML "record" in our pattern space. When we find "</record>" in the pattern space we know we have read the whole record. Now we search it for the search term and, if we find it, do NOT print the record; otherwise it gets printed (the "!" is a logical NOT).
Afterwards the pattern space is deleted and the cycle starts over again. Because we haven't switched off the default "print" action with the "-n" command line switch, all the pattern spaces (~lines) containing neither a "<record>" nor a "</record>" (that is: all the lines outside of <record>..</record> structures) are printed automatically. Code:
sed ':start
/<\/record>/ {
/isup/!p
d
}
/<record>/ {
N
b start
}' /path/to/inputfile
Regarding the "batch of input files": construct a loop over a list of filenames, or use "find" if there is some file mask you can apply: Code:
#!/bin/ksh
typeset file=""
# This processes every file named in "/path/to/list" (one filename per line) and writes
# each result to a file named like the input file with ".processed" appended.
while read -r file ; do
sed ':start
/<\/record>/ {
/isup/!p
d
}
/<record>/ {
N
b start
}' "$file" > "${file}.processed"
done < /path/to/list
If you can construct a file mask (like "*.file", etc.) to find the files, you can use "find" to do the work. Save the sed script in a file "script.sed" and: Code:
find /path/to/input/files -type f -name "*mask*" -exec sh -c 'sed -f script.sed "$1" > "$1.processed"' sh {} \;
I hope this helps. bakunin
Last edited by bakunin; 06-21-2012 at 06:12 AM.
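As a quick check of the behaviour described above (lines outside of <record>..</record> pass through untouched, even when the very first record contains isup), here is a small self-contained sketch. The function name delete_isup_records and the shortened sample file are made up for the illustration; the sed script itself is the one from this post.

```shell
#!/bin/sh
# Hypothetical wrapper around the sed script from this post.
delete_isup_records() {
    # $1 = input file; the search term "isup" is hard-coded as in the post
    sed ':start
/<\/record>/ {
/isup/!p
d
}
/<record>/ {
N
b start
}' "$1"
}

# Made-up, shortened sample: header lines, then an isup record as the first record.
sample=$(mktemp)
cat > "$sample" <<'EOF'
<ecapreport>
<stp>bsnlstpekm</stp>
<record>
<si>isup</si>
<msucount>2</msucount>
</record>
<record>
<si>sccp</si>
<msucount>1</msucount>
</record>
</ecapreport>
EOF

delete_isup_records "$sample"
# <ecapreport>
# <stp>bsnlstpekm</stp>
# <record>
# <si>sccp</si>
# <msucount>1</msucount>
# </record>
# </ecapreport>
rm -f "$sample"
```

The header lines are never pulled into the pattern space together with a record, so they are not affected by the "isup" test.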