I have a set of files that contain duplicated blocks of text. Below is a sample of one file; I have removed the sensitive information from it.
Each block starts with <TR BGCOLOR="white"> and ends with an IP address followed by two closing HTML tags, like this:
Code:
10.14.22.22
</TD>
</TR>
A block can be duplicated several times in a file. What I need is to go through the file and remove only the duplicated blocks.
Since it is an HTML file, I need to keep the format of the file, and only the blocks within these tags should be evaluated.
I have tried many code samples (sed, awk and python); all of them end up removing other code in the file (such as other HTML tags).
Thanks in advance for any help.
Code:
<TR BGCOLOR="white">
<TD>30Apr2015</TD>
<TD>17:39:08</TD>
<TD>NAME</TD>
<TD>firewall_policy</TD>
<TD>fw_policies</TD>
<TD>Modify Object</TD>
<TD><H3> XX - </H3> <br> SOME DATA HERE<br></TD>
<TD>p111111</TD>
</TR>
<TR BGCOLOR="white">
<TD>1May2015</TD>
<TD>9:06:34</TD>
<TD>NAME2</TD>
<TD>firewall_policy</TD>
<TD>fw_policies</TD>
<TD>Modify Object</TD>
<TD><H3> YY </H3> <br> SOME OTHER DATA HERE.<br></TD>
<TD>p222222</TD>
<TD>
10.14.22.22
</TD>
</TR>
<TR BGCOLOR="white">
<TD>30Apr2015</TD>
<TD>17:39:08</TD>
<TD>NAME</TD>
<TD>firewall_policy</TD>
<TD>fw_policies</TD>
<TD>Modify Object</TD>
<TD><H3> XX - </H3> <br> SOME DATA HERE<br></TD>
<TD>p111111</TD>
</TR>
<TR BGCOLOR="white">
<TD>1May2015</TD>
<TD>9:06:34</TD>
<TD>NAME2</TD>
<TD>firewall_policy</TD>
<TD>fw_policies</TD>
<TD>Modify Object</TD>
<TD><H3> YY </H3> <br> SOME OTHER DATA HERE.<br></TD>
<TD>p222222</TD>
<TD>
10.14.22.22
</TD>
</TR>
<TR BGCOLOR="white">
<TD>30Apr2015</TD>
<TD>04:39:10</TD>
<TD>NAME3</TD>
<TD>firewall_policy</TD>
<TD>fw_policies</TD>
<TD>Modify Object</TD>
<TD><H3> ZZ </H3> <br> SOME OTHER DATA XXXX HERE.<br></TD>
<TD>p333333</TD>
<TD>
10.14.33.33
</TD>
</TR>
Last edited by Don Cragun; 05-06-2015 at 04:50 AM..
Reason: Add CODE and ICODE tags.
T is an awk array indexed by the entire record $0. An element is created the first time it is referenced and is initially empty, which evaluates to FALSE. Negating it yields TRUE, so the default action (print) is executed. Because T[$0] is post-incremented, every later negation for the same record evaluates to FALSE, and that record is not printed again.
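The same "print a record only the first time it is seen" idea can be sketched in Python, which the question also mentions. This is only an illustration, not the command the explanation above refers to. The helper name `drop_duplicate_blocks` and the pattern `BLOCK` are hypothetical, and the sketch assumes every block runs from `<TR BGCOLOR="white">` to the next `</TR>` with no nested rows.

```python
import re

# Assumption: a block starts at <TR BGCOLOR="white"> and ends at the next
# </TR>; the optional newline after </TR> is treated as part of the block.
BLOCK = re.compile(r'<TR BGCOLOR="white">.*?</TR>\n?', re.DOTALL)


def drop_duplicate_blocks(html):
    """Return html with every repeated block removed, keeping the first copy."""
    seen = set()  # plays the role of the T array: blocks already printed

    def keep_first(match):
        block = match.group(0)
        key = block.strip()  # ignore leading/trailing whitespace when comparing
        if key in seen:
            return ""        # duplicate: drop it
        seen.add(key)
        return block         # first occurrence: keep it unchanged

    # Only text matched by BLOCK is ever replaced; everything else is untouched.
    return BLOCK.sub(keep_first, html)
```

Because the substitution only touches text matched by the block pattern, any other HTML in the file (headers, other tags, the table wrapper) passes through unchanged. A file could be processed by reading it as one string, calling `drop_duplicate_blocks`, and writing the result back.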
Hi folks!
I have a file which contains 1000 lines. On each line I have multiple occurrences (26 to be exact) of the pattern folder#/folder#.
# stands for the line number in the file
some text here folder1/folder1 some text here folder1/folder1 some text here folder1/folder1 some text... (7 Replies)
So, I have text files,
one "fail.txt"
And one
"color.txt"
I now want to use a command line (DOS) to remove ANY line that is PRESENT IN BOTH from each text file.
Afterwards there shall be no duplicate lines. (1 Reply)
I sat down yesterday to write this script and have just realised that my methodology is broken........
In essence I have.....
----------------------------------------------------------------- (This line really is in the file)
Service ID: 12345 ... (7 Replies)
Hello all,
short story: I'm writing a script to add and remove DNS records in DNS files. It's on RHEL 5.5.
So far I've locked up the basic operations in a couple of functions:
- validate the parameters
- search for an existing IP in the file when adding
- search for existing name records in... (6 Replies)
I have 2 duplicate blocks in an inode and I want to get rid of one of them so that I can get into my pc. The message I get is Multiply-claimed block(s) in inode 5997500: 12690101 12690101. All help is appreciated. Thanks (7 Replies)
Hello again, I am wanting to remove all duplicate blocks of XML code in a file. This is an example:
input:
<string-array name="threeItems">
<item>item1</item>
<item>item2</item>
<item>item3</item>
</string-array>
<string-array name="twoItems">
<item>item1</item>
<item>item2</item>... (19 Replies)
Hi
I have been struggling with a script for removing duplicate messages from a shared mailbox.
I would like to search for duplicate messages based on the “Message-ID” string within the messages files.
I have managed to find the duplicate “Message-ID” strings and (if I would like) delete... (1 Reply)
Hi,
This is part of a large text file I need to separate out.
I'd like some help to build a shell script that will extract the text between sets of dashed lines, write that to a new file using the whole or part of the first text string as the new file name, then move on to the next one and... (7 Replies)
Hello,
I have a log file which is generated by a script which looks like this:
userid: 7
starttime: Sat May 24 23:24:13 CEST 2008
endtime: Sat May 24 23:26:57 CEST 2008
total time spent: 2.73072 minutes / 163.843 seconds
date: Sat Jun 7 16:09:03 CEST 2008
userid: 8
starttime: Sun May... (7 Replies)
Hello friends,
I have a file like the one below. I want to remove selected blocks, say abc, pqr and lst. How can I remove those blocks from the file?
zone abc {
blah
blah
blah }
zone xyz {
blah
blah
blah }
zone pqr {
blah
blah
blah } (4 Replies)