I have a set of files that contain duplicate blocks of text. Below is a sample of one file; I have removed the sensitive information from it.
Every block starts with <TR BGCOLOR="white"> and ends with an IP address followed by two HTML tags, like this.
A block can be duplicated several times in a file, and I need to go through the file and remove only the duplicated blocks.
Since it is an HTML file, I need to keep the rest of the file's formatting intact, so only the blocks within these tags should be evaluated.
I have tried many code samples (sed, awk and Python), but all of them end up removing other parts of the file (such as other HTML tags).
Thanks in advance for any help.
Last edited by Don Cragun; 05-06-2015 at 04:50 AM..
Reason: Add CODE and ICODE tags.
T is an array indexed by the entire record $0. Each element is created when first referenced and is initially empty, which evaluates to FALSE. Negating it gives TRUE, so the default action (print) is executed. Because T[$0] is post-incremented, it is non-zero on every later occurrence of the same record, its negation evaluates to FALSE, and the record is not printed again.
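For readers who find the awk idiom terse, the same first-occurrence filter can be sketched in Python, which the original poster also mentioned trying. This is only an illustration of the logic described above, not the code from the thread: the function name `first_occurrences` and the sample list are made up for the example, and it treats each list element as one record.

```python
from collections import defaultdict


def first_occurrences(records):
    """Yield each record only the first time it is seen."""
    # 'seen' plays the role of the awk array T: a missing key starts at 0,
    # which is falsy, just like an unset awk array element.
    seen = defaultdict(int)
    for record in records:
        if not seen[record]:   # corresponds to !T[$0]
            yield record       # corresponds to the default action: print
        seen[record] += 1      # corresponds to the post-increment T[$0]++


# Hypothetical sample input: each string stands for one record.
lines = ["a", "b", "a", "c", "b"]
print(list(first_occurrences(lines)))  # ['a', 'b', 'c']
```

The filter works on whatever unit is passed in as a record. To de-duplicate whole blocks rather than single lines, the file would first have to be split into blocks, which depends on the exact start and end markers of the real data.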
Hi folks!
I have a file which contains 1000 lines. On each line I have multiple occurrences (26 to be exact) of the pattern folder#/folder#.
# denotes the line number in the file
some text here folder1/folder1 some text here folder1/folder1 some text here folder1/folder1 some text... (7 Replies)
So, I have text files,
one "fail.txt"
And one
"color.txt"
I now want to use a command line (DOS) to remove ANY line that is PRESENT IN BOTH from each text file.
Afterwards there shall be no duplicate lines. (1 Reply)
I sat down yesterday to write this script and have just realised that my methodology is broken........
In essence I have.....
----------------------------------------------------------------- (This line really is in the file)
Service ID: 12345 ... (7 Replies)
Hello all,
short story: I'm writing a script to add and remove DNS records in DNS files. It's on RHEL 5.5.
So far I've locked up the basic operations in a couple of functions:
- validate the parameters
- search for existing IP in file when adding
- search for existing name records in... (6 Replies)
I have 2 duplicate blocks in an inode and I want to get rid of one of them so that I can get into my pc. The message I get is Multiply-claimed block(s) in inode 5997500: 12690101 12690101. All help is appreciated. Thanks (7 Replies)
Hello again, I want to remove all duplicate blocks of XML code in a file. This is an example:
input:
<string-array name="threeItems">
<item>item1</item>
<item>item2</item>
<item>item3</item>
</string-array>
<string-array name="twoItems">
<item>item1</item>
<item>item2</item>... (19 Replies)
Hi
I have been struggling with a script for removing duplicate messages from a shared mailbox.
I would like to search for duplicate messages based on the “Message-ID” string within the message files.
I have managed to find the duplicate “Message-ID” strings and (if I would like) delete... (1 Reply)
Hi,
This is part of a large text file I need to separate out.
I'd like some help to build a shell script that will extract the text between sets of dashed lines, write that to a new file using the whole or part of the first text string as the new file name, then move on to the next one and... (7 Replies)
Hello,
I have a log file which is generated by a script which looks like this:
userid: 7
starttime: Sat May 24 23:24:13 CEST 2008
endtime: Sat May 24 23:26:57 CEST 2008
total time spent: 2.73072 minutes / 163.843 seconds
date: Sat Jun 7 16:09:03 CEST 2008
userid: 8
starttime: Sun May... (7 Replies)
Hello friends,
I have a file like the one below. I want to remove selected blocks, say abc, pqr, lst. How can I remove those blocks from the file?
zone abc {
blah
blah
blah }
zone xyz {
blah
blah
blah }
zone pqr {
blah
blah
blah } (4 Replies)