Filter or remove duplicate block of text without distinguishing marks or fields Post: 302563494

10 More Discussions You Might Find Interesting

1. Shell Programming and Scripting

Remove duplicate text

Hello, I have a log file which is generated by a script which looks like this: userid: 7 starttime: Sat May 24 23:24:13 CEST 2008 endtime: Sat May 24 23:26:57 CEST 2008 total time spent: 2.73072 minutes / 163.843 seconds date: Sat Jun 7 16:09:03 CEST 2008 userid: 8 starttime: Sun May...

2. Shell Programming and Scripting

Filter duplicate block of text using SED

Hi, I would like to print a block of text between two regular expressions using sed. This can be achieved with the command shown below; however, my problem is that the same block of text is repeated twice. I would like to eliminate the duplicate block of text. For example, if my file...

3. Shell Programming and Scripting

Remove duplicate files based on text string?

Hi I have been struggling with a script for removing duplicate messages from a shared mailbox. I would like to search for duplicate messages based on the “Message-ID” string within the messages files. I have managed to find the duplicate “Message-ID” strings and (if I would like) delete...

4. Shell Programming and Scripting

Filter/remove duplicate .dat file with certain criteria

I am a beginner in Unix, but I have been asked to write a script to filter (remove duplicates from) the data in a .dat file. The file is huge, containing billions of records. Its contents look like this: 30002157,40342424,OTC,mart_rec,100, ,0 30002157,40343369,OTC,mart_rec,95, ,0...

5. Shell Programming and Scripting

Find duplicate based on 'n' fields and mark the duplicate as 'D'

Hi, In a file, I have to mark duplicate records as 'D' and the latest record alone as 'C'. In the below file, I have to identify if duplicate records are there or not based on Man_ID, Man_DT, Ship_ID and I have to mark the record with latest Ship_DT as "C" and other as "D" (I have to create...

6. Windows & DOS: Issues & Discussions

Remove duplicate lines from text files.

So, I have two text files, "fail.txt" and "color.txt". I now want to use a command line (DOS) to remove ANY line that is PRESENT IN BOTH from each text file. Afterwards there shall be no duplicate lines.

7. Shell Programming and Scripting

Remove duplicate lines from file based on fields

Dear community, I have to remove duplicate lines from a file that contains a very large number of rows (millions?), based on the 1st and 3rd columns. The data look like this: Region 23/11/2014 09:11:36 41752 Medio 23/11/2014 03:11:38 4132 Info 23/11/2014 05:11:09 4323...

8. Shell Programming and Scripting

How to remove duplicate text blocks from a file?

Hi all, I have a list of files that contain duplicate blocks of text. Following is a sample of the file; I have removed the sensitive information from it. All the code samples start with <TR BGCOLOR="white"> and end with an IP address and two HTML tags, like this: 10.14.22.22...

9. Shell Programming and Scripting

Remove duplicate occurrences of text pattern

Hi folks! I have a file which contains 1000 lines. On each line I have multiple occurrences (26 to be exact) of the pattern folder#/folder#, where # is the line number in the file: some text here folder1/folder1 some text here folder1/folder1 some text here folder1/folder1 some text...

10. Shell Programming and Scripting

Filter file to remove duplicate values in first column

Hello, I have a script that is generating a tab delimited output file. num Name PCA_A1 PCA_A2 PCA_A3 0 compound_00 -3.5054 -1.1207 -2.4372 1 compound_01 -2.2641 0.4287 -1.6120 3 compound_03 -1.3053 1.8495 ...


HTML::Filter(3) 					User Contributed Perl Documentation					   HTML::Filter(3)

NAME

       HTML::Filter - Filter HTML text through the parser

NOTE

       This module is deprecated. "HTML::Parser" now provides the functionality of "HTML::Filter" much more efficiently with the "default"
       handler.
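
       As an illustration of the "default" handler mentioned above, the following is a minimal sketch rather than the module's
       reference implementation. It assumes the version 3 API of "HTML::Parser"; the helper name strip_comments() is hypothetical
       and is introduced only for this example. The default handler appends the original text of every event to a string, while
       an empty-string handler makes the parser ignore comments.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper (not part of HTML::Parser): returns $html with
# all comments removed and everything else passed through unchanged.
sub strip_comments {
    my ($html) = @_;
    my $out = '';
    my $p = HTML::Parser->new(
        api_version => 3,
        # Events without a more specific handler: append their source text.
        default_h   => [ sub { $out .= $_[0] }, 'text' ],
        # An empty-string handler tells the parser to ignore comment events.
        comment_h   => [ '' ],
    );
    $p->parse($html);
    $p->eof;    # flush any buffered text
    return $out;
}

print strip_comments("<p>keep<!-- drop --></p>\n");
```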

SYNOPSIS

	require HTML::Filter;
	$p = HTML::Filter->new->parse_file("index.html");

DESCRIPTION

       "HTML::Filter" is an HTML parser that by default prints the original text of each HTML element (basically a slow version of cat(1)).
       The callback methods may be overridden to modify the filtering for some HTML elements, and you can override the output() method,
       which is called to print the HTML text.

       "HTML::Filter" is a subclass of "HTML::Parser". This means that the document should be given to the parser by calling the $p->parse() or
       $p->parse_file() methods.

EXAMPLES

       The first example is a filter that will remove all comments from an HTML file.  This is achieved by simply overriding the comment method to
       do nothing.

	 package CommentStripper;
	 require HTML::Filter;
	 @ISA=qw(HTML::Filter);
	 sub comment { }  # ignore comments

       The second example shows a filter that will remove any <TABLE>s found in the HTML file.  We specialize the start() and end() methods
       to count table tags, and override output() to suppress output while inside a table.

	 package TableStripper;
	 require HTML::Filter;
	 @ISA=qw(HTML::Filter);
	 sub start
	 {
	    my $self = shift;
	    $self->{table_seen}++ if $_[0] eq "table";
	    $self->SUPER::start(@_);
	 }

	 sub end
	 {
	    my $self = shift;
	    $self->SUPER::end(@_);
	    $self->{table_seen}-- if $_[0] eq "table";
	 }

	 sub output
	 {
	     my $self = shift;
	     unless ($self->{table_seen}) {
		 $self->SUPER::output(@_);
	     }
	 }
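
       The same counting technique can be sketched with "HTML::Parser" handlers instead of a subclass. This is an illustrative
       sketch under the assumption that the version 3 API is available; the helper name strip_tables() and the $depth counter are
       hypothetical names chosen for this example.

```perl
use strict;
use warnings;
use HTML::Parser ();

# Hypothetical helper: returns $html with every <table>...</table>
# element (including nested tables) removed.
sub strip_tables {
    my ($html) = @_;
    my ($out, $depth) = ('', 0);
    my $p = HTML::Parser->new(
        api_version => 3,
        # Count the table first, so its own start tag is suppressed.
        start_h => [ sub {
            my ($tag, $text) = @_;
            $depth++ if $tag eq 'table';
            $out .= $text unless $depth;
        }, 'tagname, text' ],
        # Emit first, then leave the table, so </table> is suppressed too.
        end_h => [ sub {
            my ($tag, $text) = @_;
            $out .= $text unless $depth;
            $depth-- if $tag eq 'table' && $depth;
        }, 'tagname, text' ],
        # Text, comments, declarations, etc.: copy only outside tables.
        default_h => [ sub { $out .= $_[0] unless $depth }, 'text' ],
    );
    $p->parse($html);
    $p->eof;
    return $out;
}

print strip_tables("before<table><tr><td>cell</td></tr></table>after\n");
```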

       If you want to collect the parsed text internally you might want to do something like this:

	 package FilterIntoString;
	 require HTML::Filter;
	 @ISA=qw(HTML::Filter);
	 sub output { push(@{$_[0]->{fhtml}}, $_[1]) }
	 sub filtered_html { join("", @{$_[0]->{fhtml}}) }
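
       A possible way to drive such a class is sketched below. The package body repeats the example above (with "our" added so it
       runs under "use strict"); the wrapper name filter_to_string() is hypothetical and exists only for this illustration.

```perl
use strict;
use warnings;

package FilterIntoString;
require HTML::Filter;
our @ISA = qw(HTML::Filter);
# Collect each piece of output instead of printing it.
sub output        { push @{ $_[0]->{fhtml} }, $_[1] }
# Join the collected pieces; an empty list if nothing was parsed.
sub filtered_html { join '', @{ $_[0]->{fhtml} || [] } }

package main;

# Hypothetical wrapper: parse a string and return the collected HTML.
sub filter_to_string {
    my ($html) = @_;
    my $p = FilterIntoString->new;
    $p->parse($html);
    $p->eof;    # signal end of document so buffered text is flushed
    return $p->filtered_html;
}

print filter_to_string("<p>Hello</p>\n");
```

       Calling eof() after parse() matters here: without it, trailing text may still be buffered inside the parser when
       filtered_html() is called.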

SEE ALSO

       HTML::Parser

COPYRIGHT

       Copyright 1997-1999 Gisle Aas.

       This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

perl v5.18.2							    2013-03-25							   HTML::Filter(3)
