
RedHat 9 (Linux i386) - man page for html::filter (redhat section 3)


HTML::Filter(3) 	       User Contributed Perl Documentation		  HTML::Filter(3)

       HTML::Filter - Filter HTML text through the parser

       This module is deprecated. "HTML::Parser" now provides the functionality of "HTML::Filter"
       much more efficiently with the "default" handler.
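
       As a rough sketch of what the replacement can look like, the following uses the
       "HTML::Parser" version 3 interface with a "default" handler that passes the original
       text of every event through, while comments are dropped. The variable names and the
       sample markup are illustrative assumptions, not part of the original page.

```perl
use strict;
use warnings;
use HTML::Parser ();

my $out = '';   # illustrative: collect the output instead of printing it

my $p = HTML::Parser->new(
    api_version => 3,
    # Every event without a handler of its own ends up here;
    # the 'text' argspec hands over the original source text.
    default_h   => [ sub { $out .= shift }, 'text' ],
    # An empty-string handler makes the parser ignore comment events.
    comment_h   => [ '' ],
);

$p->parse('<p>Hello<!-- secret --> world</p>');
$p->eof;        # signal end of document so buffered text is flushed

print "$out\n"; # <p>Hello world</p>
```

       No subclass is needed in this style; the filtering rules are expressed as handlers.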

	require HTML::Filter;
	$p = HTML::Filter->new->parse_file("index.html");

       "HTML::Filter" is an HTML parser that by default prints the original text of each HTML
       element (basically a slow version of cat(1)).  The callback methods may be overridden to
       modify the filtering for some HTML elements, and you can override the output() method,
       which is called to print the HTML text.

       "HTML::Filter" is a subclass of "HTML::Parser". This means that the document should be
       given to the parser by calling the $p->parse() or $p->parse_file() methods.
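
       The sketch below feeds a document to a filter in several pieces with $p->parse() and
       then calls $p->eof() to flush whatever is still buffered. The package name
       "ChunkCollector", the "buf" hash key, and the sample chunks are hypothetical and
       chosen only for illustration.

```perl
use strict;
use warnings;

package ChunkCollector;             # hypothetical subclass name
require HTML::Filter;
our @ISA = qw(HTML::Filter);

# Collect the filtered text in the object instead of printing it.
sub output { $_[0]->{buf} .= $_[1] }

package main;

my $p = ChunkCollector->new;

# The chunk boundaries fall in the middle of text and in the middle
# of a tag; the parser reassembles them before firing callbacks.
$p->parse($_) for '<p>Hel', 'lo <b', '>world</b></p>';
$p->eof;                            # flush any pending text

my $result = $p->{buf};
print "$result\n";                  # <p>Hello <b>world</b></p>
```

       Calling $p->parse_file() instead performs the chunked reading and the final eof() for
       you.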

       The first example is a filter that will remove all comments from an HTML file.  This is
       achieved by simply overriding the comment method to do nothing.

	 package CommentStripper;
	 require HTML::Filter;
	 @ISA=qw(HTML::Filter);
	 sub comment { }  # ignore comments
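
       To try the comment stripper on a string, one option is to capture what the filter
       prints. This sketch assumes the inherited output() prints to the currently selected
       filehandle, and temporarily selects an in-memory handle; the variable names and the
       sample markup are illustrative.

```perl
use strict;
use warnings;

package CommentStripper;
require HTML::Filter;
our @ISA = qw(HTML::Filter);
sub comment { }                     # ignore comments

package main;

# Redirect the default output handle to an in-memory buffer.
my $captured = '';
open(my $mem, '>', \$captured) or die "cannot open in-memory handle: $!";
my $old = select($mem);

my $p = CommentStripper->new;
$p->parse('<p>kept<!-- dropped --></p>');
$p->eof;

select($old);                       # restore the previous default handle
close($mem);

print "$captured\n";                # <p>kept</p>
```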

       The second example shows a filter that will remove any <TABLE>s found in the HTML file.
       We specialize the start() and end() methods to count table tags, and then suppress output
       while inside a table.

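	 package TableStripper;
	 require HTML::Filter;
	 @ISA=qw(HTML::Filter);
	 sub start
	 {
	    my $self = shift;
	    $self->{table_seen}++ if $_[0] eq "table";  # entering a table
	    $self->SUPER::start(@_);
	 }

	 sub end
	 {
	    my $self = shift;
	    $self->SUPER::end(@_);
	    $self->{table_seen}-- if $_[0] eq "table";  # leaving a table
	 }

	 sub output
	 {
	     my $self = shift;
	     unless ($self->{table_seen}) {
		 $self->SUPER::output(@_);  # only print outside tables
	     }
	 }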
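
       The reason for counting table tags rather than keeping a simple flag is nesting: with
       a flag, the first </table> of an inner table would switch output back on too early.
       The following self-contained sketch illustrates this on a nested table. The package
       name "TableStripperDemo", the "kept" hash key, and the sample markup are hypothetical;
       output is collected in the object rather than printed so that it can be inspected.

```perl
use strict;
use warnings;

package TableStripperDemo;          # hypothetical name for this sketch
require HTML::Filter;
our @ISA = qw(HTML::Filter);

sub start {
    my $self = shift;
    $self->{table_seen}++ if $_[0] eq "table";  # one level deeper
    $self->SUPER::start(@_);
}

sub end {
    my $self = shift;
    $self->SUPER::end(@_);          # </table> itself is still suppressed
    $self->{table_seen}-- if $_[0] eq "table";  # one level back out
}

sub output {
    my $self = shift;
    # Keep text only while the nesting depth is zero.
    $self->{kept} .= $_[0] unless $self->{table_seen};
}

package main;

my $p = TableStripperDemo->new;
$p->parse('<p>a</p><table><tr><td>'
        . '<table><tr><td>x</td></tr></table>y'
        . '</td></tr></table><p>b</p>');
$p->eof;

my $kept = $p->{kept};
print "$kept\n";                    # <p>a</p><p>b</p>
```

       Note that end() calls the inherited method before decrementing the counter, so the
       closing </table> tag is filtered out along with the table contents.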

       If you want to collect the parsed text internally, you might want to do something like
       this:

	 package FilterIntoString;
	 require HTML::Filter;
	 @ISA=qw(HTML::Filter);
	 sub output { push(@{$_[0]->{fhtml}}, $_[1]) }
	 sub filtered_html { join("", @{$_[0]->{fhtml}}) }
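
       A possible way to use such a class is sketched below: parse a document, then ask the
       object for the accumulated string. The sample markup and the variable names are
       illustrative assumptions.

```perl
use strict;
use warnings;

package FilterIntoString;
require HTML::Filter;
our @ISA = qw(HTML::Filter);

# Each piece of output is appended to an array kept in the object ...
sub output        { push(@{$_[0]->{fhtml}}, $_[1]) }
# ... and joined into a single string on request.
sub filtered_html { join("", @{$_[0]->{fhtml}}) }

package main;

my $p = FilterIntoString->new;
$p->parse('<h1>Title</h1><p>Body</p>');
$p->eof;                            # make sure all text has been delivered

my $html = $p->filtered_html;
print "$html\n";                    # <h1>Title</h1><p>Body</p>
```

       Under "use strict", filtered_html() as written dies if nothing was ever output,
       because the "fhtml" array does not exist yet; initializing it in a constructor would
       avoid that.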


       Copyright 1997-1999 Gisle Aas.

       This library is free software; you can redistribute it and/or modify it under the same
       terms as Perl itself.

perl v5.8.0				    1999-12-09				  HTML::Filter(3)