sed uses regexes, not globs, which explains part of your difficulty.
In shell globbing, * means "any run of characters". In a regex, it means "zero or more of the previous item". So files* would match file, files, filess, filesssssssssssssssss, but the * by itself would not match arbitrary characters after files.
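A quick way to see the difference is to test the same pattern text both ways. This is only a sketch: the helper names matches_regex and matches_glob are made up for illustration, and grep -x is used so the regex has to match the whole line.

```shell
# Regex: the * applies only to the preceding "s".
# grep -x requires the whole line to match; -q suppresses output.
matches_regex() {
    printf '%s\n' "$1" | grep -qx 'files*'
}

# Glob: the * stands for any run of characters.
matches_glob() {
    case "$1" in
        files*) return 0 ;;
        *)      return 1 ;;
    esac
}

for name in file files filesss files.txt; do
    if matches_regex "$name"; then r=yes; else r=no; fi
    if matches_glob  "$name"; then g=yes; else g=no; fi
    printf '%-10s regex=%s glob=%s\n' "$name" "$r" "$g"
done
```

As a regex, files* accepts file and filesss but rejects files.txt. As a glob, it is the other way around for file and files.txt.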
The regex equivalent would be .*, where . is a special character meaning "match any single character". But I'd try something a little trickier and match up to the closing >, so that part of the regex doesn't scan outside the tag it started in. [] lets you specify a set of characters to include or exclude. [A-Z] would match a single letter in the A-Z range. [^A-Z] would match a single character not in the A-Z range. [^>] would match any single character that's not the end-of-tag character.
So, [^>]* would match zero or more non-> characters, swallowing up the rest of the tag and stopping right before >.
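Put together, a tag-stripping command could look like the sketch below. The function names strip_tags and strip_tags_greedy are made up for illustration. It assumes each tag opens and closes on the same line and that no > appears inside an attribute value.

```shell
# Remove everything from a < up to the nearest >, for every tag on the line.
strip_tags() {
    sed 's/<[^>]*>//g'
}

# For comparison: .* is greedy, so it runs from the first < to the LAST >.
strip_tags_greedy() {
    sed 's/<.*>//g'
}

printf '%s\n' '<TD>no problem</TD>' | strip_tags          # prints: no problem
printf '%s\n' '<TD>no problem</TD>' | strip_tags_greedy   # prints an empty line
```

The greedy version deletes the text between the tags as well, which is why [^>]* is the safer choice here.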
Hi,
I am trying to strip HTML tags from a string, for example
<TD>no problem</TD>
the result should be
no problem
but I could never get rid of all the tags
sed 's/<..D>//g'
Please help, I am new (3 Replies)
I am cleaning up HTML with sed. With the regexp
<a name="+"></a><h>*<span class="mw-headline" >+</span></h>
I can find the tags I need. But when I place them in a sed command, sed fails. So I started building up from a smaller command. This is where I am now:
sed -r -e s/"<a... (3 Replies)
Hello,
I am using sed as follows -
sed 's/CONTACT SYSTEMS! Some payments have been rejected/<B><font color="red" size="5.0pt"CONTACT SYSTEMS! Some payments have been rejected</font></B>/' $REPORT_FILE
But while executing this, I am getting the error as -
sed: command garbled
&... (5 Replies)
Hi, I am working on transforming HTML text into the .vert text format. I want to use the Linux utility sed. I have this regexp which should do the work: s/ \(?!*>\)/\n/g. I use it like this with sed: echo "you <we try> there" | sed 's/ \(?!*>\)/\n/g' ... The desired output is:
you
<we... (5 Replies)
I have pasted the contents of a log file (swmbackup.wrkstn.1262071383.sales2a) below:
Workstation: sales2a<BR
Vault sales2a-hogwarts will be initialized.<BR
<font color="red"There was a problem mounting /mnt/sales2a/desktop$ </FONT<BR
<font color="red"There was a problem mounting... (4 Replies)
I generally save a lot of web pages for reading offline which works out great for school. Now I have to spend a lot of time on the bus and I am looking for the best way to read some of these webpages using my Nokia 7610.
I have uploaded the files to my phone, but they are deadly deadly slow to... (2 Replies)
Hi please help me with this .
I have a file test.txt with following content
$cat test.txt
<td>$test</td>
<h2>$test2</h2>
and I have a ksh with following content
$cat test.ksh
#!/bin/ksh
test=3
test2=4
while read line
do
echo $line
done < test.html
I am expecting the output as (4 Replies)
Hi
I've searched for it for a few hours now and I can't seem to find anything that works the way I want. I've got a webpage, saved in the file par, with a form like this:
<html><body><form name='sendme' action='http://example.com/' method='POST'>
<textarea name='1st'>abc123def678</textarea>
<textarea... (9 Replies)
I need a newline added at every </font> end tag, while keeping the </font> tag itself. Please help me in this regard.
Input:
<font>abc</font>def<font>ghi</font>
Output:
<font>abc</font>
def
<font>ghi</font> (3 Replies)
Hi,
I'm trying to read a temperature value from HTML code.
So far I have managed to reduce the whole HTML page down to this single line with the following sed command: sed -n '/Temperature/p' $temp_temperature | tee temp_string
<TD width='350'>Temperature :</td><td>25... (2 Replies)
Discussion started by: naittis
2 Replies
html::stripscripts::parser
Parser(3pm) User Contributed Perl Documentation Parser(3pm)
NAME
HTML::StripScripts::Parser - XSS filter using HTML::Parser
SYNOPSIS
    use HTML::StripScripts::Parser();

    my $hss = HTML::StripScripts::Parser->new(
        {
            Context => 'Document',    ## HTML::StripScripts configuration
            Rules   => { ... },
        },
        strict_comment => 1,          ## HTML::Parser options
        strict_names   => 1,
    );

    $hss->parse_file("foo.html");
    print $hss->filtered_document;

    OR

    print $hss->filter_html($html);
DESCRIPTION
This class provides an easy interface to "HTML::StripScripts", using "HTML::Parser" to parse the HTML.
See HTML::Parser for details of how to customise how the raw HTML is parsed into tags, and HTML::StripScripts for details of how to
customise the way those tags are filtered.
CONSTRUCTORS
new ( {CONFIG}, [PARSER_OPTIONS] )
Creates a new "HTML::StripScripts::Parser" object.
The CONFIG parameter has the same semantics as the CONFIG parameter to the "HTML::StripScripts" constructor.
Any PARSER_OPTIONS supplied will be passed on to the HTML::Parser init method, allowing you to influence the way the input is parsed.
You cannot use PARSER_OPTIONS to set the "HTML::Parser" event handlers (see "Events" in HTML::Parser) since
"HTML::StripScripts::Parser" uses all of the event hooks itself. However, you can use "Rules" (see "Rules" in HTML::StripScripts) to
customise the handling of all tags and attributes.
METHODS
See HTML::Parser for input methods, HTML::StripScripts for output methods.
"filter_html()"
"filter_html()" is a convenience method for filtering HTML already loaded into a scalar variable. It combines calls to
"HTML::Parser::parse()", "HTML::Parser::eof()" and "HTML::StripScripts::filtered_document()".
$filtered_html = $hss->filter_html($html);
SUBCLASSING
The "HTML::StripScripts::Parser" class is subclassable. Filter objects are plain hashes. The hss_init() method takes the same arguments
as new(), and calls the initialization methods of both "HTML::StripScripts" and "HTML::Parser".
See "SUBCLASSING" in HTML::StripScripts and "SUBCLASSING" in HTML::Parser.
SEE ALSO
HTML::StripScripts, HTML::Parser, HTML::StripScripts::LibXML
BUGS
None reported.
Please report any bugs or feature requests to bug-html-stripscripts-parser@rt.cpan.org, or through the web interface at
<http://rt.cpan.org>.
AUTHOR
Original author Nick Cleaton <nick@cleaton.net>
New code added and module maintained by Clinton Gormley <clint@traveljury.com>
COPYRIGHT
Copyright (C) 2003 Nick Cleaton. All Rights Reserved.
Copyright (C) 2007 Clinton Gormley. All Rights Reserved.
LICENSE
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.10.1 2009-11-05 Parser(3pm)