03-14-2008
Removing HTML tags via parameter expansion
Hi all-
I have a variable that contains a web page:
echo "$STUFF"
<html> <head> <title>my page</title></head> <body> blah blah etc..
Can I use the shell's parameter expansion abilities to remove just the tags?
I thought that FIXHTML=${STUFF//<*>/} might do it, but it didn't seem to work.
Do I need to escape the < and > or something like that?
Thanks!!
-Rev66
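
A rough sketch of what seems to be going on, not a definitive answer. In bash, `<` and `>` do not need escaping inside `${...}`, but `*` matches as much as it can, so `<*>` runs from the first `<` to the last `>` and removes the text between tags as well. Assuming bash with `extglob` enabled, the pattern `<*([^>])>` limits each match to a single tag. The variable name `greedy` is made up for illustration, and this is plain pattern matching rather than HTML parsing, so comments, scripts, or a `>` inside an attribute value will confuse it.

```shell
#!/usr/bin/env bash
# Sketch only: assumes bash with extglob available; not a real HTML parser.
shopt -s extglob

STUFF='<html> <head> <title>my page</title></head> <body> blah blah etc..'

# Original attempt: "*" is greedy, so the match starts at the first "<"
# and ends at the last ">", deleting everything in between.
greedy=${STUFF//<*>/}

# One tag at a time: "<", zero or more characters that are not ">", then ">".
FIXHTML=${STUFF//<*([^>])>/}

printf 'greedy:  [%s]\n' "$greedy"
printf 'FIXHTML: [%s]\n' "$FIXHTML"
```

With the sample string above, the greedy version keeps only the text after `<body>`, while the extglob version keeps "my page" and the body text, along with the spaces that separated the tags.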
Parser(3pm) User Contributed Perl Documentation Parser(3pm)
NAME
HTML::StripScripts::Parser - XSS filter using HTML::Parser
SYNOPSIS
    use HTML::StripScripts::Parser();

    my $hss = HTML::StripScripts::Parser->new(
        {
            Context => 'Document',    ## HTML::StripScripts configuration
            Rules   => { ... },
        },
        strict_comment => 1,          ## HTML::Parser options
        strict_names   => 1,
    );

    $hss->parse_file("foo.html");
    print $hss->filtered_document;

    OR

    print $hss->filter_html($html);
DESCRIPTION
This class provides an easy interface to "HTML::StripScripts", using "HTML::Parser" to parse the HTML.
See HTML::Parser for details of how to customise how the raw HTML is parsed into tags, and HTML::StripScripts for details of how to
customise the way those tags are filtered.
CONSTRUCTORS
new ( {CONFIG}, [PARSER_OPTIONS] )
Creates a new "HTML::StripScripts::Parser" object.
The CONFIG parameter has the same semantics as the CONFIG parameter to the "HTML::StripScripts" constructor.
Any PARSER_OPTIONS supplied will be passed on to the HTML::Parser init method, allowing you to influence the way the input is parsed.
You cannot use PARSER_OPTIONS to set the "HTML::Parser" event handlers (see "Events" in HTML::Parser) since
"HTML::StripScripts::Parser" uses all of the event hooks itself. However, you can use "Rules" (see "Rules" in HTML::StripScripts) to
customise the handling of all tags and attributes.
METHODS
See HTML::Parser for input methods, HTML::StripScripts for output methods.
"filter_html()"
"filter_html()" is a convenience method for filtering HTML already loaded into a scalar variable. It combines calls to
"HTML::Parser::parse()", "HTML::Parser::eof()" and "HTML::StripScripts::filtered_document()".
    $filtered_html = $hss->filter_html($html);
SUBCLASSING
The "HTML::StripScripts::Parser" class is subclassable. Filter objects are plain hashes. The hss_init() method takes the same arguments
as new(), and calls the initialization methods of both "HTML::StripScripts" and "HTML::Parser".
See "SUBCLASSING" in HTML::StripScripts and "SUBCLASSING" in HTML::Parser.
SEE ALSO
HTML::StripScripts, HTML::Parser, HTML::StripScripts::LibXML
BUGS
None reported.
Please report any bugs or feature requests to bug-html-stripscripts-parser@rt.cpan.org, or through the web interface at
<http://rt.cpan.org>.
AUTHOR
Original author Nick Cleaton <nick@cleaton.net>
New code added and module maintained by Clinton Gormley <clint@traveljury.com>
COPYRIGHT
Copyright (C) 2003 Nick Cleaton. All Rights Reserved.
Copyright (C) 2007 Clinton Gormley. All Rights Reserved.
LICENSE
This module is free software; you can redistribute it and/or modify it under the same terms as Perl itself.
perl v5.10.1 2009-11-05 Parser(3pm)