Post 302638385 by Molly.P. on Thursday 10th of May 2012 08:04:23 AM
Remove all HTML, scripts and styles?

Hi all,

How might I go about writing a program that reads an HTML file as its input and strips all HTML markup, embedded scripts and style sheets from it, leaving only the text as output?

I am a beginner, so the simpler, the better.

Thanks for any advice.
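
One possible approach, as a minimal sketch rather than a definitive answer: the post does not name a language, so Python is assumed here, using only the standard library's html.parser module. The names TextExtractor and strip_html are made up for illustration. The idea is to keep every text node the parser reports, except those that fall inside a script or style element.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>.

    TextExtractor is a hypothetical name chosen for this sketch.
    """

    SKIPPED = {"script", "style"}

    def __init__(self):
        # convert_charrefs=True turns entities such as &amp; into characters.
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.skip_depth = 0  # greater than 0 while inside a skipped element

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED:
            self.skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED and self.skip_depth > 0:
            self.skip_depth -= 1

    def handle_data(self, data):
        # Tags, comments and declarations never reach this method;
        # only text does, so keeping it is all that is needed.
        if self.skip_depth == 0:
            self.parts.append(data)


def strip_html(html):
    """Return the visible text of an HTML string (hypothetical helper)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    # Collapse the runs of whitespace left behind by the removed markup.
    return " ".join("".join(parser.parts).split())
```

To use it as a filter, import sys and add print(strip_html(sys.stdin.read())) at the bottom, then run it as python3 striphtml.py < page.html (the file names are placeholders). Note that text from adjacent elements with no whitespace between them runs together, and the text of the title element is kept along with the body text.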
 

HTML::WikiConverter::Kwiki(3pm) 			User Contributed Perl Documentation			   HTML::WikiConverter::Kwiki(3pm)

NAME
       HTML::WikiConverter::Kwiki - Convert HTML to Kwiki markup

SYNOPSIS
         use HTML::WikiConverter;
         my $wc = new HTML::WikiConverter( dialect => 'Kwiki' );
         print $wc->html2wiki( $html );

DESCRIPTION
       This module contains rules for converting HTML into Kwiki markup. See
       HTML::WikiConverter for additional usage details.

AUTHOR
       David J. Iberri, "<diberri at cpan.org>"

BUGS
       Please report any bugs or feature requests to
       "bug-html-wikiconverter-kwiki at rt.cpan.org", or through the web
       interface at
       <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=HTML-WikiConverter-Kwiki>.
       I will be notified, and then you'll automatically be notified of
       progress on your bug as I make changes.

SUPPORT
       You can find documentation for this module with the perldoc command.

           perldoc HTML::WikiConverter::Kwiki

       You can also look for information at:

       o   AnnoCPAN: Annotated CPAN documentation
           <http://annocpan.org/dist/HTML-WikiConverter-Kwiki>

       o   CPAN Ratings
           <http://cpanratings.perl.org/d/HTML-WikiConverter-Kwiki>

       o   RT: CPAN's request tracker
           <http://rt.cpan.org/NoAuth/Bugs.html?Dist=HTML-WikiConverter-Kwiki>

       o   Search CPAN
           <http://search.cpan.org/dist/HTML-WikiConverter-Kwiki>

COPYRIGHT & LICENSE
       Copyright 2006 David J. Iberri, all rights reserved.

       This program is free software; you can redistribute it and/or modify
       it under the same terms as Perl itself.

perl v5.10.0                          2006-07-28          HTML::WikiConverter::Kwiki(3pm)