Sponsored Content
Full Discussion: Script to delete HTML tag
Top Forums Shell Programming and Scripting Script to delete HTML tag Post 302575084 by zongo on Sunday 20th of November 2011 11:15:15 AM
Old 11-20-2011
Script to delete HTML tag

Guys,

I have a little script that I got of the internet and that I use in Squid to block ads.
I used that script with linux but now i have moved my servers to freebsd. I have a step learning curve there but it is fun: Back to the script issue.

The script used to work i with linux but freebsd is a bit different.
This line is causing me issue
Code:
# cat /tmp/temp_ad_file | grep "(^|\.)" > "/usr/local/etc/squid/squid.adservers"

If I use the line above it in the script below, the destination folder is going to be completely emptied. The goal is to get rid of the HTML tags "(^|\.)" in the list that is given by http address "pgl.yoyo.org" for bad ad website. Then it is used by squid proxy.
The line above is unusable. The script works well if i modified the line above without the pipe, grep and the tags
Code:
# cat /tmp/temp_ad_file > "/usr/local/etc/squid/squid.adservers"

Then the list is being updated correctly and not emptied but still with the HTML tags in it.
Code:
#!/bin/sh
# Get new ad server list
/usr/local/bin/wget -O /tmp/temp_ad_file \
        'http://pgl.yoyo.org/adservers/serverlist.php?hostformat=squid-dstdom-regex;showintro=0&mimetype=plaintext'
# Clean HTML headers out of the list
cat /tmp/temp_ad_file > "/usr/local/etc/squid/squid.adservers"
# cat /tmp/temp_ad_file | grep "(^|\.)" > "/usr/local/etc/squid/squid.adservers"
# Refresh Squid
/usr/local/sbin/squid -k reconfigure
# Remove tmp file
rm -rf /tmp/temp_ad_file

Any help is much appreciated

Kind Regards,

Last edited by Scott; 11-20-2011 at 02:23 PM.. Reason: Code tags
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

How do I extract text only from html file without HTML tag

I have a html file called myfile. If I simply put "cat myfile.html" in UNIX, it shows all the html tags like <a href=r/26><img src="http://www>. But I want to extract only text part. Same problem happens in "type" command in MS-DOS. I know you can do it by opening it in Internet Explorer,... (4 Replies)
Discussion started by: los111
4 Replies

2. Shell Programming and Scripting

how to use html tag in shell scripting

Hai friends I have a small doubt.. how can we use html tag in shell scripting code : echo "<html>" echo "<body>" echo " welcome to peace world " echo "</body>" echo "</html>" output displayed like this: <html> <body> welcome to peace world </body> </html> (5 Replies)
Discussion started by: jrex1983
5 Replies

3. Shell Programming and Scripting

How can i delete html attributes from tag ?

Input: <table class="pixelBorderTable faqTable" width="100%" border="1" cellpadding="3" cellspacing="0"> <tbody><tr> <td class="pixelBorderTableHeaderTd" valign="top" width="20%" bgcolor="#666666"><p>&nbsp;</p></td> <td class="pixelBorderTableHeaderTd" valign="top"... (1 Reply)
Discussion started by: cola
1 Replies

4. Shell Programming and Scripting

extracting Line between HTML tag

Hi everyone: I want to extract string which is in between certain html tag. e.g. I tried with grep,cut, awk but could not find exact syntax for this one. :wall: PS>Sorry about bad english. (8 Replies)
Discussion started by: newlook2011
8 Replies

5. Shell Programming and Scripting

how to delete certain java script from html files using sed

I am cleaning forum posts to convert them in offline reading version with clean html text. All files are with html extension and reside in one folder. There is some java script i would like to remove, which looks like <script LANGUAGE="JavaScript1.1"> <!-- function mMz() { var mPz = "";... (2 Replies)
Discussion started by: georgi58
2 Replies

6. Shell Programming and Scripting

Add the html tag first and last line the file

Hi, i have 30 html files and i want to add the html tag first (<html>) and end of the line </html> tag..How to do it in script. Thanks, (7 Replies)
Discussion started by: bmk
7 Replies

7. Shell Programming and Scripting

Extracting a string from html tag

Hi I am new to string extractions in shell script... I am trying to extract a string such as #1753 from html tag looks like below. <a class="model-link tl-tr" href="lastSuccessfulBuild/">Last successful build (#1753), 40 min ago</a> and want the value as 1753 Could someone help me to... (3 Replies)
Discussion started by: hicharbo
3 Replies

8. Shell Programming and Scripting

Search for a html tag and print the entire tag

I want to print from <fruits> to </fruits> tag which have <fruit> as mango. Also i want both <fruits> and </fruits> in output. Please help eg. <fruits> <fruit id="111">mango<fruit> . another 20 lines . </fruits> (3 Replies)
Discussion started by: Ashik409
3 Replies

9. Shell Programming and Scripting

Print Value between desired html tag

Hi, I have a html line as below :-... (6 Replies)
Discussion started by: satishmallidi
6 Replies

10. UNIX for Beginners Questions & Answers

Multiline html tag parse shell script

Hello, I want to parse the contents of a multiline html tag ex: <html> <body> <p>some other text</p> <div> <p class="margin-bottom-0"> text1 <br> text2 <br> <br> text3 </p> </div> </body> (15 Replies)
Discussion started by: SorcRR
15 Replies
HTML::FormatText(3)					User Contributed Perl Documentation				       HTML::FormatText(3)

NAME
HTML::FormatText - Format HTML as plaintext SYNOPSIS
require HTML::TreeBuilder; $tree = HTML::TreeBuilder->new->parse_file("test.html"); require HTML::FormatText; $formatter = HTML::FormatText->new(leftmargin => 0, rightmargin => 50); print $formatter->format($tree); DESCRIPTION
The HTML::FormatText is a formatter that outputs plain latin1 text. All character attributes (bold/italic/underline) are ignored. Formatting of HTML tables and forms is not implemented. You might specify the following parameters when constructing the formatter: leftmargin (alias lm) The column of the left margin. The default is 3. rightmargin (alias rm) The column of the right margin. The default is 72. SEE ALSO
HTML::Formatter COPYRIGHT
Copyright (c) 1995-2002 Gisle Aas, and 2002- Sean M. Burke. All rights reserved. This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself. This program is distributed in the hope that it will be useful, but without any warranty; without even the implied warranty of merchantability or fitness for a particular purpose. AUTHOR
Current maintainer: Sean M. Burke <sburke@cpan.org> Original author: Gisle Aas <gisle@aas.no> perl v5.12.1 2004-06-02 HTML::FormatText(3)
All times are GMT -4. The time now is 10:23 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy