To Break data out of HTML
Forum: Shell Programming and Scripting
Post 302197002 by era on Tuesday 20th of May 2008 03:23:49 AM
If you really want a "best way", that would be a proper HTML parser.

Assuming you wish to stay with something lighter, like sed or awk, perhaps you can elaborate on what is wrong with the sed you have tried so far. (The <font> tag in your example script does not occur in the HTML sample you posted, but I guess that's beside the point here.)
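
For what it's worth, here is a minimal sketch of the parser route, using Python's standard html.parser module. The sample table and the names TableText and table_rows are made up for illustration; the HTML from the original question is not reproduced in this post, so the table-shaped input is an assumption. The idea is to let the parser find the tags and only collect the text inside each cell.

```python
from html.parser import HTMLParser


class TableText(HTMLParser):
    """Collect the text of each <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()  # default convert_charrefs=True: entities arrive decoded
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row being read, or None outside a row
        self._cell = None   # text fragments of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            # Join the fragments and collapse runs of whitespace.
            self._row.append(" ".join("".join(self._cell).split()))
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
            self._cell = None

    def handle_data(self, data):
        # Text outside a cell is ignored; nested tags such as <a> are skipped.
        if self._cell is not None:
            self._cell.append(data)


def table_rows(html_text):
    parser = TableText()
    parser.feed(html_text)
    parser.close()
    return parser.rows


# Hypothetical input, not the HTML from the original thread.
SAMPLE = """
<table>
  <tr><th>Name</th><th>Status</th></tr>
  <tr><td><a href="detail?id=1">3D Home
      Architect 4</a></td><td>Approved &amp; tested</td></tr>
</table>
"""

for row in table_rows(SAMPLE):
    print("|".join(row))
```

Unlike a line-oriented sed script, this does not care whether a cell is split across lines or wrapped in extra tags, which is usually where the sed approach starts to hurt.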
 

HTML::RewriteAttributes(3pm)				User Contributed Perl Documentation			      HTML::RewriteAttributes(3pm)

NAME
       HTML::RewriteAttributes - concise attribute rewriting

SYNOPSIS
           $html = HTML::RewriteAttributes->rewrite($html, sub {
               my ($tag, $attr, $value) = @_;

               # delete any attribute that mentions..
               return if $value =~ /COBOL/i;

               $value =~ s/rocks/rules/g;
               return $value;
           });

           # writing some HTML email I see..
           $html = HTML::RewriteAttributes::Resources->rewrite($html, sub {
               my $uri = shift;
               my $content = render_template($uri);
               my $cid = generate_cid_from($content);
               $mime->attach($cid => $content);
               return "cid:$cid";
           });

           # up for some HTML::ResolveLink?
           $html = HTML::RewriteAttributes::Links->rewrite($html, "http://search.cpan.org");

           # or perhaps HTML::LinkExtor?
           HTML::RewriteAttributes::Links->rewrite($html, sub {
               my ($tag, $attr, $value) = @_;
               push @links, $value;
               $value;
           });

DESCRIPTION
       "HTML::RewriteAttributes" is designed for simple yet powerful HTML attribute rewriting.

       You simply specify a callback to run for each attribute and we do the rest for you.

       This module is designed to be subclassable to make handling special cases easier. See the source for methods you can override.

METHODS
   "new"
       You don't need to call "new" explicitly - it's done in "rewrite". It takes no arguments.

   "rewrite" HTML, callback -> HTML
       This is the main interface of the module. You pass in some HTML and a callback, the callback is invoked potentially many times, and you get back some similar HTML.

       The callback receives as arguments the tag name, the attribute name, and the attribute value (though subclasses may override this -- HTML::RewriteAttributes::Resources does). Return "undef" to remove the attribute, or any other value to set the value of the attribute.

SEE ALSO
       HTML::Parser, HTML::ResolveLink, Email::MIME::CreateHTML, HTML::LinkExtor

THANKS
       Some code was inspired by, and tests borrowed from, Miyagawa's HTML::ResolveLink.

AUTHOR
       Shawn M Moore, "<sartak@bestpractical.com>"

LICENSE
       Copyright 2008-2010 Best Practical Solutions, LLC. HTML::RewriteAttributes is distributed under the same terms as Perl itself.

perl v5.10.1                          2010-11-18                          HTML::RewriteAttributes(3pm)
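
The callback contract described under METHODS above (tag name, attribute name and attribute value go in; "undef" drops the attribute, anything else becomes the new value) can be mimicked outside Perl. What follows is a rough Python analogue built on the standard library's html.parser. It is not the module's implementation or API: AttributeRewriter, rewrite and drop_cobol are hypothetical names chosen for this sketch, None stands in for Perl's "undef", and the output is re-serialized rather than preserved byte for byte.

```python
from html import escape
from html.parser import HTMLParser


class AttributeRewriter(HTMLParser):
    """Re-emit HTML, passing every attribute value through a callback."""

    def __init__(self, callback):
        # convert_charrefs=False so entities in text are passed through as-is.
        super().__init__(convert_charrefs=False)
        self.callback = callback
        self.out = []

    def _open_tag(self, tag, attrs, closer):
        parts = [tag]
        for name, value in attrs:
            if value is None:          # valueless attribute, e.g. <input disabled>
                parts.append(name)
                continue
            new_value = self.callback(tag, name, value)
            if new_value is None:      # None plays the role of Perl's undef
                continue
            parts.append(f'{name}="{escape(new_value, quote=True)}"')
        self.out.append("<" + " ".join(parts) + closer)

    def handle_starttag(self, tag, attrs):
        self._open_tag(tag, attrs, ">")

    def handle_startendtag(self, tag, attrs):
        self._open_tag(tag, attrs, " />")

    def handle_endtag(self, tag):
        self.out.append(f"</{tag}>")

    def handle_data(self, data):
        self.out.append(data)

    def handle_entityref(self, name):
        self.out.append(f"&{name};")

    def handle_charref(self, name):
        self.out.append(f"&#{name};")

    def handle_comment(self, data):
        self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        self.out.append(f"<!{decl}>")


def rewrite(html_text, callback):
    parser = AttributeRewriter(callback)
    parser.feed(html_text)
    parser.close()
    return "".join(parser.out)


def drop_cobol(tag, attr, value):
    # Mirrors the first SYNOPSIS callback: drop attributes mentioning COBOL,
    # otherwise replace "rocks" with "rules".
    if "cobol" in value.lower():
        return None
    return value.replace("rocks", "rules")


print(rewrite('<p class="perl rocks" title="COBOL">hi</p>', drop_cobol))
```

Note that html.parser lowercases tag and attribute names and hands the callback already-decoded attribute values, which is why the sketch escapes them again on output.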