Sponsored Content
Top Forums Shell Programming and Scripting SED to extract HTML text data, not quite right! Post 302391181 by lagagnon on Saturday 30th of January 2010 10:53:33 PM
Old 01-30-2010
SED to extract HTML text data, not quite right!

I am attempting to extract weather data from the following website, but for the Victoria area only:

Text Forecasts - Environment Canada

I use this:

Code:
 sed -n "/Greater Victoria./,/Fraser Valley./p"

But that phrasing does not sometimes get it all and think perhaps the website has more than one linefeed, carriage return, whatever, that messes up my coding. Any ideas appreciated.

Larry
 

10 More Discussions You Might Find Interesting

1. UNIX for Dummies Questions & Answers

How do I extract text only from html file without HTML tag

I have a html file called myfile. If I simply put "cat myfile.html" in UNIX, it shows all the html tags like <a href=r/26><img src="http://www>. But I want to extract only text part. Same problem happens in "type" command in MS-DOS. I know you can do it by opening it in Internet Explorer,... (4 Replies)
Discussion started by: los111
4 Replies

2. UNIX for Dummies Questions & Answers

extract data from html tables

hi i need to use unix to extract data from several rows of a table coded in html. I know that rows within a table have the tags <tr> </tr> and so i thought that my first step should be to to delete all of the other html code which is not contained within these tags. i could then use this method... (8 Replies)
Discussion started by: Streetrcr
8 Replies

3. UNIX for Advanced & Expert Users

sed to extract HTML content

Hiya, I am trying to extract a news article from a web page. The sed I have written brings back a lot of Javascript code and sometimes advertisments too. Can anyone please help with this one ??? I need to fix this sed so it picks up the article ONLY (don't worry about the title or date .. i got... (2 Replies)
Discussion started by: stargazerr
2 Replies

4. Shell Programming and Scripting

Extract URLs from HTML code using sed

Hello, i try to extract urls from google-search-results, but i have problem with sed filtering of html-code. what i wont is just list of urls thay apears between ........<p><a href=" and next following " in html code. here is my code, i use wget and pipelines to filtering. wget works, but... (13 Replies)
Discussion started by: L0rd
13 Replies

5. Shell Programming and Scripting

extract data with awk from html files

Hello everyone, I'm new to this forum and i am new as a shell scripter. my problem is to have html files in a directory and I would like to extract from these some data that lies between two different lines Here's my situation <td align="default"> oxidizability (mg / l): data_to_extract... (6 Replies)
Discussion started by: sbobotex
6 Replies

6. Shell Programming and Scripting

help with sed needed to extract content from html tags

Hi I've searched for it for few hours now and i can't seem to find anything working like i want. I've got webpage, saved in file par with form like this: <html><body><form name='sendme' action='http://example.com/' method='POST'> <textarea name='1st'>abc123def678</textarea> <textarea... (9 Replies)
Discussion started by: seb001
9 Replies

7. Shell Programming and Scripting

extract complex data from html table rows

I have bash, awk, and sed available on my portable device. I need to extract 10 fields from each table row from a web page that looks like this: </tr> <tr> <td>28 Apr</td> <td><a... (6 Replies)
Discussion started by: rickgtx
6 Replies

8. Shell Programming and Scripting

awk -- Extract data from html within multiple tags as reference

Hi, I'm trying to get some data from an html file, but the problem is before it can extract the information I have multiple patterns that need to be passed through. https://www.unix.com/shell-programming-scripting/150711-extract-data-awk-html-files.html Is a similar problem. The only... (5 Replies)
Discussion started by: counfhou
5 Replies

9. Shell Programming and Scripting

Awk/sed HTML extract

I'm extracting text between table tags in HTML <th><a href="/wiki/Buick_LeSabre" title="Buick LeSabre">Buick LeSabre</a></th> using this: awk -F "</*th>" '/<\/*th>/ {print $2}' auto2 > auto3 then this (text between a href): sed -e 's/\(<*>\)//g' auto3 > auto4 How to shorten this into one... (8 Replies)
Discussion started by: p1ne
8 Replies

10. Shell Programming and Scripting

Extract text from html using perl or awk

I am trying to extract text after keywords fron an html file. The keywords are reportLink":, "barcodedSamples": {", "barcodedSamples": {". Both the perl and awk run but the output is just the entire index.html not the desired output. Also for the reportLink": only the text after the second / until... (5 Replies)
Discussion started by: cmccabe
5 Replies
URI::URL(3)						User Contributed Perl Documentation					       URI::URL(3)

NAME
URI::URL - Uniform Resource Locators SYNOPSIS
$u1 = URI::URL->new($str, $base); $u2 = $u1->abs; DESCRIPTION
This module is provided for backwards compatibility with modules that depend on the interface provided by the "URI::URL" class that used to be distributed with the libwww-perl library. The following differences exist compared to the "URI" class interface: o The URI::URL module exports the url() function as an alternate constructor interface. o The constructor takes an optional $base argument. The "URI::URL" class is a subclass of "URI::WithBase". o The URI::URL->newlocal class method is the same as URI::file->new_abs. o URI::URL::strict(1) o $url->print_on method o $url->crack method o $url->full_path: same as ($uri->abs_path || "/") o $url->netloc: same as $uri->authority o $url->epath, $url->equery: same as $uri->path, $uri->query o $url->path and $url->query pass unescaped strings. o $url->path_components: same as $uri->path_segments (if you don't consider path segment parameters) o $url->params and $url->eparams methods o $url->base method. See URI::WithBase. o $url->abs and $url->rel have an optional $base argument. See URI::WithBase. o $url->frag: same as $uri->fragment o $url->keywords: same as $uri->query_keywords o $url->localpath and friends map to $uri->file. o $url->address and $url->encoded822addr: same as $uri->to for mailto URI o $url->groupart method for news URI o $url->article: same as $uri->message SEE ALSO
URI, URI::WithBase COPYRIGHT
Copyright 1998-2000 Gisle Aas. perl v5.18.2 2012-02-11 URI::URL(3)
All times are GMT -4. The time now is 06:50 AM.
Unix & Linux Forums Content Copyright 1993-2022. All Rights Reserved.
Privacy Policy