Shell Programming and Scripting: Extract URL from RSS Feed in AWK. Post 302449057 by rdcwayx on Saturday 28th of August 2010 06:11:29 AM
Code:
RS=FS, good idea
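
The code this reply praises is not shown here, so the following is only a minimal sketch of what setting RS equal to FS can look like when pulling URLs out of a feed. It assumes the URLs appear as element text, for example `<link>http://...</link>`; the function name `extract_urls` and the sample feed are hypothetical. With both separators set to ">", every chunk that ends at a tag's closing bracket becomes its own one-field record, so a URL that directly follows a tag starts a record.

```shell
# Hypothetical helper: print URLs that appear as element text in an RSS feed.
# Reads the named files, or standard input when no file is given.
extract_urls() {
  awk 'BEGIN { RS = FS = ">" }   # each ">"-terminated chunk is one record
       /^https?:\/\// {          # record begins with a URL scheme
         sub(/<.*/, "")          # drop the closing tag that trails the URL
         print
       }' "$@"
}

# Usage with a made-up two-item feed:
printf '%s\n' \
  '<item><title>One</title><link>http://example.com/a</link></item>' \
  '<item><title>Two</title><link>https://example.com/b</link></item>' |
  extract_urls
# prints:
# http://example.com/a
# https://example.com/b
```

A single-character RS keeps this portable across awk implementations. It does not handle URLs stored in attributes or CDATA sections, and any element text that happens to start with "http://" is printed too.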

 

XML::RSS::Headline(3pm)        User Contributed Perl Documentation        XML::RSS::Headline(3pm)

NAME
       XML::RSS::Headline - Persistent XML RSS Encapsulation

VERSION
       2.2

SYNOPSIS
       Headline object to encapsulate the headline/URL combination of an RSS feed. It provides a unique id either by way of the URL or by doing an MD5 checksum on the headline (when URL uniqueness fails).

CONSTRUCTOR
       XML::RSS::Headline->new( headline => $headline, url => $url )
       XML::RSS::Headline->new( item => $item )
           An XML::RSS::Headline object can be initialized either with headline/url or with a parsed XML::RSS item structure. The argument 'headline_as_id' is optional and takes a boolean as its value.

METHODS
       $headline->id
           The id is our unique identifier for a headline/url combination. It's how we can keep track of which headlines we have seen before and which ones are new. The id is either the URL or an MD5 checksum generated from the headline text (if $headline->headline_as_id is true).

       $headline->multiline_headline
           This method returns the headline as either an array or an array reference, based on context. It splits the headline on newline characters into the array.

       $headline->item( $item )
           Init the object for a parsed RSS item returned by XML::RSS.

       $headline->set_first_seen
       $headline->set_first_seen( Time::HiRes::time() )
           Set the time of when the headline was first seen. If you pass in a value it will be used; otherwise Time::HiRes::time() is called.

       $headline->first_seen
           The time (in epoch seconds) of when the headline was first seen.

       $headline->first_seen_hires
           The time (in epoch seconds and milliseconds) of when the headline was first seen.

GET/SET ACCESSOR METHODS
       $headline->headline
       $headline->headline( $headline )
           The RSS headline/title. HTML::Entities::decode_entities is used when the headline is set (not sure why XML::RSS doesn't do this).

       $headline->url
       $headline->url( $url )
           The RSS link/url. URI->canonical is called to attempt to normalize the URL.

       $headline->description
       $headline->description( $description )
           The description of the RSS headline.

       $headline->headline_as_id
       $headline->headline_as_id( $bool )
           A bool value that determines whether the URL will be the unique identifier or whether an MD5 checksum of the RSS title will be used instead (when the URL doesn't provide absolute uniqueness or changes within the RSS feed). This is used in extreme cases when URLs aren't always unique to new headlines (Use Perl Journals) and when URLs change within an RSS feed (www.debianplanet.org / debianplanet.org / search.cpan.org, search.cpan.org:80).

       $headline->timestamp
       $headline->timestamp( Time::HiRes::time() )
           A high resolution timestamp that is set using Time::HiRes::time() when the object is created.

AUTHOR
       Jeff Bisbee, "<jbisbee at cpan.org>"

BUGS
       Please report any bugs or feature requests to "bug-xml-rss-feed at rt.cpan.org", or through the web interface at <http://rt.cpan.org/NoAuth/ReportBug.html?Queue=XML-RSS-Feed>. I will be notified, and then you'll automatically be notified of progress on your bug as I make changes.

SUPPORT
       You can find documentation for this module with the perldoc command.

           perldoc XML::RSS::Headline

       You can also look for information at:

       * AnnoCPAN: Annotated CPAN documentation
           <http://annocpan.org/dist/XML-RSS-Feed>

       * CPAN Ratings
           <http://cpanratings.perl.org/d/XML-RSS-Feed>

       * RT: CPAN's request tracker
           <http://rt.cpan.org/NoAuth/Bugs.html?Dist=XML-RSS-Feed>

       * Search CPAN
           <http://search.cpan.org/dist/XML-RSS-Feed>

ACKNOWLEDGEMENTS
       Special thanks to Rocco Caputo, Martijn van Beers, Sean Burke, Prakash Kailasa and Randal Schwartz for their help, guidance, patience, and bug reports. Guys, thanks for actually taking time to use the code and give good, honest feedback.

COPYRIGHT & LICENSE
       Copyright 2006 Jeff Bisbee, all rights reserved.

       This program is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

SEE ALSO
       XML::RSS::Feed, XML::RSS::Headline::PerlJobs, XML::RSS::Headline::Fark, XML::RSS::Headline::UsePerlJournals, POE::Component::RSSAggregator

perl v5.8.8                        2006-07-17                        XML::RSS::Headline(3pm)
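
The id rule documented for $headline->id above (the URL by default, an MD5 checksum of the headline when headline_as_id is true) can be sketched in shell. This is an illustration of the documented behaviour only, not the module's code: `headline_id` is a hypothetical helper, `md5sum` is assumed to be available (GNU coreutils), and the URL normalization and entity decoding the module performs are not modeled, so the digests need not match the module's for every input.

```shell
# Hypothetical helper mirroring the documented id rule.
# usage: headline_id URL HEADLINE [HEADLINE_AS_ID]
headline_id() {
  url=$1 headline=$2 as_id=${3:-0}
  if [ "$as_id" = 1 ]; then
    # headline_as_id true: id is the MD5 checksum of the headline text
    printf '%s' "$headline" | md5sum | cut -d' ' -f1
  else
    # default: the URL itself is the id
    printf '%s\n' "$url"
  fi
}

headline_id 'http://example.com/a' 'hello'      # prints http://example.com/a
headline_id 'http://example.com/a' 'hello' 1    # prints 5d41402abc4b2a76b9719d911017c592
```

Hashing the headline trades one failure mode for another: ids stay stable when a feed rewrites its URLs, but two distinct items with the same headline collapse into one id.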