Extract URL from RSS Feed in AWK Post: 302449047


XML::Feed(3pm)						User Contributed Perl Documentation					    XML::Feed(3pm)

NAME

       XML::Feed - Syndication feed parser and auto-discovery

SYNOPSIS

	   use XML::Feed;
	   my $feed = XML::Feed->parse(URI->new('http://example.com/atom.xml'))
	       or die XML::Feed->errstr;
	   print $feed->title, "\n";
	   for my $entry ($feed->entries) {
	       # each $entry is an XML::Feed::Entry object
	   }

	   ## Find all of the syndication feeds on a given page, using
	   ## auto-discovery.
	   my @feeds = XML::Feed->find_feeds('http://example.com/');

DESCRIPTION

       XML::Feed is a syndication feed parser for both RSS and Atom feeds. It also implements feed auto-discovery for finding feeds, given a URI.

       XML::Feed supports the following syndication feed formats:

       o   RSS 0.91

       o   RSS 1.0

       o   RSS 2.0

       o   Atom

       The goal of XML::Feed is to provide a unified API for parsing and using the various syndication formats. The different flavors of RSS and
       Atom handle data in different ways: date handling; summaries and content; escaping and quoting; etc. This module attempts to remove those
       differences by providing a wrapper around the formats and the classes implementing those formats (XML::RSS and XML::Atom::Feed). For
       example, dates are handled differently in each of the above formats. To provide a unified API for date handling, XML::Feed converts all
       date formats transparently into DateTime objects, which it then returns to the caller.
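
       The date handling described above can be sketched as follows. This is a minimal illustration, not part of the original
       manual page: the inline RSS document is made-up sample data, and the issued accessor is assumed from the
       XML::Feed::Entry documentation rather than from this page.

```perl
use strict;
use warnings;
use XML::Feed;

# Hypothetical sample feed, inlined so the sketch needs no network access.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>Sample data</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>Hello</description>
      <pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate>
    </item>
  </channel>
</rss>
XML

my $feed = XML::Feed->parse(\$xml)
    or die XML::Feed->errstr;

# Take the first entry and ask for its publication date.
# issued (assumed from XML::Feed::Entry) returns a DateTime object,
# or undef when the item carries no date.
my ($entry) = $feed->entries;
my $issued  = $entry->issued;

print ref($issued), " ", $issued->ymd, "\n" if $issued;
```

       The same accessor is used regardless of whether the underlying feed is RSS or Atom, which is the point of the unified API.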

USAGE

   XML::Feed->new($format)
       Creates a new empty XML::Feed object using the format $format.

	   $feed = XML::Feed->new('Atom');
	   $feed = XML::Feed->new('RSS');
	   $feed = XML::Feed->new('RSS', version => '0.91');

   XML::Feed->parse($stream)
   XML::Feed->parse($stream, $format)
       Parses a syndication feed identified by $stream and returns an XML::Feed object. $stream can be any one of the following:

       o   Scalar reference

	   A reference to a string containing the XML body of the feed.

       o   Filehandle

	   An open filehandle from which the feed XML will be read.

       o   File name

	   The name of a file containing the feed XML.

       o   URI object

	   A URI from which the feed XML will be retrieved.

       $format allows you to override format guessing.
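
       As a rough sketch of the scalar-reference form, the following parses an in-memory RSS document and prints the URL
       of each item. The feed text is hypothetical sample data; the link accessor on entries is the same method the
       VALID FEEDS script uses as a setter, here called without arguments to read the value back.

```perl
use strict;
use warnings;
use XML::Feed;

# Hypothetical sample feed with two items.
my $xml = <<'XML';
<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0">
  <channel>
    <title>Example Feed</title>
    <link>http://example.com/</link>
    <description>Sample data</description>
    <item>
      <title>First post</title>
      <link>http://example.com/first</link>
      <description>One</description>
    </item>
    <item>
      <title>Second post</title>
      <link>http://example.com/second</link>
      <description>Two</description>
    </item>
  </channel>
</rss>
XML

# Pass a reference to the string, not the string itself;
# a plain string would be treated as a file name.
my $feed = XML::Feed->parse(\$xml)
    or die XML::Feed->errstr;

# Collect the URL of every entry in the feed.
my @links = map { $_->link } $feed->entries;

print "$_\n" for @links;
```

       To read from a file or a remote feed instead, pass a file name or a URI object in place of \$xml.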

   XML::Feed->find_feeds($uri)
       Given a URI $uri, use auto-discovery to find all of the feeds linked from that page (using <link> tags).

       Returns a list of feed URIs.

   XML::Feed->identify_format($xml)
       Given the XML of a feed, returns the format it is in ("Atom", or some version of "RSS").

   $feed->convert($format)
       Converts the XML::Feed object into the $format format, and returns the new object.

   $feed->splice($other_feed)
       Splices in all of the entries from the feed $other_feed into $feed, skipping posts that are already in $feed.

   $feed->format
       Returns the format of the feed ("Atom", or some version of "RSS").

   $feed->title([ $title ])
       The title of the feed/channel.

   $feed->base([ $base ])
       The url base of the feed/channel.

   $feed->link([ $uri ])
       The permalink of the feed/channel.

   $feed->tagline([ $tagline ])
       The description or tagline of the feed/channel.

   $feed->description([ $description ])
       Alias for $feed->tagline.

   $feed->author([ $author ])
       The author of the feed/channel.

   $feed->language([ $language ])
       The language of the feed.

   $feed->copyright([ $copyright ])
       The copyright notice of the feed.

   $feed->modified([ $modified ])
       A DateTime object representing the last-modified date of the feed.

       If present, $modified should be a DateTime object.

   $feed->generator([ $generator ])
       The generator of the feed.

   $feed->self_link([ $uri ])
       The Atom Self-link of the feed:

       <http://validator.w3.org/feed/docs/warning/MissingAtomSelfLink.html>

       A string.

   $feed->entries
       A list of the entries/items in the feed. Returns an array containing XML::Feed::Entry objects.

   $feed->items
       A synonym (alias) for $feed->entries.

   $feed->add_entry($entry)
       Adds an entry to the feed. $entry should be an XML::Feed::Entry object in the correct format for the feed.

   $feed->as_xml
       Returns an XML representation of the feed, in the format determined by the current format of the $feed object.

PACKAGE VARIABLES

       $XML::Feed::Format::RSS::PREFERRED_PARSER
	   To use an RSS parser class other than XML::RSS (the default), set the $PREFERRED_PARSER variable in the
	   XML::Feed::Format::RSS package.

	       $XML::Feed::Format::RSS::PREFERRED_PARSER = "XML::RSS::LibXML";

	   Note: this will only work for parsing feeds, not creating feeds.

	   Note: Only "XML::RSS::LibXML" version 0.3004 is known to work at the moment.

       $XML::Feed::MULTIPLE_ENCLOSURES
	   Although the RSS specification states that there can be at most one enclosure per item, some feeds break this rule.

	   If this variable is set then "XML::Feed" captures all of them and makes them available as a list.

	   Otherwise it returns the last enclosure parsed.

	   Note: "XML::RSS" version 1.44 is needed for this to work.

VALID FEEDS

       For reference, this CGI script will create valid, albeit nonsensical, feeds (according to "http://feedvalidator.org" anyway) for
       Atom 1.0 and RSS 0.90, 0.91, 1.0 and 2.0.

	   #!perl -w

	   use strict;
	   use CGI;
	   use CGI::Carp qw(fatalsToBrowser);
	   use DateTime;
	   use XML::Feed;

	   my $cgi  = CGI->new;
	   my @args = ( $cgi->param('format') || "Atom" );
	   push @args, ( version => $cgi->param('version') ) if $cgi->param('version');

	   my $feed = XML::Feed->new(@args);
	   $feed->id("http://".time.rand()."/");
	   $feed->title('Test Feed');
	   $feed->link($cgi->url);
	   $feed->self_link($cgi->url( -query => 1, -full => 1, -rewrite => 1) );
	   $feed->modified(DateTime->now);

	   my $entry = XML::Feed::Entry->new();
	   $entry->id("http://".time.rand()."/");
	   $entry->link("http://example.com");
	   $entry->title("Test entry");
	   $entry->summary("Test summary");
	   $entry->content("Foo");
	   $entry->modified(DateTime->now);
	   $entry->author('test@example.com (Testy McTesterson)');
	   $feed->add_entry($entry);

	   my $mime = ("Atom" eq $feed->format) ? "application/atom+xml" : "application/rss+xml";
	   print $cgi->header($mime);
	   print $feed->as_xml;

LICENSE

       XML::Feed is free software; you may redistribute it and/or modify it under the same terms as Perl itself.

AUTHOR & COPYRIGHT

       Except where otherwise noted, XML::Feed is Copyright 2004-2008 Six Apart, cpan@sixapart.com. All rights reserved.

SUBVERSION

       The latest version of XML::Feed can be found at

	   http://code.sixapart.com/svn/XML-Feed/trunk/

perl v5.14.2							    2012-03-21							    XML::Feed(3pm)