www::robotrules::anydbm_file(3) [redhat man page]

WWW::RobotRules::AnyDBM_File(3) 			User Contributed Perl Documentation			   WWW::RobotRules::AnyDBM_File(3)

NAME
WWW::RobotRules::AnyDBM_File - Persistent RobotRules

SYNOPSIS
    require WWW::RobotRules::AnyDBM_File;
    require LWP::RobotUA;

    # Create a robot user agent that uses a disk-caching RobotRules object
    my $rules = WWW::RobotRules::AnyDBM_File->new('my-robot/1.0', 'cachefile');
    my $ua = LWP::RobotUA->new('my-robot/1.0', 'me@foo.com', $rules);

    # Then just use $ua as usual
    $res = $ua->request($req);

DESCRIPTION
This is a subclass of WWW::RobotRules that uses the AnyDBM_File package to implement persistent disk caching of robots.txt and host visit information.

The constructor (the new() method) takes an extra argument specifying the name of the DBM file to use. If the DBM file already exists, you can pass undef as the agent name, because the name can then be read from the DBM database.

SEE ALSO
WWW::RobotRules, LWP::RobotUA

AUTHORS
Hakan Ardo <hakan@munin.ub2.lu.se>, Gisle Aas <aas@sn.no>

libwww-perl-5.65 2001-10-26 WWW::RobotRules::AnyDBM_File(3)
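
EXAMPLE
The persistent caching described for WWW::RobotRules::AnyDBM_File can be tried without any network access. The following is a minimal sketch, not part of the original page. It assumes the module is installed, and it uses the parse() and allowed() methods inherited from WWW::RobotRules. The host www.example.com, the inline robots.txt text, and the temporary cache file name are all made up for illustration.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);
use WWW::RobotRules::AnyDBM_File;

# Hypothetical cache location inside a throwaway directory.
my $dir   = tempdir(CLEANUP => 1);
my $cache = "$dir/robot-cache";

# Illustrative robots.txt content; a real robot would fetch this.
my $robots_txt = <<'EOT';
User-agent: *
Disallow: /private/
EOT

{
    # First run: name the agent and store the parsed rules on disk.
    my $rules = WWW::RobotRules::AnyDBM_File->new('my-robot/1.0', $cache);
    $rules->parse('http://www.example.com/robots.txt', $robots_txt);
}   # $rules is destroyed here, which should release the DBM file

# Later run: the DBM file exists, so undef is accepted as the agent name.
my $rules   = WWW::RobotRules::AnyDBM_File->new(undef, $cache);
my $public  = $rules->allowed('http://www.example.com/index.html');
my $private = $rules->allowed('http://www.example.com/private/data.html');

print "index.html allowed:        $public\n";
print "private/data.html allowed: $private\n";
```

Which file or files actually appear on disk depends on the DBM implementation that AnyDBM_File selects on the local system.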

LWP::RobotUA(3) 					User Contributed Perl Documentation					   LWP::RobotUA(3)

NAME
LWP::RobotUA - A class for Web Robots

SYNOPSIS
    require LWP::RobotUA;
    $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com');
    $ua->delay(10);  # be very nice, go slowly (the delay is in minutes)
    ...
    # use it just like a normal LWP::UserAgent
    $res = $ua->request($req);

DESCRIPTION
This class implements a user agent that is suitable for robot applications. Robots should be nice to the servers they visit. They should consult the /robots.txt file to ensure that they are welcome, and they should not make requests too frequently.

Before you consider writing a robot, take a look at <URL:http://info.webcrawler.com/mak/projects/robots/robots.html>.

When you use an LWP::RobotUA as your user agent, you do not really have to think about these things yourself. Just send requests as you would with a normal LWP::UserAgent, and this special agent will make sure you are nice.

METHODS
LWP::RobotUA is a subclass of LWP::UserAgent and implements the same methods. In addition, the following methods are provided:

$ua = LWP::RobotUA->new($agent_name, $from, [$rules])
    The constructor requires your robot's name and the mail address of the human responsible for the robot (i.e. you). Optionally, it lets you specify the WWW::RobotRules object to use.

$ua->delay([$minutes])
    Set the minimum delay, in minutes, between requests to the same server. The default is 1 minute.

$ua->use_sleep([$boolean])
    Get/set a value indicating whether the UA should sleep() if requests arrive too fast (before $ua->delay minutes have passed). The default is TRUE. If this value is FALSE, an internal SERVICE_UNAVAILABLE response is generated instead. It carries a Retry-After header that indicates when it is OK to send another request to this server.

$ua->rules([$rules])
    Set/get which WWW::RobotRules object to use.

$ua->no_visits($netloc)
    Returns the number of documents fetched from this server host. Yes I know, this method should probably have been named num_visits() or something like that. :-(

$ua->host_wait($netloc)
    Returns the number of seconds (from now) you must wait before you can make a new request to this host.

$ua->as_string
    Returns a string that describes the state of the UA. Mainly useful for debugging.

SEE ALSO
LWP::UserAgent, WWW::RobotRules

COPYRIGHT
Copyright 1996-2000 Gisle Aas.

This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself.

libwww-perl-5.65 2001-04-27 LWP::RobotUA(3)
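
EXAMPLE
The LWP::RobotUA methods described under METHODS can be exercised offline by handing the constructor a rules object that already holds parsed robots.txt data. This is a rough sketch rather than a recommended crawler setup. It assumes libwww-perl is installed and that the rules are filled in with parse() from WWW::RobotRules. The host www.example.com and the robots.txt text are invented for illustration. A request for a disallowed path is expected to be refused locally, so nothing is sent to the server.

```perl
use strict;
use warnings;
use HTTP::Request;
use LWP::RobotUA;
use WWW::RobotRules;

# Pre-load rules so the agent has no need to fetch robots.txt itself.
my $rules = WWW::RobotRules->new('my-robot/0.1');
$rules->parse('http://www.example.com/robots.txt',
              "User-agent: *\nDisallow: /private/\n");

my $ua = LWP::RobotUA->new('my-robot/0.1', 'me@foo.com', $rules);
$ua->delay(1);       # at least 1 minute between requests to one host
$ua->use_sleep(0);   # return an error response instead of sleeping

# Per-host bookkeeping; no request has been made to this host yet.
my $netloc = 'www.example.com:80';
my $wait   = $ua->host_wait($netloc);
my $visits = $ua->no_visits($netloc) || 0;
print "seconds to wait: $wait, documents fetched: $visits\n";

# This path is disallowed by the rules loaded above.
my $req = HTTP::Request->new(GET => 'http://www.example.com/private/data.html');
my $res = $ua->request($req);
print "response: ", $res->status_line, "\n";
```

With use_sleep(0), a caller that sends requests faster than the configured delay has to check for the SERVICE_UNAVAILABLE response and its Retry-After header, or consult host_wait() before each request.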