Bug#618281: ITP: libwww-robotrules-perl -- database of robots.txt-derived permissions
Package: wnpp
Owner: Nicholas Bamber <nicholas@periapt.co.uk>
Severity: wishlist
X-Debbugs-CC: debian-devel@lists.debian.org,debian-perl@lists.debian.org
* Package name    : libwww-robotrules-perl
  Version         : 6.01
  Upstream Author : Gisle Aas <gisle@activestate.com>
* URL             : http://search.cpan.org/dist/WWW-RobotRules/
* License         : Artistic or GPL-1+
  Programming Lang: Perl
  Description     : database of robots.txt-derived permissions
WWW::RobotRules parses /robots.txt files as specified in "A Standard for
Robot Exclusion", at <http://www.robotstxt.org/wc/norobots.html>. Webmasters
can use the /robots.txt file to forbid conforming robots from accessing
parts of their web site.
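For instance, a /robots.txt that asks all conforming robots to stay out
of a /private/ area might look like this (the path is illustrative):

    User-agent: *
    Disallow: /private/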
The parsed files are kept in a WWW::RobotRules object, and this object
provides methods to check if access to a given URL is prohibited. The same
WWW::RobotRules object can be used for one or more parsed /robots.txt files
on any number of hosts.
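A minimal sketch of typical use, based on the module's documented
new/parse/allowed interface (the robot name and URLs are illustrative):

    use strict;
    use warnings;
    use WWW::RobotRules;

    # The robot name decides which User-agent sections apply to us.
    my $rules = WWW::RobotRules->new('ExampleBot/1.0');

    # A sample /robots.txt body; in practice it would be fetched from
    # the host, e.g. with LWP::Simple's get().
    my $robots_txt = "User-agent: *\nDisallow: /private/\n";

    # Associate the parsed rules with the host they came from.
    $rules->parse('http://www.example.org/robots.txt', $robots_txt);

    # The same object now answers queries for URLs on that host.
    for my $url ('http://www.example.org/index.html',
                 'http://www.example.org/private/secret.html') {
        printf "%s: %s\n", $url,
            $rules->allowed($url) ? 'allowed' : 'forbidden';
    }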