Bug#618281: ITP: libwww-robotrules-perl -- database of robots.txt-derived permissions

Package: wnpp
Owner: Nicholas Bamber <nicholas@periapt.co.uk>
Severity: wishlist
X-Debbugs-CC: debian-devel@lists.debian.org, debian-perl@lists.debian.org

* Package name    : libwww-robotrules-perl
  Version         : 6.01
  Upstream Author : Gisle Aas <gisle@activestate.com>
* URL             : http://search.cpan.org/dist/WWW-RobotRules/
* License         : Artistic or GPL-1+
  Programming Lang: Perl
  Description     : database of robots.txt-derived permissions

WWW::RobotRules parses /robots.txt files as specified in "A Standard for
Robot Exclusion" at <http://www.robotstxt.org/wc/norobots.html>. Webmasters
can use the /robots.txt file to forbid conforming robots from accessing parts
of their web site.
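
For reference, a /robots.txt file of the kind this module understands looks
like the following (an illustrative sample, not taken from the package):

    User-agent: *
    Disallow: /private/
    Disallow: /tmp/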

The parsed files are kept in a WWW::RobotRules object, and this object
provides methods to check if access to a given URL is prohibited. The same
WWW::RobotRules object can be used for one or more parsed /robots.txt files
on any number of hosts.
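
A minimal usage sketch (the user-agent name and URLs are placeholders, and
fetching via LWP::Simple is just one way to obtain the file, not a
requirement of this module):

    use WWW::RobotRules;
    use LWP::Simple qw(get);

    # Identify the robot by the User-Agent name it sends.
    my $rules = WWW::RobotRules->new('MyBot/1.0');

    # Fetch a site's /robots.txt and feed it to the rules object.
    my $robots_url = 'http://example.com/robots.txt';
    my $robots_txt = get($robots_url);
    $rules->parse($robots_url, $robots_txt) if defined $robots_txt;

    # The same object now answers queries for URLs on that host.
    my $page = 'http://example.com/private/index.html';
    print "OK to fetch $page\n" if $rules->allowed($page);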


