Bug#949198: ITP: parsero -- Audit tool for robots.txt of a site
Subject: ITP: parsero -- Audit tool for robots.txt of a site
Package: wnpp
Owner: Thiago Andrade Marques <thmarques@gmail.com>
Severity: wishlist
* Package name    : parsero
  Version         : 0.0+git20140929.e5b585a
  Upstream Author : Javier Nieto <javier.nieto@behindthefirewalls.com>
* URL             : https://github.com/behindthefirewalls/Parsero/
* License         : GPL-2+
  Programming Lang: Python 3
  Description     : Audit tool for robots.txt of a site
Parsero reads the robots.txt file of a web server and looks at the Disallow entries.
The Disallow entries tell search engines which directories or files hosted on the
server must not be indexed. For example, "Disallow: /portal/login" means that the
content at www.example.com/portal/login is not allowed to be indexed by crawlers
such as Google, Bing, or Yahoo. This is how an administrator can keep sensitive or
private information out of the search engines.
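A minimal Python 3 sketch of the kind of check described above (this is illustrative only, not Parsero's actual code; the function name and sample data are made up): it extracts the Disallow entries from a robots.txt body and turns them into absolute URLs that an audit could then probe.

```python
# Illustrative sketch, not Parsero's real implementation: collect the
# Disallow paths from a robots.txt body as absolute URLs.
from urllib.parse import urljoin

def disallowed_urls(base_url, robots_txt):
    """Return absolute URLs for each Disallow entry in robots_txt."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split('#', 1)[0].strip()   # drop comments and whitespace
        if line.lower().startswith('disallow:'):
            path = line.split(':', 1)[1].strip()
            if path:                           # an empty Disallow allows everything
                urls.append(urljoin(base_url, path))
    return urls

sample = """User-agent: *
Disallow: /portal/login
Disallow: /admin/   # private area
Disallow:
"""
print(disallowed_urls("http://www.example.com", sample))
# ['http://www.example.com/portal/login', 'http://www.example.com/admin/']
```

A real audit would then request each of these URLs and report the ones that respond with 200 OK, since those are reachable despite the administrator's intent to hide them.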