
Bug#949198: ITP: parsero -- Audit tool for robots.txt of a site



Subject: ITP: parsero -- Audit tool for robots.txt of a site
Package: wnpp
Owner: Thiago Andrade Marques <thmarques@gmail.com>
Severity: wishlist

* Package name    : parsero
  Version         : 0.0+git20140929.e5b585a
  Upstream Author : Javier Nieto <javier.nieto@behindthefirewalls.com>
* URL             : https://github.com/behindthefirewalls/Parsero/
* License         : GPL-2+
  Programming Lang: Python3
  Description     : Audit tool for robots.txt of a site

Parsero reads the robots.txt file of a web server and looks at its Disallow
entries, which tell search engines what directories or files hosted on the
server must not be indexed. For example, "Disallow: /portal/login" means that
the content at www.example.com/portal/login must not be indexed by crawlers
such as Google, Bing, or Yahoo. This is how an administrator avoids sharing
sensitive or private information with the search engines.
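For illustration, here is a minimal Python 3 sketch of this kind of audit,
using only the standard library. The function name and example URL are made
up, and parsero's actual implementation, options, and output differ:

    import urllib.error
    import urllib.parse
    import urllib.request

    def audit_robots(base_url):
        """Fetch robots.txt and probe each Disallow path for reachability."""
        robots_url = urllib.parse.urljoin(base_url, "/robots.txt")
        with urllib.request.urlopen(robots_url) as resp:
            lines = resp.read().decode("utf-8", "replace").splitlines()
        for line in lines:
            if not line.lower().startswith("disallow:"):
                continue
            path = line.split(":", 1)[1].strip()
            if not path or "*" in path:
                continue  # skip empty entries and wildcard patterns
            url = urllib.parse.urljoin(base_url, path)
            try:
                with urllib.request.urlopen(url) as r:
                    status = r.getcode()
            except urllib.error.HTTPError as e:
                status = e.code
            # an HTTP 200 here means a path the site asked crawlers to
            # avoid is in fact publicly reachable -- the kind of finding
            # an audit like this reports
            print(status, url)

    audit_robots("https://www.example.com")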

