Bug#657278: ITP: python-scrapelib -- library for scraping websites
Package: wnpp
Severity: wishlist
Owner: Alex Chiang <achiang@canonical.com>
* Package name    : python-scrapelib
  Version         : 0.5.6
  Upstream Author : Michael Stephens <mstephens@sunlightfoundation.com>
                    James Turk <jturk@sunlightfoundation.com>
* URL             : https://github.com/sunlightlabs/scrapelib
* License         : BSD-3-clause
  Programming Lang: Python
  Description     : library for scraping websites
It builds these binary packages:
python-scrapelib - library for scraping websites
scrapeshell - ipython shell to examine python-scrapelib results
Long description:
At its simplest, scrapelib provides a replacement for urllib2’s urlopen
functionality, but it can do much more.
.
Advantages of using scrapelib over urllib2 or httplib2 include:
.
* HTTP, HTTPS, and FTP requests via an identical API
* HTTP caching, compression and cookies
* redirect following
* request throttling
* robots.txt compliance (optional)
* robust error handling
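
To illustrate what "request throttling" in the list above means in practice, here is a minimal sketch using only the Python standard library. It is not scrapelib's implementation or API: the class name Throttle, its requests_per_minute parameter, and the wait() method are hypothetical names chosen for this example. The idea is simply to enforce a minimum interval between consecutive requests.

```python
import time


class Throttle:
    """Hypothetical helper: allow at most `requests_per_minute` calls per minute.

    This is an illustrative sketch, not code taken from scrapelib.
    """

    def __init__(self, requests_per_minute, clock=time.monotonic, sleep=time.sleep):
        if requests_per_minute <= 0:
            raise ValueError("requests_per_minute must be positive")
        # Minimum number of seconds that must separate two requests.
        self.min_interval = 60.0 / requests_per_minute
        # The clock and sleep functions are injectable so the class can be
        # exercised without real delays.
        self._clock = clock
        self._sleep = sleep
        self._last_request = None

    def wait(self):
        """Block until enough time has passed since the previous call."""
        now = self._clock()
        if self._last_request is not None:
            remaining = self.min_interval - (now - self._last_request)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last_request = now
```

A caller would invoke wait() immediately before each network request, for example before every call to an urlopen-style function, so that a scraper never exceeds the configured rate.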