
Bug#657278: ITP: python-scrapelib -- library for scraping websites



Package: wnpp
Severity: wishlist
Owner: Alex Chiang <achiang@canonical.com>

* Package name    : python-scrapelib
  Version         : 0.5.6
  Upstream Author : Michael Stephens <mstephens@sunlightfoundation.com>
                    James Turk <jturk@sunlightfoundation.com>
* URL             : https://github.com/sunlightlabs/scrapelib
* License         : BSD-3-clause
  Programming Lang: Python
  Description     : library for scraping websites

It builds these binary packages:

python-scrapelib - library for scraping websites
scrapeshell - ipython shell to examine python-scrapelib results

Long description:
 At its simplest, scrapelib provides a replacement for urllib2’s urlopen
 functionality, but it can do much more.
 .
 Advantages of using scrapelib over urllib2 or httplib2 include:
 .
   * HTTP, HTTPS, FTP requests via an identical API
   * HTTP caching, compression and cookies
   * redirect following
   * request throttling
   * robots.txt compliance (optional)
   * robust error handling
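
To make two of the features above concrete (request throttling and optional
robots.txt compliance), here is a minimal sketch using only the Python 3
standard library. It is not scrapelib's code or API: the ThrottledFetcher
class, its parameters, and the "example-bot" user agent are hypothetical names
chosen for illustration, and the robots.txt rules are passed in as lines
instead of being downloaded.

```python
import time
from urllib.robotparser import RobotFileParser


class ThrottledFetcher:
    """Hypothetical helper (not part of scrapelib): rate-limits requests
    and refuses URLs that the supplied robots.txt rules disallow."""

    def __init__(self, requests_per_minute, robots_lines,
                 user_agent="example-bot",
                 clock=time.monotonic, sleep=time.sleep):
        # Minimum number of seconds between two consecutive requests.
        self.min_interval = 60.0 / requests_per_minute
        self.user_agent = user_agent
        # clock and sleep are injectable so the logic can be tested offline.
        self.clock = clock
        self.sleep = sleep
        self.last_request = None
        # Parse robots.txt rules given as an iterable of lines.
        self.robots = RobotFileParser()
        self.robots.parse(robots_lines)

    def allowed(self, url):
        """Return True if the robots.txt rules permit fetching url."""
        return self.robots.can_fetch(self.user_agent, url)

    def wait_turn(self):
        """Sleep just long enough to respect the configured request rate."""
        now = self.clock()
        if self.last_request is not None:
            remaining = self.min_interval - (now - self.last_request)
            if remaining > 0:
                self.sleep(remaining)
                now = self.clock()
        self.last_request = now

    def fetch(self, url, opener):
        """Check robots.txt, throttle, then delegate to opener(url)."""
        if not self.allowed(url):
            raise PermissionError("robots.txt disallows %s" % url)
        self.wait_turn()
        return opener(url)
```

The opener argument is any callable that takes a URL and returns a response,
for example urllib.request.urlopen, which keeps the throttling and robots.txt
logic separate from the transport.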


