
Bug#820171: RFP: pdftable -- extract tables from PDF files



Package: wnpp
Severity: wishlist

* Package name    : pdftable
  Version         : 1.0
  Upstream Author : Kyle Cronan <kyle at pbx org>
* URL             : http://pdftable.sourceforge.net/
* License         : GPL v3
  Programming Lang: Python
  Description     : extract tables from PDF files

Pdftable is a Python module and command-line utility that analyzes XML
output from the program pdftohtml in order to extract tables from PDF
files and output them as CSV data. It makes it easier to automate the
process of parsing tabular data contained within reports, ledgers, or
other data sets that are only published in PDF.
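
As a rough illustration of the general idea, and not pdftable's actual
code or API, the sketch below groups the positioned <text> elements of
an XML document into rows by their "top" coordinate and writes them out
as CSV. The element and attribute names are assumed to match what
"pdftohtml -xml" produces; the function names, the row tolerance and
the sample data are all made up for this example.

```python
import csv
import io
import xml.etree.ElementTree as ET


def rows_from_pdftohtml_xml(xml_text, row_tolerance=2):
    """Group <text> elements into rows (hypothetical helper, not pdftable's API).

    Elements whose 'top' lies within row_tolerance of the first element
    of the current row are treated as belonging to that row.
    """
    root = ET.fromstring(xml_text)
    rows = []
    for page in root.iter("page"):
        # Walk the text fragments from the top of the page to the bottom.
        cells = sorted(page.iter("text"), key=lambda t: int(t.get("top")))
        current, row_top = [], None
        for cell in cells:
            top = int(cell.get("top"))
            if current and top - row_top > row_tolerance:
                # Row finished: order its cells left to right.
                rows.append([text for _, text in sorted(current)])
                current = []
            if not current:
                row_top = top
            # itertext() also picks up text nested in <b>, <i>, etc.
            text = "".join(cell.itertext()).strip()
            current.append((int(cell.get("left")), text))
        if current:
            rows.append([text for _, text in sorted(current)])
    return rows


def to_csv(rows):
    """Serialize the rows as CSV text."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


# Invented sample in the assumed shape of "pdftohtml -xml" output.
SAMPLE = """<pdf2xml>
<page number="1" top="0" left="0" height="1188" width="918">
<text top="100" left="50" width="40" height="12" font="0">Item</text>
<text top="101" left="200" width="40" height="12" font="0">Price</text>
<text top="120" left="50" width="40" height="12" font="0">Widget</text>
<text top="120" left="200" width="40" height="12" font="0"><b>9.50</b></text>
</page>
</pdf2xml>"""

if __name__ == "__main__":
    print(to_csv(rows_from_pdftohtml_xml(SAMPLE)), end="")
```

A real table extractor would also have to work out column boundaries,
merged cells and text outside the table; this sketch only covers the
row-grouping step.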

-- 
Happy hacking
Petter Reinholdtsen

