
Re: Converting PDF documents to text

Michael Satterwhite writes:
> Does anyone know of a good utility to convert a PDF document to text

toncho/~ apt-cache show pstotext         
Package: pstotext
Priority: optional
Section: text
Installed-Size: 110
Maintainer: J.H.M. Dassen (Ray) <jdassen@debian.org>
Architecture: i386
Version: 1.9-1
Depends: gs | gs-aladdin (>= 3.51), libc6 (>= 2.3.2.ds1-4)
Filename: pool/main/p/pstotext/pstotext_1.9-1_i386.deb
Size: 32294
MD5sum: a159e4b756759beeae003700d31487d1
Description: Extract text from PostScript and PDF files
 pstotext extracts text (in the ISO 8859-1 character set) from a PostScript
 or PDF (Portable Document Format) file. Thus, pstotext is similar to the
 ps2ascii program that comes with ghostscript. The output of pstotext is
 however better than that of ps2ascii, because pstotext deals better with
 punctuation and ligatures.
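
For what it's worth, here is a minimal sketch of how one might wrap it,
assuming pstotext and ghostscript are installed (apt-get install pstotext)
and that pstotext writes the extracted text to standard output when given
a file argument. The function name pdf_to_text and the PSTOTEXT override
variable are my own inventions for illustration, not part of the package.

```shell
# Hypothetical helper: extract text from a PDF or PostScript file.
# PSTOTEXT is an assumed override for the converter command; it
# defaults to the pstotext binary from the package shown above.
pdf_to_text() {
    input=$1
    # Default output name: the input name with its extension replaced by .txt
    output=${2:-${input%.*}.txt}
    if ! command -v "${PSTOTEXT:-pstotext}" >/dev/null 2>&1; then
        echo "pstotext not found; install the pstotext package first" >&2
        return 127
    fi
    # Run the converter and redirect its standard output to the text file
    "${PSTOTEXT:-pstotext}" "$input" > "$output"
}
```

Something like "pdf_to_text report.pdf" should then leave the text in
report.txt. Keep in mind that the output is ISO 8859-1, per the
description above.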

John Hasler
