Re: search through postscript documents?
Antonio Rodriguez wrote:
> On Thu, Mar 03, 2005 at 12:15:40PM +0100, Joerg Reckers wrote:
> > Is there a way (program) to search for expressions in a PostScript
> > file and to copy and paste words out of a ghostview program to text?
> > I am using KGhostView now and I am missing these features, so I am
> > asking on this list. :-)
> > thanks, joerg
> Package: pstotext
> Priority: optional
> Section: text
> Installed-Size: 110
> Maintainer: J.H.M. Dassen (Ray) <firstname.lastname@example.org>
> Architecture: i386
> Version: 1.9-1
> Depends: gs | gs-aladdin (>= 3.51), libc6 (>= 2.3.2.ds1-4)
> Filename: pool/main/p/pstotext/pstotext_1.9-1_i386.deb
> Size: 32294
> MD5sum: a159e4b756759beeae003700d31487d1
> Description: Extract text from PostScript and PDF files
> pstotext extracts text (in the ISO 8859-1 character set) from a
> PostScript or PDF (Portable Document Format) file. Thus, pstotext is
> similar to the ps2ascii program that comes with ghostscript. The output
> of pstotext is however better than that of ps2ascii, because pstotext
> deals better with punctuation and ligatures.
I have a PDF file produced by a recent version of InDesign CS. The
utility pdf2ps will produce a PostScript file that is readable using
GV. However, from that point on everything fails. There seems to be no
way to convert the file to ASCII except by cutting pages and pasting
them into, e.g., gvim.
I have tried creating a subset of the pages and then converting that.
What I get is just the EOP characters.
This is the second such file I have had trouble with. It may have
something to do with PDF 1.5. In any case this is a customer's
file and I can't very well ask him to resave his PDF to an earlier
version.
I have tried versions of Ghostscript on Slackware and on Knoppix, a
Debian derivative. I have downloaded and installed Ghostscript 8.50. I
have installed the latest pstotext. Nothing works.
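
For anyone who wants to reproduce this, here is a rough sketch of the
attempts described above, written as a small Python script. It only
assumes that pstotext and ps2ascii accept an input file name and write
the extracted text to standard output. The function names are made up
for illustration, and decoding as ISO 8859-1 follows the pstotext
package description quoted earlier. Output that is nothing but
whitespace (which includes form feeds) is treated as a failure.

```python
import shutil
import subprocess
from typing import List, Optional


def candidate_commands(path: str) -> List[List[str]]:
    """Return argv lists for the two extractors named in this thread.

    Hypothetical helper: both tools are assumed to write the extracted
    text to standard output when given only an input file name.
    """
    return [
        ["pstotext", path],
        ["ps2ascii", path],
    ]


def extract_text(path: str) -> Optional[str]:
    """Try each extractor in turn and return the first non-blank output."""
    for argv in candidate_commands(path):
        # Skip tools that are not installed on this machine.
        if shutil.which(argv[0]) is None:
            continue
        result = subprocess.run(argv, capture_output=True, encoding="latin-1")
        # str.strip() also removes form feeds, so a result consisting only
        # of page breaks counts as "nothing extracted".
        if result.returncode == 0 and result.stdout.strip():
            return result.stdout
    return None
```

Calling extract_text("customer.ps") (the file name is a placeholder)
returns None when every tool is missing, fails, or yields only blank
output, which matches the symptom reported here.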