Re: Query about OCR package(s)
On Mon, 14 Jul 2008, Bret Busby wrote:
Is an OCR package available for Debian 4.0, in .deb form, that can read from
PDF files, to allow text to be extracted from PDF files?
In looking at what is available in Synaptic, I could not find such a package.
Thank you in anticipation.
Since sending the above message, I had to cite material from a PDF
document published on the Internet. I clicked on the link for the
document in Iceape (which I use for web pages when I want to quote
their material in an email, as Iceape includes an email facility, like
the Netscape and Mozilla suites). The document viewer (Adobe Reader
8.0) opened the document within the tab, and I was able to simply
select, copy and paste the text, as if it were text in an HTML web
page or a word processor document.
So, Adobe Reader 8.0 provides the text extraction, or copying, that I
sought, and an OCR application that imports text from PDF files is now
probably redundant for my purposes (other than that it could function
as a smaller, standalone application, but Adobe Reader seems to be
adequate).
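
For anyone who does want the smaller, standalone route, here is a
minimal sketch in Python. It assumes the pdftotext command-line tool
(shipped in the poppler-utils package, which is not something I
mentioned above) is installed, and that the PDF contains real text
rather than scanned page images. The function names are made up for
illustration only.

```python
import shutil
import subprocess


def build_pdftotext_command(pdf_path, first_page=None, last_page=None,
                            keep_layout=False):
    """Build the argument list for pdftotext (hypothetical helper)."""
    cmd = ["pdftotext"]
    if first_page is not None:
        cmd += ["-f", str(first_page)]   # first page to convert
    if last_page is not None:
        cmd += ["-l", str(last_page)]    # last page to convert
    if keep_layout:
        cmd.append("-layout")            # try to keep the page layout
    cmd += [pdf_path, "-"]               # "-" sends the text to stdout
    return cmd


def extract_text(pdf_path, **options):
    """Run pdftotext and return the extracted text as a string."""
    # Assumption: pdftotext is on the PATH; fail clearly if it is not.
    if shutil.which("pdftotext") is None:
        raise RuntimeError("pdftotext not found; is poppler-utils installed?")
    result = subprocess.run(build_pdftotext_command(pdf_path, **options),
                            check=True, stdout=subprocess.PIPE)
    return result.stdout.decode("utf-8", errors="replace")
```

Calling extract_text("document.pdf", first_page=1, last_page=2) would
then return the text of the first two pages, ready to paste into an
email. This is not OCR: a PDF made purely of scanned images would still
need a real OCR package.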
Now, if only I could print from Adobe Reader 8.0 (I can print PDF files
from Evince 0.4.0, but not from Adobe Reader 8.0)...
"So once you do know what the question actually is,
you'll know what the answer means."
- Deep Thought,
Chapter 28 of Book 1 of
"The Hitchhiker's Guide to the Galaxy:
A Trilogy In Four Parts",
written by Douglas Adams,
published by Pan Books, 1992