There is already such a tool in Debian: eBook-speaker.
eBook-speaker can read many text formats, uses tesseract for OCR, and is
available in seven languages.
eBook-speaker can read:
AportisDoc (.pdb) (.prc)
ASCII mail text
ASCII text (.txt)
awk script text
Bourne-Again shell script text
Broadband eBooks (BBeB) (.lrf) (.lrx)
C source text
Composite Document File (Microsoft Office Word) (.doc) (.xls)
EPUB ebook data (.epub)
GIF image data (.gif)
GNU gettext message catalogue
HTML document (.html)
ISO-8859 text (.txt)
JPEG image data (.jpg)
Microsoft Office Document
Microsoft Reader eBook Data (.lit)
Microsoft Windows HtmlHelp Data (.chm)
Microsoft Word Document
Microsoft Word 2007+ (.docx)
Mobipocket E-book (.prc) (.mobi)
MS Windows HtmlHelp Data (.chm)
Netpbm PPM data (.ppm)
OpenDocument Text (.odt)
PDF document (.pdf)
PeanutPress PalmOS (.pdb)
Perl script text
Plucker PalmOS document (.pdb)
PNG image data (.png)
POSIX shell script text
PostScript document (.ps)
Rich Text Format (.rtf)
Tenex C shell script text
troff or preprocessor text (e.g. Linux man-pages)
UTF-8 Unicode mail text
UTF-8 Unicode text
XML document text (.xml)
For example: I put a letter onto my scanner and give:
The letter will be scanned and after a moment it will be spoken using
Please give it a try.
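
If you would rather wire the individual steps together yourself (for
example on a small console-only system), the following is a rough
sketch of the same scan, OCR, speak chain. It is not how eBook-speaker
is implemented. It assumes that the scanimage (SANE), tesseract and
espeak command-line tools are installed; the function names and file
names are made up for illustration.

```python
import subprocess
from pathlib import Path
from typing import List


def build_commands(image: Path, text_base: Path, lang: str = "eng") -> List[List[str]]:
    """Return the scan, OCR and speech commands in the order they run."""
    return [
        # scanimage writes the scanned page to stdout; run_pipeline redirects it.
        ["scanimage", "--format=tiff"],
        # tesseract appends ".txt" to the output base name by itself.
        ["tesseract", str(image), str(text_base), "-l", lang],
        # espeak speaks the recognised text from the file.
        ["espeak", "-f", f"{text_base}.txt"],
    ]


def run_pipeline(image: Path, text_base: Path, lang: str = "eng") -> Path:
    """Scan a page, OCR it, speak it; both the image and the text file are kept."""
    scan, ocr, speak = build_commands(image, text_base, lang)
    with image.open("wb") as out:
        subprocess.run(scan, stdout=out, check=True)
    subprocess.run(ocr, check=True)
    subprocess.run(speak, check=True)
    return Path(f"{text_base}.txt")
```

Calling run_pipeline(Path("letter.tif"), Path("letter")) would leave
letter.tif and letter.txt on disk after speaking the text.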
On Thu, Apr 28, 2016 at 08:59:08AM +0200, Christian Schoepplein wrote:
> I'd also be interested in OCR for Linux. OCR is one of the last reasons
> why I still have to use Windows or the Mac.
> On Wed, Apr 27, 2016 at 07:35:26 -0500, Nick Gawronski wrote:
> >Hi, Yes, I would very much be interested in such an option. Not just for
> >the GUI, but also for small systems like the Raspberry Pi with the camera
> >module: a program that could snap the image, run the OCR engine, then
> >read and save the text into a text file, while keeping the image. If
> >tools already exist for the console to do the OCR, I would like to know
> >about them. Nick Gawronski
> >On 4/27/2016 1:01 PM, MENGUAL Jean-Philippe wrote:
> >> Hi,
> >> After testing various OCR programs, I feel that Tesseract, the most
> >> advanced OCR engine on Linux, still does not perform as well as
> >> commercial utilities. Even when it is wrapped in tools like Lios
> >> or gimagereader, it remains difficult to use for "basic"
> >> users (I mean Windows users who have no technical knowledge
> >> or who use the computer only for practical needs).
> >> That's why I had a look at what the proprietary world provides, until
> >> we have enough money to create a full OCR suite, free and based on
> >> Tesseract. Create or improve, since Lios and gimagereader are
> >> excellent starting points, but some things in their GUIs are hard for
> >> our users to understand (according to our tests).
> >> And we needed a quick solution so that the GNU/Linux OS could be usable
> >> by everyone now, OCR included, so that they buy our services and
> >> finance our development projects for free software. But I now wonder
> >> whether some regular GNU/Linux users here could be interested in such a
> >> product. What we have arrived at is a suite for 200 EUR, including:
> >> - Abbyy FineReader 11, unlimited in number of pages thanks to an
> >> agreement between Abbyy and Hypra, based on the fact that we make a free
> >> program designed for blind people with specific OCR needs,
> >> - A package to run it on MATE, in 2 ways:
> >> * from an image file: right-click and choose the proper option
> >> * from a scanner: we provide a command to create a key binding (ours is
> >> tied to Compiz).
> >> I should add that the utility can also use Tesseract if FineReader is
> >> missing, and in that case it will be free.
> >> Would some users be interested in such a solution? I "like" it because
> >> it brings OCR to GNU/Linux and enables some less typical users to come,
> >> while waiting for a fully "libre" solution accessible to such people.
> >> Regards,
> Christian Schoepplein - <chris (at) schoeppi.net> - http://schoeppi.net
Sent from Ubuntu 15.10