Re: OCR

To: Christian Schoepplein <chris@schoeppi.net>, Debian Accessibility Team <debian-accessibility@lists.debian.org>
Cc: MENGUAL Jean-Philippe <mengualjeanphi@free.fr>, Nick Gawronski <nick@nickgawronski.com>
Subject: Re: OCR
From: Jos Lemmens <jos@jlemmens.nl>
Date: Thu, 28 Apr 2016 09:47:18 +0200
Message-id: <20160428074638.GA2646@jlemmens.nl>
Reply-to: Jos Lemmens <jos@jlemmens.nl>
In-reply-to: <20160428065908.GA18189@v.cs-x.de>
References: <5720FE97.5050203@free.fr> <57215ACE.7020207@nickgawronski.com> <20160428065908.GA18189@v.cs-x.de>

Hello,

There is already such a tool in Debian: eBook-speaker.

eBook-speaker can read many text formats, uses tesseract for OCR, and is
available in seven languages.

eBook-speaker can read:

       AportisDoc (.pdb) (.prc)
       ASCII mail text
       ASCII text (.txt)
       awk script text
       BBeB_ebook_data (.lrf)
       Bourne-Again shell script text
       Broadband eBooks (BBeB) (.lrf) (.lrx)
       C source text
       Composite Document File (Microsoft Office Word) (.doc) (.xls)
       DAISY3 DTBook
       EPUB ebook data (.epub)
       GIF image data (.gif)
       GNU gettext message catalogue
       GutenPalm zTXT
       HTML document (.html)
       ISO-8859 text (.txt)
       JPEG image data (.jpg)
       Microsoft Office Document
       Microsoft Reader eBook Data (.lit)
       Microsoft Windows HtmlHelp Data (.chm)
       Microsoft Word Document
       Microsoft Word 2007+ (.docx)
       Mobipocket E-book (.prc) (.mobi)
       MS Windows HtmlHelp Data (.chm)
       Netpbm PPM data (.ppm)
       OpenDocument Text (.odt)
       Pascal source
       PDF document (.pdf)
       PeanutPress PalmOS (.pdb)
       Perl script text
       Plucker PalmOS document (.pdb)
       PNG image data (.png)
       POSIX shell script text
       PostScript document (.ps)
       Python script
       Rich Text Format (.rtf)
       Tenex C shell script text
       troff or preprocessor text (e.g. Linux man-pages)
       UTF-8 Unicode mail text
       UTF-8 Unicode text
       WordPerfect (.wp)
       XML document text (.xml)

For example, I put a letter on my scanner and run:

   eBook-speaker -s

The letter will be scanned and after a moment it will be spoken using
espeak.
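
For anyone who wants to see the individual steps, here is a minimal sketch
of a comparable scan, OCR and speak pipeline in Python. It is not how
eBook-speaker itself is implemented. It assumes the scanimage (SANE),
tesseract and espeak commands are installed and that a scanner is
configured; the function names and the file name letter.tif are made up
for illustration.

```python
import subprocess
import sys
from pathlib import Path
from typing import List


def scan_command() -> List[str]:
    # scanimage (SANE) writes the scanned image to stdout.
    return ["scanimage", "--format=tiff"]


def ocr_command(image: Path, out_base: Path, lang: str = "eng") -> List[str]:
    # tesseract appends ".txt" to the output base name by itself.
    return ["tesseract", str(image), str(out_base), "-l", lang]


def speak_command(text_file: Path) -> List[str]:
    # espeak -f reads the text to speak from a file.
    return ["espeak", "-f", str(text_file)]


def scan_and_speak(workdir: Path, lang: str = "eng") -> Path:
    """Scan one page, run OCR on it, speak the result, return the text file.

    Hypothetical helper: the file names below are assumptions.
    """
    image = workdir / "letter.tif"
    out_base = workdir / "letter"
    with image.open("wb") as fh:
        subprocess.run(scan_command(), stdout=fh, check=True)
    subprocess.run(ocr_command(image, out_base, lang), check=True)
    text_file = out_base.with_suffix(".txt")
    subprocess.run(speak_command(text_file), check=True)
    return text_file


if __name__ == "__main__":
    # Optional first argument: directory in which to keep image and text.
    scan_and_speak(Path(sys.argv[1]) if len(sys.argv) > 1 else Path("."))
```

Run as a script, this should leave both letter.tif and letter.txt in the
chosen directory, so the image and the recognized text are kept.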

Please give it a try.

   Regards,        

      Jos.

On Thu, Apr 28, 2016 at 08:59:08AM +0200, Christian Schoepplein wrote:
> Hi,
>
> I'd also be interested in OCR for Linux. OCR is one of the few reasons
> why I still have to use Windows or the Mac.
>
> Cheers,
>
>   Schoepp
>
>
> On Wed, Apr 27, 2016 at 07:35:26 -0500, Nick Gawronski wrote:
> >Hi, yes, I would very much be interested in such an option, not just for
> >the GUI but also for small systems like the Raspberry Pi with the camera
> >module: a program that could snap the image, run the OCR engine, then
> >read the text and save it to a text file, keeping the image of course.
> >If tools already exist for the console to do the OCR, I would like to
> >know about them.  Nick Gawronski
> >
> >On 4/27/2016 1:01 PM, MENGUAL Jean-Philippe wrote:
> >> Hi,
> >>
> >> After testing various OCR tools, I feel that Tesseract, the most
> >> advanced OCR engine on Linux, does not yet perform as well as
> >> commercial utilities. Even when it is wrapped in tools like Lios
> >> or gimagereader, it is still difficult to use for "basic" users
> >> (I mean Windows users who have no technical knowledge or who use
> >> a computer only for everyday needs).
> >>
> >> That's why I had a look at what the proprietary world provides, until
> >> we have enough money to create a full OCR suite, free and based on
> >> Tesseract. Create or improve, since Lios and gimagereader are
> >> excellent starting points, but our tests showed that some things in
> >> their GUIs are hard for our users to understand.
> >>
> >> And we needed a quick solution, so that the GNU/Linux OS could be usable
> >> by everyone now, OCR included, so that people buy the service and
> >> finance our developers' free software projects. But I wonder now if some
> >> regular GNU/Linux users here could be interested in such a product. What
> >> we have now is a suite for 200 EUR, including:
> >> - Abbyy FineReader 11, unlimited in number of pages thanks to an
> >> agreement between Abbyy and Hypra, based on the fact that we make a free
> >> program designed for blind people with specific OCR needs,
> >> - A package to run it on MATE, in 2 ways:
> >> * from an image file: right-click and choose the proper option
> >> * from a scanner: we give a command to create a binding (as ours is
> >> tied to Compiz).
> >>
> >> I should point out that the utility could also use Tesseract if
> >> FineReader is missing, but in that case it will be free.
> >>
> >> Would some users be interested in such a solution? I "like" it as it
> >> introduces OCR on GNU/Linux and enables users who would not usually
> >> come to GNU/Linux to do so, while we wait for a full "libre" solution
> >> accessible to such people.
> >>
> >> Regards,
> >>
> >>
> >>
> >
>
> --
> Christian Schoepplein - <chris (at) schoeppi.net> - http://schoeppi.net

--

   Sent from Ubuntu 15.10

   -------------------------------
   Jos Lemmens
   The Netherlands
   E-mail: jos@jlemmens.nl
   Homepage: www.jlemmens.nl
