
Re: Book scanning frontend application?



Hello Mario & list,

On Thu, Dec 31, 2009 at 05:11:29PM +0100, Mario Lang wrote:
> Hi.
> 
> It appears that ocropus and tesseract look pretty promising.  I am
> wondering, does anyone know of an active project to develop a typical
> free software book scanning frontend application like OpenBook or
> similar products from the commercial world?  I know that Emacspeak has
> some code to interface with ocropus, but that is a little bit too much
> tied into Emacspeak for my current tastes.

Actually, the ADRIANE desktop (part of each KNOPPIX live CD version 6.0
and up) has integrated OCR support using scanimage, ocropus and
tesseract with fully automatic configuration, for over a year now. The
focus here lies on ease-of-use for beginners, not expert features, i.e.
scanning and recognizing a page is a single-keystroke procedure, and
postediting/correction plus saving/appending to a textfile is another
one. adriane-ocr is a dialog-based bash shell script GUI and works
together with ADRIANE's notebook/archiving/printing. Configuration
options such as page orientation and scanning DPI can be changed in the
ADRIANE setup.
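
For readers who want to see the shape of such a scan-and-recognize step,
here is a minimal sketch in Python. It is not the adriane-ocr code (which
is bash) but an illustration under assumptions: the function names, the
file naming scheme, the 300 DPI default and the "eng" language are all
hypothetical, and --resolution is a backend-specific scanimage option that
most, but not all, scanner backends accept.

```python
import subprocess
from pathlib import Path


def build_commands(page, workdir, dpi=300, lang="eng"):
    """Return (image path, scanimage argv, tesseract argv) for one page.

    The naming scheme pageNNNN.tiff is an assumption made for this sketch.
    """
    workdir = Path(workdir)
    image = workdir / f"page{page:04d}.tiff"
    outbase = workdir / f"page{page:04d}"  # tesseract appends ".txt" itself
    # scanimage writes the image data to standard output.
    scan = ["scanimage", "--format=tiff", "--resolution", str(dpi)]
    # tesseract usage: tesseract IMAGE OUTPUTBASE -l LANG
    ocr = ["tesseract", str(image), str(outbase), "-l", lang]
    return image, scan, ocr


def scan_and_recognize(page, workdir, book_file, dpi=300, lang="eng"):
    """Scan one page, run OCR on it and append the text to book_file."""
    image, scan, ocr = build_commands(page, workdir, dpi, lang)
    with open(image, "wb") as fh:
        subprocess.run(scan, stdout=fh, check=True)
    subprocess.run(ocr, check=True)
    text = image.with_suffix(".txt").read_text()
    with open(book_file, "a") as out:
        out.write(text)
    return text
```

Calling scan_and_recognize(1, "/tmp/book", "/tmp/book/book.txt") would
need a working scanner and an installed tesseract; build_commands() alone
can be inspected without any hardware.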

> Any hints?  If no, we should probably develop such a thing, do you have
> wishes for a feature list?

Though scanning a complete book (even with a two-sided layout, thanks to
ocropus layout analysis) is much quicker and easier in adriane-ocr than
in OpenBook (which we occasionally tried as a reference), I think it
still makes sense to develop an accessible frontend for expert users as
well, i.e. for those who want to work with a variety of ocropus- or
tesseract-supported options beyond a simple "just scan the book and save
it to a text file" approach.

>  To me, a frontend needs to:
>  * Keep track of page numbers, allowing me to delete pages and renumber
>    them.
>  * Provide speech output and scanning in background.  I.e., while speech
>    is reading the text, scanning new pages should not interrupt speech.
>    This is very comfortable when reading a book: you can turn pages
>    while listening to the text with a minimum of interaction, i.e.,
>    just a single key press per page.
>  * Allow editing the text so that scanning errors can be corrected.
>    Ideally, with a feedback mechanism that populates a dictionary for the
>    OCR engine.
>  * A pronunciation dictionary, ideally with a submission system so that
>    we can collect good improvements from users and eventually incorporate
>    them into the engines we are using (like espeak).

The above is not yet covered by adriane-ocr, though it could be added to the
bash scripts. If you like, please have a look at the adriane-ocr package at
http://debian-knoppix.alioth.debian.org/ (or try it from a current
KNOPPIX-ADRIANE live CD).
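
As a starting point for the page-tracking item in the list above, here is
a rough sketch in Python of one possible approach. The class and method
names are made up for illustration and nothing like this exists in
adriane-ocr yet. The idea is to derive page numbers from list position, so
deleting a page renumbers the following ones automatically.

```python
class Book:
    """Hypothetical container for the recognized text of each page."""

    def __init__(self):
        # Index i holds the text of page i + 1.
        self.pages = []

    def add(self, text):
        """Append a newly recognized page and return its page number."""
        self.pages.append(text)
        return len(self.pages)

    def delete(self, number):
        """Remove a page by its 1-based number; later pages move up."""
        if not 1 <= number <= len(self.pages):
            raise IndexError(f"no such page: {number}")
        del self.pages[number - 1]

    def numbered(self):
        """Return (page number, text) pairs in reading order."""
        return list(enumerate(self.pages, start=1))


book = Book()
for text in ("title page", "blank scan", "chapter one"):
    book.add(text)
book.delete(2)  # drop the bad scan; "chapter one" becomes page 2
```

Storing explicit numbers per page would also work, but then every delete
needs a separate renumbering pass.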

> Since I am not a US citizen, I am not terribly interested in Bookshare
> integration, but I guess that such features would be desirable as well.

What is "Bookshare"?

Regards
-Klaus

