Re: OCR questions

To: debian-user@lists.debian.org
Subject: Re: OCR questions
From: "Nelson Castillo" <nelsoneci@gmail.com>
Date: Sat, 21 Jul 2007 19:54:40 -0500
Message-id: <2accc2ff0707211754r203f1eq3ac10e860b222d1c@mail.gmail.com>
In-reply-to: <20070722004557.GA8281@buddy.mtntop.home>
References: <877iqe4s98.fsf@gmail.com> <20070608085703.cbbb955a.celejar@gmail.com> <20070609045125.GH4974@localhost.localdomain> <87644dhd4y.fsf_-_@gmail.com> <20070721181027.GA2633@dementia.proulx.com> <87zm1pv708.fsf@gmail.com> <20070721205309.GA6722@localhost> <20070721234018.GA26937@debian.org> <2accc2ff0707211713g768d0ff1he76ae350f24676b@mail.gmail.com> <20070722004557.GA8281@buddy.mtntop.home>

On 7/21/07, Wayne Topa <linuxone@intergate.com> wrote:

Nelson Castillo(nelsoneci@gmail.com) is reported to have said:
> On 7/21/07, Osamu Aoki <osamu@debian.org> wrote:
> >On Sat, Jul 21, 2007 at 10:53:09PM +0200, Florian Kulzer wrote:
> >> On Sat, Jul 21, 2007 at 22:25:43 +0200, Rodolfo Medina wrote:
> >> Why not use the Debian package? It is called "tesseract-ocr".
> >
> >Yes.  But it is the old 1.02 version and has an FTBFS bug.
>
> Yes, it's old. I installed from sources but I don't get the charsets.
>
> tesseract test.tiff out
> Unable to load unicharset file /usr/local/share/tessdata/eng.unicharset
>
> How do I get them?

1.  apt-cache search tesseract-ocr
tesseract-ocr - Command line OCR tool
tesseract-ocr-data - Command line OCR tool data

2.  aptitude install tesseract-ocr tesseract-ocr-data

3.  less /usr/share/doc/tesseract-ocr/README

This is in testing.  YMMV if you're running etch.


Hi.

I run sid, and I wanted the latest version. The Debian package installs
fine, but it's old.
I just noticed that the source build does not install the language files
by default.
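
With both the Debian package and a source build on the same machine, it
helps to confirm which binary the shell actually runs. A small sketch:
the helper name which_tesseract is made up here, and treating
/usr/local as the source-build prefix is an assumption based on the path
in the error message quoted above.

```shell
#!/bin/sh
# which_tesseract (hypothetical helper name)
# Print the path of the first tesseract found on PATH; fail if none.
which_tesseract() {
    command -v tesseract
}

path=$(which_tesseract) || { echo "tesseract not found on PATH"; path=""; }

case "$path" in
    "")           : ;;  # nothing found, already reported above
    /usr/local/*) echo "source build (assumed prefix /usr/local): $path" ;;
    *)            echo "other install, possibly the Debian package: $path" ;;
esac
```

If the wrong one comes first, reordering PATH or calling the binary by
its full path should be enough.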

I just found this:

 To be completely language independent, there is *no* language
 data with the source, so you have to download a separate language
 file to get it to work at

http://groups.google.com/group/tesseract-ocr/browse_thread/thread/2b11730eae611b40/2a780e0d6227cb02#2a780e0d6227cb02
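
The file named in the error can be checked for before running tesseract
again. A minimal sketch, assuming the source build looks for its data
under /usr/local/share/tessdata (the path from the error above). The
function name has_lang_data and the variables TESSDATA_DIR and LANG_CODE
are made up for this example and are not tesseract settings.

```shell
#!/bin/sh
# has_lang_data DIR LANG (hypothetical helper)
# Succeeds if DIR contains LANG.unicharset, the file tesseract
# reported as missing.
has_lang_data() {
    [ -f "$1/$2.unicharset" ]
}

# Defaults taken from the error message in this thread; both are
# script-local names and can be overridden from the environment.
TESSDATA_DIR="${TESSDATA_DIR:-/usr/local/share/tessdata}"
LANG_CODE="${LANG_CODE:-eng}"

if has_lang_data "$TESSDATA_DIR" "$LANG_CODE"; then
    echo "found $TESSDATA_DIR/$LANG_CODE.unicharset"
else
    echo "missing $TESSDATA_DIR/$LANG_CODE.unicharset"
    echo "unpack the separately downloaded language file into $TESSDATA_DIR"
fi
```

This only tests for the one file from the error; a language file
contains other data files that are not checked here.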

Regards.

--
http://arhuaco.org
http://emQbit.com

Follow-Ups:
- Re: OCR questions
  - From: Osamu Aoki <osamu@debian.org>

References:
- OCR questions (was: How to acquire text so to edit it?)
  - From: Rodolfo Medina <rodolfo.medina@gmail.com>
- Re: OCR questions (was: How to acquire text so to edit it?)
  - From: bob@proulx.com (Bob Proulx)
- Re: OCR questions
  - From: Rodolfo Medina <rodolfo.medina@gmail.com>
- Re: OCR questions
  - From: Florian Kulzer <florian.kulzer+debian@icfo.es>
- Re: OCR questions
  - From: Osamu Aoki <osamu@debian.org>
- Re: OCR questions
  - From: "Nelson Castillo" <nelsoneci@gmail.com>
- Re: OCR questions
  - From: Wayne Topa <linuxone@intergate.com>
