
Bug#813562: Test suite failure



Thanks for your help. The difference in output order is due to multiprocessing.

That nailed it. tesseract 3.04.01 changed its output when asked to determine page orientation. It's an improvement, but it breaks parsing.

I will throw together a patch to make the appropriate distinctions.


$ tess-3.04.01 -psm 0 tests/resources/linn-west.jpg stdout
Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 29.34
Script: Latin
Script confidence: 45.33

$ tess-3.04.00 -psm 0 tests/resources/linn-west.jpg stdout
Orientation: 3
Orientation in degrees: 90
Orientation confidence: 29.34
Script: 1
Script confidence: 45.33
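
As a rough illustration of the distinction (not the actual patch), a parser could read the "Key: value" lines into a dict and pick the rotation field by format. The names parse_osd and correction_angle are hypothetical. The choice of field is an assumption drawn only from the two samples above, where "Rotate: 90" in 3.04.01 lines up with "Orientation in degrees: 90" in 3.04.00 for the same image.

```python
NEW_OUTPUT = """
Page number: 0
Orientation in degrees: 270
Rotate: 90
Orientation confidence: 29.34
Script: Latin
Script confidence: 45.33
"""

OLD_OUTPUT = """
Orientation: 3
Orientation in degrees: 90
Orientation confidence: 29.34
Script: 1
Script confidence: 45.33
"""


def parse_osd(text):
    """Collect the 'Key: value' lines of orientation output into a dict."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip blank lines and anything without a colon
            fields[key.strip()] = value.strip()
    return fields


def correction_angle(fields):
    """Return the rotation value in degrees for either output format.

    Assumption (inferred from the two samples only): the newer format
    carries the value in 'Rotate', and the older format carries the
    equivalent value in 'Orientation in degrees'.
    """
    if "Rotate" in fields:
        return int(fields["Rotate"])
    return int(fields["Orientation in degrees"])


for sample in (NEW_OUTPUT, OLD_OUTPUT):
    print(correction_angle(parse_osd(sample)))
```

Keying on the presence of "Rotate" avoids having to check the tesseract version string, but it only holds if no other release mixes the two layouts.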



On Fri, Feb 19, 2016 at 16:28 Sean Whitton <spwhitton@spwhitton.name> wrote:
Hello,

On Fri, Feb 19, 2016 at 10:45:51PM +0000, James R Barlow wrote:
> In any case, could you try running this:
> ocrmypdf --rotate-pages tests/resources/cardinal.pdf out.pdf
>
> In cardinal.pdf the same page is rotated in each cardinal direction. out.pdf
> should have all pages facing up. Is this the case? The output will also give
> information on rotation status:
> INFO - 1: page is facing ⇧, confidence 18.69
> INFO - 3: page is facing ⇩, confidence 21.86 - correcting rotation
> INFO - 4: page is facing ⇦, confidence 20.71 - correcting rotation
> INFO - 2: page is facing ⇨, confidence 21.63 - correcting rotation
> INFO - 3: rotating image layer 180 degrees
> INFO - 2: rotating image layer 90 degrees
> INFO - 4: rotating image layer 270 degrees

No, it gets it wrong.  Result attached, and the output:

,----
| root@artemis:/build/ocrmypdf-4.0.1# ocrmypdf --rotate-pages tests/resources/cardinal.pdf out.pdf
| INFO -    1: page is facing ⇧, confidence 18.69
| INFO -    2: page is facing ⇦, confidence 21.63 - correcting rotation
| INFO -    3: page is facing ⇩, confidence 21.86 - correcting rotation
| INFO -    4: page is facing ⇨, confidence 20.71 - correcting rotation
| INFO -    2: rotating image layer 270 degrees
| INFO -    3: rotating image layer 180 degrees
| INFO -    4: rotating image layer 90 degrees
`----

(note that the order it processes the pages in is different to your example)

> It would also help to try in python3:
>
> >>> import ocrmypdf.leptonica as lp
> >>> lp.getLeptonicaVersion()
>
> ...to see if there's anything unusual about how debian sid is reporting the
> leptonica version.

,----
| root@artemis:/build/ocrmypdf-4.0.1# cd /usr/lib/python3/dist-packages
| root@artemis:/usr/lib/python3/dist-packages# python3
| Python 3.5.1+ (default, Jan 13 2016, 15:09:18)
| [GCC 5.3.1 20160101] on linux
| Type "help", "copyright", "credits" or "license" for more information.
| >>> import ocrmypdf.leptonica as lp
| >>> lp.getLeptonicaVersion()
| 'leptonica-1.73'
`----

--
Sean Whitton
