Audiveris and tesseract [Hervé Bitteur] Re: Subject: some suggestions
This is future stuff since audiveris is not yet packaged for Debian.
However, since audiveris 3.2 is around the corner, I decided to forward
this request for help in case anyone is interested in working on this.
It is about Audiveris and Tesseract-OCR integration on Linux.
Audiveris is an open-source Java-based optical music recognition
engine. It primarily features a graphical user interface, but there
is also a way to have it batch-process scanned pages.
--- Begin Message ---
Thank you for your mail, see my comments in line.
I'm very busy working on 3.2.
Postner wrote:
> Subject: some suggestions
> Hello Hervé,
> I have tried version 3.1 of Audiveris for scanning some pages of a
> piano étude, and was impressed how accurately it already works. But
> when I tried to scan a full orchestral score it aborted on a null
> pointer, so I was curious whether the trunk version perhaps is more
> advanced on this point.
3.1 is about 1 1/2 years old now. It's time to switch to the new one, see
> When I tried to compile and run version 3.2 out of CVS, I discovered
> that there is no Linux version of the "Tesjeract" JNI class you are
> using. Although there is a Tesseract-OCR package for the Ubuntu
> distribution, it is not trivial to write a JNI wrapper under Linux,
> because Tesseract is distributed only as an executable application
> (the Tesseract DLL is written for Visual C and exists only under
> Windows). Therefore I did not investigate further, but if you are
> interested I could try to find a way to get Tesseract connected to
> Java under Linux.
Yesterday (September 18), I committed to CVS a cleaner version with
respect to the connection to Tesseract.
In short, there is now a CharsRetriever interface for which the
TesseractOCR class should allocate the proper implementation, depending
on the hosting OS (Mac, Linux or Windows).
For the time being, only the WindowsCharsRetriever implementation exists
(it uses Tesjeract and tessdll).
On other OSes, the OCR allocation should fail gracefully and Audiveris
should keep running without the OCR engine: the rest of the handling
of textual glyphs will still work, letting the user enter the precise
content of each such glyph via the glyph board, which is better than
So indeed if you could have a look at this new CharsRetriever interface
and provide a Linux implementation, that would be great!
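
To make the allocation described above concrete, here is a minimal sketch.
It is not the code committed to CVS: the retrieve() method, its signature
and the allocate() helpers are assumptions made for illustration, and
WindowsCharsRetriever is reduced to a placeholder. Only the names
CharsRetriever, TesseractOCR and WindowsCharsRetriever come from this mail.

```java
import java.util.Locale;
import java.util.Optional;

// Assumed shape of the interface; the real method names are not given in this thread.
interface CharsRetriever {
    /** Returns the text recognized in the given image file. */
    String retrieve(String imagePath) throws Exception;
}

// Placeholder standing in for the existing Tesjeract/tessdll-based implementation.
class WindowsCharsRetriever implements CharsRetriever {
    @Override
    public String retrieve(String imagePath) {
        throw new UnsupportedOperationException("placeholder for the Tesjeract/tessdll call");
    }
}

class TesseractOCR {
    /** Picks an implementation from an os.name value; empty means "no OCR on this OS". */
    static Optional<CharsRetriever> allocate(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.startsWith("windows")) {
            return Optional.of(new WindowsCharsRetriever());
        }
        // Linux and Mac: no implementation yet, so report "none" instead of throwing.
        return Optional.empty();
    }

    /** Same as above, using the OS reported by the running JVM. */
    static Optional<CharsRetriever> allocate() {
        return allocate(System.getProperty("os.name", ""));
    }
}

class OcrAllocationDemo {
    public static void main(String[] args) {
        Optional<CharsRetriever> ocr = TesseractOCR.allocate();
        System.out.println(ocr.isPresent()
                ? "OCR engine available"
                : "No OCR engine: text content must be entered by hand");
    }
}
```

Returning an empty Optional rather than throwing is one way to let the
caller carry on without OCR; a Linux implementation would simply add one
more branch in allocate().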
Please keep me posted, perhaps I can help you.
I can release a 3.2 beta version this weekend, so that you can have all
pieces at your disposal.
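
Since Tesseract on Linux is available as an executable, one possible route
for a Linux implementation is to run that executable as a child process
instead of going through JNI. The sketch below is only an assumption about
how this could look: LinuxTesseractRunner and its methods are hypothetical
names, and it relies on the usual command-line form
"tesseract <image> <outputbase> -l <lang>", which writes the recognized
text to <outputbase>.txt.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

// Hypothetical helper, not part of Audiveris.
class LinuxTesseractRunner {
    /** Builds the command line: tesseract <image> <outputbase> -l <lang>. */
    static List<String> buildCommand(Path image, Path outputBase, String lang) {
        return List.of("tesseract", image.toString(), outputBase.toString(), "-l", lang);
    }

    /** Runs the tesseract executable on the image and returns the recognized text. */
    static String recognize(Path image, String lang) throws IOException, InterruptedException {
        Path dir = Files.createTempDirectory("ocr");
        Path outputBase = dir.resolve("result");
        // start() throws IOException if the tesseract executable is not installed.
        Process process = new ProcessBuilder(buildCommand(image, outputBase, lang))
                .redirectErrorStream(true)
                .redirectOutput(ProcessBuilder.Redirect.DISCARD)
                .start();
        int exitCode = process.waitFor();
        // tesseract appends ".txt" to the output base name.
        Path textFile = dir.resolve("result.txt");
        try {
            if (exitCode != 0) {
                throw new IOException("tesseract exited with code " + exitCode);
            }
            return new String(Files.readAllBytes(textFile), StandardCharsets.UTF_8);
        } finally {
            // Clean up the temporary output file and directory.
            Files.deleteIfExists(textFile);
            Files.deleteIfExists(dir);
        }
    }
}
```

A Linux CharsRetriever could delegate to something like recognize(), at
the cost of starting one process per call.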
> On the home page you mentioned the integration with a "Visual Music
> Editor". Do you know Lilypond? It is not at all "visual", but it is
> the best open source music notation system. It would be interesting to
> output the Lilypond files directly without going through MusicXML.
> There would then be no need for injecting score entities, because
> scanning errors can easily be corrected in the Lilypond files. What do
> you think of it?
I kind of disagree here. The purpose of a pivot format like MusicXML is
precisely to avoid the need to develop numerous point-to-point
connections between programs. For example, this is how Audiveris
(scanner) and XenoPlay (sequencer) are integrated today.
Googling "lilypond musicxml" already points you to converters between
these 2 formats...
> Best regards,
> Harald Postner
--- End Message ---
⡍⠁⠗⠊⠕ | Debian Developer <URL:http://debian.org/>
.''`. | Get my public key via finger email@example.com
: :' : | 1024D/7FC1A0854909BCCDBE6C102DDFFC022A6B113E44
`- <URL:http://delysid.org/> <URL:http://www.staff.tugraz.at/mlang/>