
Re: Opinions on libhyphenate (a clean rewrite of libhyphen)?



Hi,

2008/3/24 Rene Engelhard <rene@openoffice.org>:
> [ more or less fullquoting for upstream ]
>
>  Hi,
>
>  Steve Wolter wrote:
>  > after some not-so-pleasant tries to use and then to patch libhyphen
>  > (back then it was called libhnj) early last year, I've reimplemented
>  > the algorithm in C++ in a library called libhyphenate.
>  >
>  > I deem libhyphenate considerably easier to use. In addition to all
>  > libhyphen features, it supports system-central storage of hyphenation
>  > pattern files, hyphenation at all possible hyphenation points, hyphe-
>  > nation of text such that it optimally fits a given width (in characters)
>  > and hyphenation using the libhyphen-style hyphens array.
>  >
>  > In addition, it fixes the libhyphen TODO of handling UTF-8 characters
>  > and the not yet filed bug that, for some languages, a hyphenation-free
>  > zone at the start and end of each word is needed to hyphenate correctly.
>  >
>  > If you want to have a look yourself and test this (bold) claim, the
>  > source code can be found at:
>  > http://swolter.sdf1.org/libhyphenate_1-current.tar.gz
>  >
>
>  > In order to avoid having two libraries around doing essentially the
>  > same thing, I've reimplemented the public libhnj/libhyphen interface
>  > for libhyphenate. You can find the implementation at:
>  > http://swolter.sdf1.org/libhyphen-hyphenate-1.0.tar.gz
>  >
>  > What do you think of the work?
>
>  I don't think we (as the Debian maintainers) are the persons to decide
>  whether/when upstream will switch to libhyphenate.
>
>  I think we should involve upstream (and the author of libhyphen) in this
>  (Cced)
>  Lazlo/tl, what do you think of this?

As far as I know, the original LibHnj library also has algorithms for
justification, including with variable-width characters. You can check
this in the LibHnj package in Debian.
The aim of hyphenation development in the Lingucomponent project is a
competitive hyphenation algorithm and library for OpenOffice.org and
other applications. Justification (typesetting paragraphs) is a
different task (perhaps with more complex problems; for example, see
this illustrated paper about mathematical typesetting in TeX:
http://www.tug.org/TUGboat/Articles/tb27-1/tb86jackowski.pdf).

I have tested your code with non-standard hyphenation (only the
current distribution), but it doesn't work for me.

Hyphen 2.4 (http://downloads.sourceforge.net/hunspell/hyphen-2.4.tar.gz)
already has built-in hyphenmin support (the main reason for your
development). Moreover, it has other new features for better German
hyphenation: compound word hyphenation and compound hyphenmins. I
believe this is a significant improvement in pattern-based hyphenation
for languages with an arbitrary number of compounds.
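
To illustrate what a hyphenmin setting does, here is a minimal sketch
in Python. It is not the Hyphen API: the function names
(apply_hyphenmin, show), the parameter names and the candidate break
points are all made up for illustration. It only shows the idea of
discarding break points that fall too close to the start or end of a
word.

```python
def apply_hyphenmin(word, break_points, left_min=2, right_min=2):
    """Filter candidate break points by hyphenmin limits.

    A break point i means "hyphenate between word[:i] and word[i:]".
    Keep it only if at least left_min characters precede the hyphen
    and at least right_min characters follow it.
    (Hypothetical helper, not part of any library.)
    """
    n = len(word)  # counts Unicode code points, not bytes
    return [i for i in break_points if left_min <= i <= n - right_min]


def show(word, break_points):
    """Render the word with a '-' at each kept break point."""
    points = set(break_points)
    return "".join(("-" if i in points else "") + ch
                   for i, ch in enumerate(word))


# Assumed candidates, as a pattern matcher might return them.
candidates = [1, 2, 4, 6]
kept = apply_hyphenmin("example", candidates, left_min=2, right_min=3)
print(show("example", kept))
```

With these assumed candidates, the points at 1 and 6 are dropped
because they would leave fewer than 2 characters before, or fewer
than 3 characters after, the hyphen.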

There is a related NLP library that centralizes spelling dictionaries
and spelling engines: Enchant from the AbiWord project; see
http://www.abisource.com/projects/enchant/.
I believe extending its API and code with hyphenation is a better
route to system-central storage, since it reduces the cost of the
common tasks (portability issues, dependency problems, registration
and listing of the available dictionaries, handling private
dictionaries, character encoding issues etc.). It would be great if
you could help with this task.

Adding hyphenation to the Mozilla code base and its applications
would also be a good task. The basic requirements are the license
(GPL/LGPL/MPL tri-license) and the CSS standard for hyphenation (see
http://www.w3.org/TR/css3-text/#hyphenate for the upcoming hyphenation
support in CSS). The Hyphen library would have a big advantage in
Mozilla integration: it is only 37 kB. (Small code size is crucial for
Mozilla development.)

Regards,
László


>
>  > If you find it workable, I'd love to try and test whether it works
>  > properly in the current OpenOffice environment. If not, however,
>
>  Well, it's OpenOffice.org, but yes, you can try. If it's indeed a 1:1
>  replacement and you can make sure it works as intended we can think of
>  trying it. But I don't see the need for a hurry for it now.
>
>  > I'd like to point out that libhyphenate needs a Debian sponsor ;-).
>
>  This should be doable :)
>
>  Regards.
>
>  Rene
>
> -----BEGIN PGP SIGNATURE-----
>  Version: GnuPG v1.4.6 (GNU/Linux)
>
>  iD8DBQFH6B8m+FmQsCSK63MRAtgWAJ92QxqROym7qOdPD7hU6IpzHsYHGACdHVSk
>  i7ROmTPSUpXzwiLLF2jpejk=
>  =tktP
>  -----END PGP SIGNATURE-----
>
>

