
Re: OO.o source and new hunspell dicts packaging



On Saturday 26 April 2008 at 18:15 +0200, Rene Engelhard wrote:
> [ WTF did you remove the Cc to debian-openoffice? ]
Sorry, I did not notice you added it and I just hit "reply to sender".
> 
> Hi,
> 
> Milan Bouchet-Valat wrote:
> > I'm thinking of the new French dictionary that has been updated to use
> > hunspell and which is very nice. But I can see about 25 dicts in the
> 
> And what are they built from? The myspell version was built from an
> external wordlist and so myspell-fr is built and packaged from there
> directly.
I don't really understand what you mean by "they" and "there". What I
know is that, at least for French, we have a new hunspell dict that has
been made available in OO.o 2.4 and is the one that is present in
openoffice.org-dictionaries. This one is much smarter than the current
myspell ones.

> > source while only 10 "binaries" are produced.
> 
> Because the rest is built by the extra packages (de from igerman98, nl
> from dutch, etc. Other ones are packages completely external)
OK, so if I understand how this works, you have two sources for the same
dict, so you prefer to "build" them from their independent sources. This
explains why they are not used from oo.o-dictionaries.

> > Yes. openoffice.org-dictionaries contains a few dicts that are not
> > used.
> 
> Let's look at the openoffice.org-dictionaries dirs:
> 
> af-ZA -> packaged
> cs-CZ -> packaged
> de_* -> packaged from igerman98
> en_* -> packaged
> es_ES -> packaged from espa-nol; README says it's a MySpell dictionary
> et_EE -> packaged from ispell-et; README says "OpenOffice.org
> spellchecker" and it's from 2004, so probably a MySpell dictionary
> fr_* -> ok, that's already the hunspell version...
So this one is not present in Debian, which is a gap.

> hu_HU -> packaged from magyarispell; README inside OOo says that it's
> the same source
> it_IT -> packaged
> lt_LT -> packaged from ispell-lt
> ne_NP -> packaged
> nl_NL -> packaged from dutch; README in OOos code says "Format: MySpell"
> pl_PL -> packaged from ipolish
> sk_SK -> packaged from myspell-sk
> sl_SI -> packaged from myspell-sl
> sv_SE -> packaged from myspell-sv
> sw_TZ -> packaged
> th_TH -> packaged
> zu_ZA -> hmm, indeed missing.
Packaged, according to your next mail.

> For the various external myspell-* source packages those maintainers
> should upload a hunspell version in addition if there is one, not my job
> (I fear, though, that some of them might have uploaded hunspell dicts as
> myspell-*...)
> 
> So what about your claim that 15 are not used? They are just packaged
> externally, where they belong. I find it better building those
> dictionaries directly from their source instead of just shipping the
> "binaries"...
I'd just missed the possibility of two sources existing. They're not
used *from oo.o-dicts*, but they are built from their own sources. So
no problem; sorry for my hasty conclusion.

What misled me is that Fedora has many hunspell packages, and since a
new hunspell dict appeared for French, I thought the same was true for
many languages. But maybe Fedora just took their myspell dicts and
repackaged them as hunspell ones without changing them: that is fine and
allows switching progressively to true hunspell dicts.

Maybe then I should just check we don't miss some new hunspell dictionaries and then file bugs for each package.
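
To make that check concrete, here is a rough sketch in Python. It only
transcribes the statuses from the list quoted above; it does not query
the Debian archive. The names STATUS, unpackaged and external are made
up for illustration, and I am assuming that a plain "packaged" means no
separate source package was named.

```python
# Each directory in openoffice.org-dictionaries, as listed in the quoted
# mail: (is it packaged in Debian?, external source package if one is named).
# These values are copied from the thread, not checked against the archive.
STATUS = {
    "af-ZA": (True, None),
    "cs-CZ": (True, None),
    "de_*": (True, "igerman98"),
    "en_*": (True, None),
    "es_ES": (True, "espa-nol"),
    "et_EE": (True, "ispell-et"),
    "fr_*": (False, None),  # the hunspell version, not in Debian
    "hu_HU": (True, "magyarispell"),
    "it_IT": (True, None),
    "lt_LT": (True, "ispell-lt"),
    "ne_NP": (True, None),
    "nl_NL": (True, "dutch"),
    "pl_PL": (True, "ipolish"),
    "sk_SK": (True, "myspell-sk"),
    "sl_SI": (True, "myspell-sl"),
    "sv_SE": (True, "myspell-sv"),
    "sw_TZ": (True, None),
    "th_TH": (True, None),
    "zu_ZA": (True, None),  # packaged, according to the follow-up mail
}

# Directories with no Debian package at all.
unpackaged = sorted(d for d, (packaged, _) in STATUS.items() if not packaged)

# Directories built from a separate source package; each of these is a
# candidate for a "please provide a hunspell version" bug.
external = {d: src for d, (_, src) in STATUS.items() if src is not None}

print("not packaged:", unpackaged)
for d, src in sorted(external.items()):
    print(f"check {src} for a hunspell version of {d}")
```

Whether each external source really has a newer hunspell dictionary
still has to be checked by hand; the script only lists where to look.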



> > I'm wondering where the dictionaries in openoffice.org-dictionaries come
> > from. On the OO.o wiki, many more dicts are listed than in the source
> 
> Those are the dictionaries which are in OOo's source.
> 
> > package, and their package is much bigger (58 MB). Are they only bad
> > versions?
> 
> wiki != OpenOffice.org source.
> 
> Those are not the dictionaries which are in OOo's source. They come from
> wherever. And many of them are legally questionable/violate licenses or
> are for whatever else reason not in OOos source.
Sure, this is quite complex to manage. So OO.o does not ship with as many
dicts as it may appear to. It seems that upstream is working towards
integrating more and better spellcheckers, plus hyphenation and thesauri.

So in the end, is the fr_FR hunspell version the only dictionary that is
not packaged? I overestimated the amount of work needed... ;-)
Is there any problem with making it a new package so we can benefit from
the updated spellchecker?

Thanks again



