
Re: enable searching East Asian words at search.debian.org



At Tue, 13 May 2003 15:24:15 +1000,
Craig Small wrote:
> I'm working on 3.2.10 that will have the charset support.  Do I also
> need to include chasen? Without chasen, mnogosearch will not understand
> a Japanese "word"?

In general, Japanese text does not put space characters at word
boundaries, so you need morphological analysis to split it into
words.
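
To illustrate why whitespace splitting is not enough, here is a minimal
Python sketch. It is only a toy: the mini-dictionary and the function
name longest_match_split are invented for this example, and greedy
longest-match is a stand-in technique, not what ChaSen does (ChaSen
performs real morphological analysis with a full dictionary).

```python
# Toy illustration only: a real analyzer such as ChaSen uses a full
# dictionary and proper morphological analysis, not greedy matching.

# Hypothetical mini-dictionary, invented for this example.
TOY_DICT = {"東京", "都", "に", "住む"}


def longest_match_split(text, dictionary):
    """Greedy longest-match segmentation.

    Characters not covered by the dictionary become single-character tokens.
    """
    max_len = max(len(word) for word in dictionary)
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink down to one character.
        for size in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + size]
            if size == 1 or piece in dictionary:
                tokens.append(piece)
                i += size
                break
    return tokens


sentence = "東京都に住む"

# Whitespace splitting finds no boundaries, so the sentence stays one token.
print(sentence.split())

# Dictionary-based splitting recovers word-like units.
print(longest_match_split(sentence, TOY_DICT))
```

Without some segmentation step like this, an indexer would treat the
whole sentence as a single "word".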

By the way, there is another issue with ChaSen. Its dictionary
(ipadic) comes with two license texts, one in Japanese and one in
English. The license written in Japanese is DFSG-free, but the other
license, written in English, is questionable.
http://lists.debian.org/debian-legal/2001/debian-legal-200104/msg00062.html
(I think it is a gray area...)

I tried to ask upstream to change the English license, but the
licensor, a governmental organization, had already been dissolved, so
I couldn't do that.

Now I'm trying to make another, DFSG-free dictionary for ChaSen. If I
can do it, I'll move the ipadic package to non-free and file an ITP
for the new one.

Another solution is to use libkakasi instead of libchasen. It is
completely free.
-- 
NOKUBI Takatsugu
E-mail: knok@daionet.gr.jp
	knok@namazu.org / knok@debian.org


