
Re: search.debian.org is online



From: csmall@enc.com.au (Craig Small)
Subject: Re: search.debian.org is online
Date: Thu, 16 Jan 2003 09:35:00 +1100

> On Wed, Jan 15, 2003 at 03:32:43PM +0900, Tomohiro KUBOTA wrote:
> > 
> > I'd like the mnoGoSearch of search.debian.org to be recompiled
> > with extra-charsets enabled, because it (I expect) immediately
> > benefits Korean.  (Note that Korean doesn't have the problem 2).
> > Since it doesn't need the newer version of mnoGoSearch with ChaSen
> > support (CVS version 3.2.8, to solve problem 2), it can be done now!
> 
> Except we're using UTF-8, so it shouldn't matter, I think.

mnoGoSearch uses Unicode internally for its indexing and searching
in the current configuration, as you wrote.  It therefore has to convert
HTML files into Unicode before processing them, and for that it needs
converters.  The default build of mnoGoSearch omits the converters to
Unicode from East Asian encodings (ISO-2022-JP, EUC-KR, Big5, GB2312),
and this is why it can neither index nor search East Asian pages.
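
To illustrate the point, here is a rough sketch in Python.  It is not
mnoGoSearch code: Python's codecs only stand in for the converters, and
the names to_unicode, DEFAULT_BUILD and EXTRA_BUILD, as well as the
contents of the two sets, are assumptions made up for this example.

```python
from typing import Optional, Set

# Hypothetical converter sets; the real lists depend on how mnoGoSearch is built.
DEFAULT_BUILD = {"iso-8859-1", "utf-8"}
EXTRA_BUILD = DEFAULT_BUILD | {"iso-2022-jp", "euc-kr", "big5", "gb2312"}


def to_unicode(raw: bytes, charset: str, available: Set[str]) -> Optional[str]:
    """Decode a page to Unicode, or return None if no converter is available."""
    if charset.lower() not in available:
        # Without a converter the page can be neither indexed nor searched.
        return None
    return raw.decode(charset)


# A short Korean string encoded as EUC-KR, standing in for a fetched page.
page = "데비안".encode("euc-kr")

print(to_unicode(page, "EUC-KR", DEFAULT_BUILD))  # converter missing
print(to_unicode(page, "EUC-KR", EXTRA_BUILD))    # converter present
```

With the default set the page is dropped before indexing even starts,
no matter that the index itself is kept in Unicode.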

Compiling with the extra-charsets ./configure option enabled will add
these converters.  Japanese and Chinese have a further problem
(problem 2), but Korean should be solved by this alone.

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/



