
Re: enable searching East Asian words at search.debian.org



Hi,

I have had no reply for more than a week.  Could someone please respond?
There are Chinese, Japanese, and Korean translations of www.debian.org,
but search.debian.org cannot search for words in these languages.

Please do the following:

1. Install the libchasen-dev, libchasen0, and ipadic packages on klecker.
2. Add me (kubota@debian.org) as a user of the PostgreSQL database on klecker.
3. Create a PostgreSQL database on klecker for which I have write permission.

Then I can demonstrate the improvement (or bugfix, as I regard it) described
in my last mail, which I quote in full below.


From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Subject: enable searching East Asian words at search.debian.org
Date: Sat, 26 Apr 2003 09:45:48 +0900 (JST)

> Hi,
> 
> So far search.debian.org doesn't support East Asian languages
> (Chinese, Japanese, and Korean).  I.e., it cannot search for Chinese,
> Japanese, or Korean words.
> 
> I have recently researched this problem and I think I have found
> how to fix it.  I tested it on my personal machine, which has no
> permanent Internet connection, and it works almost fine.
> 
>  1. Install the libchasen-dev, libchasen0, and ipadic packages.
>  2. Recompile mnogosearch (version 3.2.8 or later) with the
>     --enable-chasen --with-extra-charsets=all options for ./configure .
>  3. Invoke "indexer -C" and then "indexer" to rebuild the search database.
> 
> Could someone do this?  Or could I have write access to a PostgreSQL
> database on klecker to demonstrate this?
> 
> 
> Explanation:
> 
> The chasen packages are needed to extract words from Japanese text,
> because Japanese text doesn't use whitespace between words.  The
> --enable-chasen option (available since version 3.2.8) lets
> mnogosearch use chasen.
> 
> Though mnogosearch is Unicode-based software and potentially supports
> East Asian languages, support for these languages is disabled by default.
> To enable it, --with-extra-charsets=all is needed.
> 
> Since the current search database on search.debian.org doesn't contain
> any East Asian words, the whole database needs to be rebuilt.
> (Of course it would be enough to rebuild the database only for the
> *.{ja,ko,zh-cn,zh-hk,zh-tw}.html pages, but I don't know whether it is
> possible to do this.)

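As for limiting a rebuild to the translated pages: I don't know the
mnogosearch-side mechanism, but picking out those files is simple.  A
hypothetical Python sketch follows, assuming pages are named like
name.LANG.html as in the pattern above.  The function name and the
regular expression are mine, not part of mnogosearch.

```python
import re

# Language suffixes taken from the pattern *.{ja,ko,zh-cn,zh-hk,zh-tw}.html
EAST_ASIAN_PAGE = re.compile(r"\.(ja|ko|zh-cn|zh-hk|zh-tw)\.html$")


def is_east_asian_page(path):
    """Return True if the file name ends in one of the East Asian suffixes."""
    return EAST_ASIAN_PAGE.search(path) is not None


# Hypothetical file names, for illustration only.
pages = ["index.en.html", "index.ja.html", "intro/about.zh-tw.html",
         "index.de.html", "index.ko.html"]
selected = [p for p in pages if is_east_asian_page(p)]
print(selected)
```
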
---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/



