Re: search.debian.org is online
Hi,
Some additions.
From: Tomohiro KUBOTA <debian@tmail.plala.or.jp>
Subject: Re: search.debian.org is online
Date: Mon, 30 Dec 2002 19:53:31 +0900 (JST)
> > > Note that, if this problem is fixed, Korean people will benefit very
> > > much even if the word-separation problem is not fixed.
> > I don't understand. Are you saying that Korean uses two-byte characters
> > but doesn't have spaces in words and should be ok now?
>
> The current version of Debian search site has two problems for east Asian
> languages:
I meant that Korean uses two-byte characters but DOES have spaces between
words and should be ok now. (Chinese and Japanese use two-byte characters
and DON'T have spaces between words.)
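
A minimal Python sketch of the difference, assuming an indexer that simply
splits text on whitespace (an assumption for illustration only; I have not
checked how the search engine actually tokenizes). The sample phrases and
the tokens() helper are hypothetical:

```python
# Whitespace tokenization: enough for Korean, not for Japanese.
korean = "데비안 검색 사이트"    # "Debian search site", words separated by spaces
japanese = "デビアン検索サイト"  # the same phrase, written without spaces


def tokens(text):
    """Split text into words the way a naive whitespace-based indexer would."""
    return text.split()


print(len(tokens(korean)))    # 3: each word becomes its own index term
print(len(tokens(japanese)))  # 1: the whole phrase is a single index term
```

So once two-byte characters are handled correctly, Korean words can be found
as they are, while Japanese and Chinese still need real word separation.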
> However, two-byte search doesn't always fail. For example, I reported
> in http://lists.debian.org/debian-www/2002/debian-www-200212/msg00256.html
> that I can search my name. I guess the condition when a search succeeds
> or fails depends on whether the Japanese word is written in normal EUC-JP
> encoding or in HTML "&#xxxx;" expression where xxxx is UTF-8 codepoint.
> When the word is written in "&#xxxx;" expression, the search succeeds
> while the word is written in normal EUC-JP encoding, the search fails.
s/EUC-JP/ISO-2022-jp/
Note that Japanese WML sources are written either in EUC-JP or ISO-2022-JP.
However, the Japanese HTML pages on the Debian web site are all written in
ISO-2022-JP.
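
To illustrate why the representation might matter, here is a small Python
sketch that shows the same word in the three forms discussed above. It only
uses Python's standard codecs; the sample word and the variable names are my
own, and it says nothing about what the search engine does internally:

```python
# One Japanese word in three representations.
word = "日本語"  # sample word, chosen only for illustration

euc = word.encode("euc_jp")      # EUC-JP: every byte has the high bit set
jis = word.encode("iso2022_jp")  # ISO-2022-JP: 7-bit bytes inside escape sequences
ncr = "".join(f"&#{ord(ch)};" for ch in word)  # HTML numeric character references

print(euc.hex(" "))
print(jis)
print(ncr)  # &#26085;&#26412;&#35486;

# ISO-2022-JP wraps the two-byte characters in ESC $ B ... ESC ( B,
# and the bytes in between look like ordinary 7-bit ASCII.
assert jis.startswith(b"\x1b$B") and jis.endswith(b"\x1b(B")
assert all(b < 0x80 for b in jis)
```

An indexer that treats the input as plain 8-bit or ASCII text would see three
quite different byte strings here, which may be why the "&#xxxx;" form is
found while the ISO-2022-JP form is not.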
---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/