Re: Multibyte encoding - what should a package provide?

On Fri, 10 Sep 1999, Tomohiro KUBOTA wrote:

> > kubota> Please note, Unicode is not popular at all in Asia. I am sure
> > 
> > why is it not popular?  what are the reasons?  i keep hearing this, but
> I said 'Asia', but I know only about Japan, Korea, and China.
> How about other countries in Asia?  Are there any members from
> these countries in the Debian Project?  If there are, please add
> comments.

It's okay; many people say "Asia" when they really mean East Asia
(China/Japan/Korea(s)/Taiwan/Hong Kong).
I cannot speak for anyone else, but there is:
  - TCVN in Vietnam (8 bit; alphabetic)
  - VISCII outside Vietnam for expatriates (8 bit; alphabetic)
  - TIS-620 in Thailand (8 bit; alphabetic)
      part of Unicode, and there's also a HOWTO file for Linux
  - ISCII in India for all/most of the Indic scripts (8 bit; alphabetic)
      part of Unicode
  - some system for Tamil (an Indic script) in Singapore (?)
  - ISCII (different acronym) in Iran
  - KPS in North Korea (16 bit; alphabetic/logographic)
      maybe not used
  plus certainly many others...
Some countries/languages do not have standards yet (e.g., Mongolia),
or do not have much of an IT infrastructure (e.g., Khmer, Miao),
so Unicode might actually be their first standardized character set.
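
To make the "part of Unicode" remark concrete for one entry in the list,
here is a minimal Python sketch that converts TIS-620 bytes to Unicode and
back. It assumes Python's built-in "tis_620" codec; the helper names
tis620_to_unicode and unicode_to_tis620 are made up for illustration, and
the two sample bytes are the first two Thai consonants.

```python
# Sketch only: relies on the "tis_620" codec shipped with Python.
# Helper names are hypothetical, chosen for this example.

def tis620_to_unicode(data: bytes) -> str:
    """Decode 8-bit TIS-620 bytes into a Unicode string."""
    return data.decode("tis_620")


def unicode_to_tis620(text: str) -> bytes:
    """Encode a Unicode string back into 8-bit TIS-620 bytes."""
    return text.encode("tis_620")


# 0xA1 and 0xA2 are the first two Thai consonants in TIS-620.
sample = bytes([0xA1, 0xA2])
decoded = tis620_to_unicode(sample)
print([hex(ord(ch)) for ch in decoded])  # ['0xe01', '0xe02']

# Each single byte maps to one Unicode code point, so the round trip
# returns the original bytes.
print(unicode_to_tis620(decoded) == sample)  # True
```

The same pattern should work for any other legacy character set that
Python provides a codec for; TCVN and VISCII are not among the standard
codecs as far as I know.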

> 2. Japan, Korea, and China have similar but different characters which 
>    have the same origin.  Unicode unified similar characters for a 
>    technical reason -- 16 bits is insufficient.  Though Japanese, Korean,
>    and Chinese have similar characters, they are different.  Some of
>    us don't care, and some do -- for example, people whose names
>    cannot be correctly expressed in Unicode, people who research
>    languages, and so on.

As for unavailable characters, Unicode doesn't (yet) have the characters
for some placenames in Hong Kong, or characters for writing Cantonese
(Kantongo).  But even the Big5 character set which Hong Kong has borrowed
from Taiwan is insufficient, so Hong Kong has added an extension called
GCCS with 3,049 characters.  (All of Big5 is in Unicode, but not all of

Thomas Chan
