Re: Asian Problems with Unicode

To: debian-devel@lists.debian.org
Subject: Re: Asian Problems with Unicode
From: Tomohiro KUBOTA <kubota@surfchem0.riken.go.jp>
Date: Mon, 13 Sep 1999 18:17:15 +0900
Message-id: <19990913181715U.kubota@surfchem0.riken.go.jp>
In-reply-to: Your message of "11 Sep 1999 09:11:34 -0700" <87hfl1pgw9.fsf@mata.intrigue.com>
References: <87hfl1pgw9.fsf@mata.intrigue.com>

Robert Coie:
> Aside from the concerns which have been brought up so far, another
> potential reason for lack of adoption of Unicode is the inefficiency
> of UTF-8 as a storage format (at least for Japanese text).  One of the
> design goals of UTF-8 was upwards compatibility with 7-bit ASCII.
> Another was context-free parsing (i.e. a byte's meaning can be
> determined without reference to the bytes surrounding it).  While both
> of these goals have merit, an unfortunate side-effect is that
> characters that take up 2 bytes in various Japanese character sets
> take up 3 bytes in UTF-8.

> This can be worked around by saving in UCS-2 instead, but then ASCII
> users complain, as characters that previously took 1 byte to store now 
> take 2.

I think this inefficiency is a reasonable and acceptable price
for using a universal stateless codeset.  If you can't
accept such an inefficiency, you can use ISO 2022, which is
another universal codeset, though a stateful one.
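
To make the size difference concrete, here is a minimal Python sketch
that counts the bytes needed for the same text in several encodings.
The sample strings and the helper name encoded_size are chosen only for
illustration, and "utf-16-be" is used as a stand-in for UCS-2, which is
an assumption that holds for these characters.

```python
# Byte counts for the same text in several encodings.
# The sample strings and the helper name are illustrative assumptions.

def encoded_size(text, encoding):
    """Return the number of bytes needed to store text in encoding."""
    return len(text.encode(encoding))

japanese = "日本語"   # three kanji
ascii_text = "abc"   # three ASCII letters

# "utf-16-be" (no byte order mark) stands in for UCS-2 here.
for name in ("euc_jp", "shift_jis", "utf-16-be", "utf-8"):
    print(name, encoded_size(japanese, name), encoded_size(ascii_text, name))

# ISO-2022-JP is stateful: the encoded text begins with an escape
# sequence that switches from ASCII into JIS X 0208.
print(japanese.encode("iso2022_jp")[:3])
```

Each kanji takes 2 bytes in EUC-JP, Shift-JIS, and UCS-2 but 3 bytes in
UTF-8, while each ASCII letter takes 1 byte everywhere except UCS-2,
where it takes 2.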



David Starner:
> First place, are these standards mutually exclusive? Is it a problem in
> practice to work with both?

First, we don't have conversion software yet.  Such software will
probably appear soon, though I don't know for certain.

Next, I can use multiple codesets.  I am already using multiple codesets
because Windows/Macintosh (SHIFT-JIS) and Unix (EUC-Japan) use different
codesets, and the network needs yet another, 7-bit codeset (ISO-2022-JP).
These codesets are incompatible, but they can be converted to one another
by a simple equation, so I can use them together.
However, if important software such as the kernel, libraries,
gettext, terminal emulators, and so on decided to use Unicode
as the only codeset, I would have to give up all the software that
depends on them.
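
The kind of arithmetic involved can be sketched in Python as follows.
This is only an illustration under stated assumptions: it handles ASCII
and two-byte JIS X 0208 characters, rejects everything else (half-width
kana, JIS X 0212), and the function names jis_to_sjis and euc_to_sjis
are hypothetical.

```python
# Convert EUC-JP bytes to Shift-JIS bytes with plain arithmetic.
# Assumption: input contains only ASCII and JIS X 0208 characters.

def jis_to_sjis(j1, j2):
    """Map a 7-bit JIS X 0208 byte pair (0x21-0x7E each) to Shift-JIS."""
    if j1 % 2:                      # odd row: first half of a lead byte
        s2 = j2 + 0x1F
        if s2 >= 0x7F:              # 0x7F is skipped in Shift-JIS
            s2 += 1
    else:                           # even row: second half of a lead byte
        s2 = j2 + 0x7E
    s1 = (j1 + 1) // 2 + 0x70       # two JIS rows share one lead byte
    if s1 >= 0xA0:                  # jump over the 0xA0-0xDF range
        s1 += 0x40
    return s1, s2

def euc_to_sjis(data):
    """Convert EUC-JP encoded bytes to Shift-JIS encoded bytes."""
    out = bytearray()
    i = 0
    while i < len(data):
        b = data[i]
        if b < 0x80:                # ASCII passes through unchanged
            out.append(b)
            i += 1
        elif 0xA1 <= b <= 0xFE and i + 1 < len(data):
            # EUC-JP is the JIS code with the high bit set on both bytes.
            out.extend(jis_to_sjis(b & 0x7F, data[i + 1] & 0x7F))
            i += 2
        else:
            raise ValueError("unsupported byte 0x%02X at offset %d" % (b, i))
    return bytes(out)

sample = "abc 日本語"
print(euc_to_sjis(sample.encode("euc_jp")) == sample.encode("shift_jis"))
```

No lookup table is needed because all three codesets carry the same
JIS X 0208 row and column numbers; only the byte layout differs.
Converting to or from Unicode, by contrast, needs a mapping table.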

Yes, it is welcome when software that already supports various codesets
adds Unicode to its list.

---
Tomohiro KUBOTA <kubota@debian.or.jp>

Follow-Ups:
- Re: Asian Problems with Unicode
  - From: sen_ml@eccosys.com

References:
- Re: Asian Problems with Unicode
  - From: Robert Coie <rac@mata.intrigue.com>