Re: Asian Problems with Unicode

To: Robert Coie <rac@mata.intrigue.com>, debian-devel@lists.debian.org
Subject: Re: Asian Problems with Unicode
From: David Starner <dvdeug@x8b4e53cd.dhcp.okstate.edu>
Date: Sat, 11 Sep 1999 04:44:13 -0500
Message-id: <19990911044413.A5118@x8b4e53cd.dhcp.okstate.edu>
Reply-to: dstarner98@aasaa.ofe.org
In-reply-to: <199909110009.RAA05787@mata.intrigue.com>
References: <19990910002219E.1000@eccosys.com> <199909110009.RAA05787@mata.intrigue.com>

On Fri, Sep 10, 1999 at 05:09:12PM -0700, Robert Coie wrote:
> Aside from the concerns which have been brought up so far, another
> potential reason for lack of adoption of Unicode is the inefficiency
> of UTF-8 as a storage format (at least for Japanese text).  One of the
> design goals of UTF-8 was upwards compatibility with 7-bit ASCII.
> Another was context-free parsing (i.e. a byte's meaning can be
> determined without reference to the bytes surrounding it).  While both
> of these goals have merit, an unfortunate side-effect is that
> characters that take up 2 bytes in various Japanese character sets
> take up 3 bytes in UTF-8.
> 
> This can be worked around by saving in UCS-2 instead, but then ASCII
> users complain, as characters that previously took 1 byte to store now 
> take 2.

In the first place, are these encodings mutually exclusive? Is it a
problem in practice to work with both?
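
For what it's worth, converting between the two is mechanical. Here is a
minimal sketch in Python, under a few assumptions of mine: the helper
name transcode() and the sample string are made up for illustration, and
Python's "utf-16-le" codec stands in for UCS-2, which it matches for
characters in the Basic Multilingual Plane.

```python
def transcode(data: bytes, src: str, dst: str) -> bytes:
    """Decode bytes from one encoding and re-encode them in another."""
    return data.decode(src).encode(dst)


# Hypothetical sample mixing ASCII and Japanese text.
sample = "Debian は自由な OS です。"

utf8 = sample.encode("utf-8")

# "utf-16-le" is used here as a stand-in for UCS-2 (no byte-order mark).
ucs2 = transcode(utf8, "utf-8", "utf-16-le")

# Converting back should give exactly the bytes we started with.
back = transcode(ucs2, "utf-16-le", "utf-8")
print(back == utf8)
```

Nothing is lost in either direction, so a program could read one form
and write the other as needed.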

Second, this isn't a big deal. I don't believe most people have huge
amounts of uncompressed text lying about, at least not enough for a
doubling of the space to make a real difference. As for compressed
text, almost any compressor should get the text down to about the
same size whichever encoding it started in. (Feel free to prove me
wrong here with real numbers.)
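
If anyone wants to produce those numbers, here is a rough sketch of one
way to measure it in Python. The choices are my assumptions, not a
benchmark: zlib as the compressor, EUC-JP as the legacy Japanese
encoding, "utf-16-le" standing in for UCS-2, and a sizes() helper and
sample text invented for illustration.

```python
import zlib


def sizes(text: str) -> dict:
    """Map each encoding to (raw byte count, zlib-compressed byte count)."""
    result = {}
    for codec in ("euc_jp", "utf-8", "utf-16-le"):
        raw = text.encode(codec)
        result[codec] = (len(raw), len(zlib.compress(raw, 9)))
    return result


# Hypothetical sample. Repeating one sentence compresses far better than
# real prose would, so substitute an actual document for a fair test.
sample = "日本語のテキストは符号化によって大きさが変わります。" * 200

for codec, (raw, packed) in sizes(sample).items():
    print(f"{codec:10} raw={raw:7} zlib={packed:6}")
```

The raw column shows the 2-byte versus 3-byte difference directly; the
zlib column is the one that would confirm or refute my guess, and only
on real text.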

David Starner - dstarner98@aasaa.ofe.org

Follow-Ups:
- Re: Asian Problems with Unicode
  - From: Robert Coie <rac@mata.intrigue.com>

References:
- Re: Multibyte encoding - what should a package provide?
  - From: sen_ml@eccosys.com
- Re: Asian Problems with Unicode
  - From: Robert Coie <rac@mata.intrigue.com>
