Re: Multibyte encoding - what should a package provide?
Hi,
From: sen_ml@eccosys.com
Subject: Re: Multibyte encoding - what should a package provide?
Date: Fri, 10 Sep 1999 00:22:19 +0900
> kubota> Please note, Unicode is not popular at all in Asia. I am sure
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> kubota> there are very very few people using Unicode in Japan. Instead,
> kubota> EUC-JP is popular for UNIX and SHIFT-JIS is the OS's coding
> kubota> system for Windows/Macintosh in Japan.
>
> why is it not popular? what are the reasons? i keep hearing this, but
> i haven't come across an enumeration of those reasons.
>
> any pointers to documents related to this would be much appreciated.
I said 'Asia', but I can speak only for Japan, Korea, and China.
What about other countries in Asia? Are there any members from
those countries in the Debian Project? If so, please add your
comments.
1. Japan, Korea, and China each have their own standard character
codes. Unicode is unrelated to them and does not preserve
compatibility with these standard codes.
2. Japan, Korea, and China have similar but distinct characters that
share a common origin. Unicode unified these similar characters for
a technical reason: 16 bits are not enough to encode them
separately. Although the Japanese, Korean, and Chinese characters
are similar, they are different. Some of us don't care, but some
do: for example, people whose names cannot be written correctly in
Unicode, people who research languages, and so on.
Large-scale software such as Tcl/Tk 8.1 sometimes uses Unicode as
an INTERNAL codeset. Such software has an automatic code-conversion
facility that works for all input and output (keyboard, display,
files, and everything else). Such an implementation is acceptable
because users never need to handle Unicode directly.
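
To illustrate the "Unicode inside, local codeset outside" design, here
is a minimal sketch in Python (not Tcl). The function names
read_external and write_external are hypothetical and not taken from
any real package; the only library facility assumed is Python's
standard codec support for "euc_jp" and "shift_jis".

```python
# Sketch: Unicode is used only as the internal representation.
# Conversion to and from the user's local codeset happens at the
# input/output boundary, so the user never sees Unicode.
# read_external / write_external are hypothetical names.

def read_external(data: bytes, codeset: str) -> str:
    """Decode bytes in a local codeset into the internal Unicode string."""
    return data.decode(codeset)


def write_external(text: str, codeset: str) -> bytes:
    """Encode the internal Unicode string into a local codeset for output."""
    return text.encode(codeset)


internal = "日本語"  # internal representation (Unicode)

# The same internal text, written for a UNIX user and a Windows user.
euc = write_external(internal, "euc_jp")
sjis = write_external(internal, "shift_jis")

# The external byte sequences differ, but both convert back to the
# same internal string.
print(euc != sjis)
print(read_external(euc, "euc_jp") == internal)
print(read_external(sjis, "shift_jis") == internal)
```

This only shows the conversion step itself; a real program would also
have to decide which codeset each input or output channel uses.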
There is a codeset that can express many languages at the same
time: the ISO 2022-* series. It preserves compatibility with the
character sets (ASCII, Japanese, Korean, ...) that it includes.
However, ISO 2022-* is a STATEFUL codeset, and implementing it
can be complex.
Ideal multilingual software should be able to choose among
Unicode, ISO 2022-*, and other local codesets, as Mule and (X)Emacs do.
As I said several days ago on this mailing list, I am writing a
document on I18N. I have already released drafts twice on the Debian
JP mailing list (though the drafts still have empty chapters), and
discussion is now under way.
---
Tomohiro KUBOTA <kubota@debian.or.jp>