Re: questions on webwml/english/templete/debian/cdimage.wml
At Sun, 13 Jan 2002 14:49:15 +0100 (CET),
peter karlsson wrote:
>
> Tomohiro KUBOTA:
>
> > Because the transliteration algorithm is not very good.
>
> I know.
>
> > And, many people in the world have to use a small subset of software
> > only because such software supports their native languages.
>
> We're talking about the web pages here, the only software that needs
> Unicode support here is the browsers, and most of them do have it (to
> varying degrees).
>
> > Oh, very good. Please note that east Asian will need not only display
> > support but also input support, i.e., XIM support.
>
> Yes, I'm very aware of that as well (although my direct experience with
> IMs is limited). I have worked with the Unicode-adaption of our browser
> for over a year.
>
> > (note there is a rival; ISO-2022 is a multilingual encoding scheme
> > with much longer history).
>
> Yeah, and it's a mess, to be honest. This kind of "state-driven" (for
> lack of a better word) encodings where you cannot easily sync (as you
> can with UTF-8) is not something I like (the same goes for HZ, which is
> just a "simplified" form of ISO-2022).
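The synchronization point quoted above can be illustrated with a small Python sketch. It assumes Python's built-in `utf-8` and `iso2022_jp` codecs; the sample strings and the helper name `resync_utf8` are made up for this illustration and come from nowhere in the thread.

```python
def resync_utf8(data: bytes, offset: int) -> str:
    """Decode UTF-8 starting at an arbitrary byte offset.

    Continuation bytes always have the bit pattern 10xxxxxx, so a reader
    that lands mid-character only has to skip them to reach the next
    character boundary.
    """
    while offset < len(data) and (data[offset] & 0xC0) == 0x80:
        offset += 1
    return data[offset:].decode("utf-8")


utf8_bytes = "日本語 text".encode("utf-8")
# Offset 1 falls in the middle of the first character.
resynced = resync_utf8(utf8_bytes, 1)

jis_bytes = "日本語".encode("iso2022_jp")
# Drop the 3-byte escape sequence (ESC $ B) that switches to JIS X 0208.
# Without it the decoder cannot know that the following bytes are
# two-byte characters, so it reads them in its initial (ASCII) state.
misread = jis_bytes[3:].decode("iso2022_jp")

print(repr(resynced))
print(repr(misread))
```

In the UTF-8 case only the character that was cut is lost. In the ISO-2022-JP case the bytes decode without any error, but as the wrong characters, because the state set by the escape sequence is missing.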
Note that browsers cannot be free from "state" even if they use Unicode.
For example, rendering Unicode's unified CJK Han ideographs (which are
logically the same character from a certain point of view, but a large
part of which have significantly different glyphs) needs the "state" of
"language". Thus, though it is true that ISO-2022 is very complex, please
note that Unicode is not so simple either. If Unicode were simpler than
human natural languages, that would mean Unicode has defects.
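A minimal Python sketch of the "language state" point, under some assumptions: U+76F4 is picked here only as an often-cited example of a unified ideograph whose usual Japanese and Chinese glyphs differ, and the helper `tag_language` is hypothetical. The HTML `lang` attribute is one way to carry the language alongside the text.

```python
import unicodedata
from html import escape


def tag_language(text: str, lang: str) -> str:
    """Wrap text in an HTML span that carries the language as extra state."""
    return f'<span lang="{escape(lang, quote=True)}">{escape(text)}</span>'


# Assumed example: a unified ideograph often cited as looking different
# in Japanese and Chinese typography.
han = "\u76f4"

# The encoded character is identical whatever the language of the text ...
print(f"U+{ord(han):04X}", unicodedata.name(han))

# ... so the language has to travel with the text for a renderer to
# choose the expected glyph.
as_japanese = tag_language(han, "ja")
as_chinese = tag_language(han, "zh-CN")
print(as_japanese)
print(as_chinese)
```

The two spans contain the same code point and differ only in the tag, which is the state the renderer has to keep track of.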
> > I am also wrestling with the problem that Unicode doesn't have a
> > reliable mapping table from/to Japanese legacy encodings.
>
> That's because of some poor design of the legacy encodings, not
> Unicode, with multiple mappings of some characters.
Not at all. Before the appearance of Unicode, these encodings were
identical, except for a small number of private additional characters.
For example, Shift_JIS and CP932 are identical if we don't think about
conversion to/from Unicode. Most Japanese people don't even know the name
"CP932" and think they are using Shift_JIS. What they think
is correct. However, when Unicode came, it stated "what you are
using with Windows is CP932, not Shift_JIS." Unicode is the origin
of this confusion, having introduced many legacy encodings into Japan.
(I am referring to the chapter "Conversion tables differ between
venders" in my document
http://www.debian.or.jp/~kubota/unicode-symbols.html .)
---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N" http://www.debian.org/doc/manuals/intro-i18n/