
Re: Asian Problems with Unicode

i found some discussion about problems w/ unicode (among other things)
-- it's in japanese though.  anyone know of a translation?


i also found that there are some people working on 'utf-2000' (don't
expect to find it described anywhere) -- this might appear in emacs at
some point.  this has an interleaved translation into english from


btw, thank you, Thomas Chan <thomas@atlas.datexx.com>, for providing
those examples.  (as a side note, the 6th example is a good example of
a poor choice for unification -- the zh_cn rendering(?) of 'bone'
probably wouldn't go over very well in japan.  that's my opinion, and
also the opinion of a number of japanese folks i've talked to [though
of course, that still leaves it at an opinion level].  i wonder who
was involved in the unification process...)

the following involves speculation, read at your own risk :-)

given that ucs2 doesn't provide backward compatibility with many
existing asian standards (so, as was mentioned, sorting is also
unpleasant), and that display (and consequently printing) of a
multilingual asian document will be a problem, one partial way out
seems to be to take the existing iso-2022-* characters and map them
into the remaining space of ucs4 in a way that keeps conversion
backward compatible.
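
to make the idea concrete, here is a minimal python sketch.  the base
offsets are made up purely for illustration (nothing like this is
standardized, and the names HYPOTHETICAL_BASE, to_ucs4 and from_ucs4
are mine).  the only facts it leans on are that jis x 0208, gb 2312
and ks c 5601 are each 94x94 row/cell sets.  keeping each set in its
own block, in its original row/cell order, is what would make
conversion round-trip and keep the legacy sort order.

```python
# hypothetical base code points, one block per legacy charset.
# these values are assumptions for illustration only.
HYPOTHETICAL_BASE = {
    "jis-x-0208": 0x00200000,
    "gb-2312": 0x00210000,
    "ks-c-5601": 0x00220000,
}

ROWS = CELLS = 94  # each of these charsets is a 94x94 grid


def to_ucs4(charset, row, cell):
    """map a 1-based (row, cell) position to a code point in the charset's block."""
    if not (1 <= row <= ROWS and 1 <= cell <= CELLS):
        raise ValueError("row and cell must be in 1..94")
    return HYPOTHETICAL_BASE[charset] + (row - 1) * CELLS + (cell - 1)


def from_ucs4(code_point):
    """invert to_ucs4: recover (charset, row, cell) from a mapped code point."""
    for charset, base in HYPOTHETICAL_BASE.items():
        offset = code_point - base
        if 0 <= offset < ROWS * CELLS:
            return charset, offset // CELLS + 1, offset % CELLS + 1
    raise ValueError("code point is outside the hypothetical blocks")
```

since each block is contiguous and ordered by row and then cell,
comparing mapped code points gives the same order as comparing the
original row/cell pairs, and from_ucs4(to_ucs4(...)) returns what you
started with.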

depending on where the characters are mapped to, the number of bytes
it takes to represent a kanji under this scheme in utf8 may not be
that different from the ucs2-mapped-to-utf8 representation.  also, i
can see the kanji portion of ucs2 not being used for certain types of
applications -- in particular, some multilingual applications where it
might be necessary to express and tell apart kanji from various
different locales.
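
a quick way to check the byte counts: utf8 as originally defined for
the full 31-bit ucs4 space uses 1 to 6 bytes, with the length decided
only by which range the code point falls in.  a minimal python sketch
(utf8_length is my own helper name, not a library call):

```python
def utf8_length(code_point):
    """number of bytes utf8 needs for a code point, per the original
    1-to-6 byte definition covering the 31-bit ucs4 space."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    # exclusive upper bounds for 1, 2, 3, 4, 5 and 6 byte sequences
    limits = [0x80, 0x800, 0x10000, 0x200000, 0x4000000, 0x80000000]
    for length, limit in enumerate(limits, start=1):
        if code_point < limit:
            return length
    raise ValueError("code point does not fit in 31 bits")


# U+9AA8 is the unified 'bone' ideograph in the ucs2 range
print(utf8_length(0x9AA8))      # 3
print(utf8_length(0x00010000))  # 4
print(utf8_length(0x00200000))  # 5
```

so a kanji in the ucs2 range costs 3 bytes, one mapped anywhere from
0x10000 up to 0x1FFFFF costs 4, and one mapped from 0x200000 up to
0x3FFFFFF costs 5.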

i'll just shut up now, and go have a look at:

