
Re: UTF-8 locales



On Sat, Nov 18, 2000 at 08:01:11PM -0600, David Starner wrote:
> Which includes the Chinese and Japanese, who need the characters found
> in the Supplementary Ideographic Planes, which means 4-byte characters.

AFAIK UTF-8 is not able to encode 32-bit Unicode? I thought this is because
the "living" languages are all restricted to 16 bits? Hmm... I might be wrong.
Does that mean Java does not support Asian languages with its 16-bit Unicode?

<blockquote cite="http://www.unicode.org/unicode/standard/principles.html">

  While 65,000 characters are sufficient for encoding most of the many
  thousands of characters used in major languages of the world, the Unicode
  standard and ISO 10646 provide an extension mechanism called UTF-16 that
  allows for encoding as many as a million more characters, without use of
  complex modes or escape codes.  This is sufficient for all known character
  encoding requirements, including full coverage of all historic scripts of
  the world.

</blockquote>
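
For what it's worth, the extension mechanism in that quote can be sketched
in a few lines. This is only an illustration, not something taken from the
Unicode pages: the helper name to_surrogates is made up here, and U+20000 is
picked simply as a code point in Plane 2 (the Supplementary Ideographic
Plane). It shows how such a code point becomes two 16-bit units in UTF-16 and
four bytes in UTF-8.

```python
# Hypothetical helper (name invented for this sketch): split a code point
# above U+FFFF into a UTF-16 surrogate pair.
def to_surrogates(code_point):
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points above U+FFFF need surrogates")
    offset = code_point - 0x10000          # leaves a 20-bit value
    high = 0xD800 + (offset >> 10)         # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)        # bottom 10 bits
    return high, low


# U+20000 is chosen only as an example code point in Plane 2.
code_point = 0x20000
pair = to_surrogates(code_point)

# Cross-check against Python's own codecs.
utf16 = chr(code_point).encode("utf-16-be")
units = tuple(int.from_bytes(utf16[i:i + 2], "big")
              for i in range(0, len(utf16), 2))
utf8 = chr(code_point).encode("utf-8")

print([hex(u) for u in pair])   # two 16-bit code units
print(len(utf8), utf8.hex())    # number of UTF-8 bytes and their hex form
```

If this is right, a character outside the 16-bit range costs two code units
in UTF-16 and four bytes in UTF-8.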

As I understand it, all living languages are contained in the "not-extended"
16-bit set. No?

Greetings
Bernd
-- 
  (OO)      -- Bernd_Eckenfels@Wendelinusstrasse39.76646Bruchsal.de --
 ( .. )  ecki@{inka.de,linux.de,debian.org} http://home.pages.de/~eckes/
  o--o     *plush*  2048/93600EFD  eckes@irc  +497257930613  BE5-RIPE
(O____O)  When cryptography is outlawed, bayl bhgynjf jvyy unir cevinpl!


