Re: UTF-8 locales



On Sun, Nov 19, 2000 at 10:50:54PM +0100, Bernd Eckenfels wrote:
> On Sat, Nov 18, 2000 at 08:01:11PM -0600, David Starner wrote:
> > Which includes the Chinese and Japanese, who need the characters found
> > in the Supplementary Ideographic Planes, which means 4 byte characters.
> 
> Afaik UTF8 is not able to encode 32bit unicode? 

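No, UTF-8 can encode it. As originally specified, UTF-8 covers values up to
31 bits, using as many as 6 bytes. UTF-16 can only reach U+10FFFF, which is
about 20.1 bits (21 bits in practice), so Unicode got chopped off there, but
everyone rounds that up to 4 bytes.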
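
To make the arithmetic concrete, here is a rough Python sketch. It assumes
the original 31-bit UTF-8 layout; the names utf8_length, MAX_CODE_POINT and
bits_needed are just made up for illustration, not taken from any library.

```python
import math

# Highest code point reachable through UTF-16 surrogate pairs.
MAX_CODE_POINT = 0x10FFFF

# Largest value that fits in 1, 2, ... 6 bytes under the original
# 31-bit UTF-8 layout (assumption: the pre-restriction definition).
_UTF8_LIMITS = [0x7F, 0x7FF, 0xFFFF, 0x1FFFFF, 0x3FFFFFF, 0x7FFFFFFF]

def utf8_length(code_point: int) -> int:
    """Return how many bytes the value needs in 31-bit UTF-8."""
    if code_point < 0:
        raise ValueError("code point must be non-negative")
    for length, limit in enumerate(_UTF8_LIMITS, start=1):
        if code_point <= limit:
            return length
    raise ValueError("value does not fit in 31 bits")

# Number of bits of information in the UTF-16 reachable range.
bits_needed = math.log2(MAX_CODE_POINT + 1)
```

With this, utf8_length(MAX_CODE_POINT) comes out as 4, so anything UTF-16
can reach fits in 4 bytes of UTF-8, and bits_needed is a little over 20.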

> I thought this is because
> the "living" languages are all restricted to 16bit? Hmm... i might be wrong.
> Does that mean Java does not support asian languages with its 16bit Unicode?

The major 'living' languages are in the Basic Multilingual Plane (BMP), which
is 16-bit Unicode. Japanese and Chinese are supported by characters in the BMP
as well as they are by any pre-1995 CJK standard, but the standards bodies are
in the process of standardizing ~50,000 more ideographs for Chinese and
Japanese outside the BMP. They're mostly very rare characters, but they're
really necessary for full support of CJK in Unicode. Java doesn't currently
support them, but plans to when they actually get added to the standard.
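
As an illustration of what "outside the BMP" means for encodings, here is a
small Python sketch. It assumes U+20000, the first code point of the
Supplementary Ideographic Plane, as a sample value; to_surrogates is a
made-up helper name, not a library function.

```python
def to_surrogates(code_point: int) -> tuple[int, int]:
    """Split a code point above U+FFFF into a UTF-16 surrogate pair."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("only code points outside the BMP need surrogates")
    offset = code_point - 0x10000      # 20 bits remain after the offset
    high = 0xD800 + (offset >> 10)     # top 10 bits
    low = 0xDC00 + (offset & 0x3FF)    # bottom 10 bits
    return high, low

# Sample value (assumption): first code point of plane 2.
sample = chr(0x20000)
utf8 = sample.encode("utf-8")          # 4 bytes
utf16 = sample.encode("utf-16-be")     # two 16-bit units
pair = to_surrogates(0x20000)
```

Here utf8 is 4 bytes long and utf16 holds two 16-bit units, which is why a
single 16-bit character type cannot hold such a character on its own.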

-- 
David Starner - dstarner98@aasaa.ofe.org
http://dvdeug.dhis.org
Looking for a Debian developer in the Stillwater, Oklahoma area 
to sign my GPG key
