Re: moving to unicode



On Mon, Feb 06, 2006 at 03:01:41PM +0100, ???ek Kry?tof wrote:
> I just second this. Only IMO the UCS2 (fixed two bytes per character) would be much more appropriate to a modern UNICODE system. The variable length (2 to 3 bytes ) UTF-8 encoding can marginally save some space (depending on language) but introduces nasty overhead to character handling - even the most trivial string functions have to check for character boundaries (e.g. even detecting the string length itself is not a trivial operation in UTF-8 !!! or having a fixed length buffer you can never tell in advance how many characters will fit into it - it depends on the language again).
> 
> Windows used to have multibyte characters in the past (Win95,98) but luckily managed to get rid of this with Windows NT and higher and now both the kernel and userspace are UCS2. Why should Linux again enter the blind alley of Windows 95?
> 
> Cheers
> Krystof

Have you looked at Unicode lately?  It isn't a sixteen-bit code 
anymore. (Was it ever?)  It doesn't fit in two bytes.  If you chop it 
to two, you miss the vast majority of traditional Chinese characters, as 
well as (I believe) character sets such as Tolkien's Elvish.
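
To make the point concrete, here is a small Python sketch. The character 
U+20000 is my own pick for illustration (a CJK ideograph from Extension B); 
nothing in the thread names it, and the variable names are made up. It only 
shows that such a code point does not fit in one 16-bit unit, so a strict 
two-byte UCS2 cannot hold it, while UTF-16 needs a surrogate pair and UTF-8 
needs four bytes.

```python
# Illustrative assumption: U+20000 is a CJK ideograph from Extension B,
# which lies outside the 16-bit Basic Multilingual Plane.
ch = "\U00020000"

code_point = ord(ch)
fits_in_16_bits = code_point <= 0xFFFF

# Number of bytes this character takes in UTF-8.
utf8_len = len(ch.encode("utf-8"))

# Number of 16-bit code units in UTF-16 (big-endian variant, so no BOM).
utf16_units = len(ch.encode("utf-16-be")) // 2

print(hex(code_point), fits_in_16_bits, utf8_len, utf16_units)
```

So a fixed two-bytes-per-character scheme either drops such characters or 
stops being fixed-width, which is the boundary-checking problem all over 
again.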

-- hendrik


