RE: moving to unicode

To: <debian-user@lists.debian.org>
Subject: RE: moving to unicode
From: Žáček Kryštof <Krystof.Zacek@i.cz>
Date: Mon, 6 Feb 2006 16:15:26 +0100
Message-id: <DA430F01FCE3E14EB8DE9D02C35C7FAA01B57A@scbu01.cb.i.cz>

I am afraid the problem is still there. It is only hidden behind the functions and wrappers, and it still makes UTF-8 perform poorly compared to UCS2. The libc/Qt/whatever code still has to do the nasty things I mentioned previously behind the scenes, which leaves UTF-8 inferior in performance, clarity and design.
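
To make the point concrete, here is a minimal Python sketch of the kind of work a wrapper has to do. It is only an illustration under my own assumptions, not code taken from libc or Qt, and the names utf8_length and ucs2_length are made up for this example. Counting characters in UTF-8 means scanning every byte and skipping continuation bytes, while a fixed two-byte encoding needs only a division.

```python
def utf8_length(data: bytes) -> int:
    """Count code points in UTF-8 data by skipping continuation bytes.

    Continuation bytes have the bit pattern 10xxxxxx, so every byte
    that does not match it starts a new character. The whole buffer
    has to be scanned. Assumes the input is valid UTF-8.
    """
    return sum(1 for b in data if (b & 0xC0) != 0x80)


def ucs2_length(data: bytes) -> int:
    """With a fixed two bytes per character the count is a division."""
    return len(data) // 2


text = "Žáček"
utf8 = text.encode("utf-8")
# For characters in the Basic Multilingual Plane, UTF-16 without a
# byte-order mark is byte-identical to UCS2.
ucs2 = text.encode("utf-16-le")

utf8_bytes = len(utf8)           # storage size, not the character count
utf8_chars = utf8_length(utf8)   # found only by walking the bytes
ucs2_chars = ucs2_length(ucs2)   # found without looking at the data
```

The same five characters take a different number of bytes in each encoding, which is also why the capacity of a fixed-size UTF-8 buffer depends on the language of the text.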

There is just one big reason for UTF-8: ASCII compatibility. The rest is all disadvantages. Anyway, UTF-8 is better than nothing (ISO-8859-x), and the ASCII Internet/world strongly resists changing things drastically. I would not dare to call UTF-8 a modern system. It is a trade-off between having Unicode and the old ASCII at the same time, and it is paid for by implementation ugliness and complexity.
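
The ASCII compatibility mentioned above can be checked with a short Python sketch. This is only an illustration, and the sample string is an arbitrary assumption. Pure ASCII text is byte-for-byte identical when encoded as UTF-8, while a two-byte encoding inserts NUL bytes that byte-oriented tools do not expect.

```python
ascii_text = "PATH=/usr/bin"  # arbitrary sample string, ASCII only

as_ascii = ascii_text.encode("ascii")
as_utf8 = ascii_text.encode("utf-8")
# UTF-16 little-endian without a byte-order mark, which matches UCS2
# for these characters.
as_ucs2 = ascii_text.encode("utf-16-le")

# UTF-8 leaves ASCII data untouched, so old files and protocols keep working.
same_as_ascii = as_utf8 == as_ascii

# The two-byte form doubles the size and contains NUL bytes, which
# C-style string functions would treat as end of string.
has_nul_bytes = b"\x00" in as_ucs2
size_ratio = len(as_ucs2) / len(as_ascii)
```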


> -----Original Message-----
> From: ROBERTOJIMENOCA@terra.es [mailto:ROBERTOJIMENOCA@terra.es] 
> Sent: Monday, February 06, 2006 3:57 PM
> To: Žáček Kryštof; debian-user@lists.debian.org
> Subject: Re: moving to unicode
> 
> Žáček Kryštof wrote:
> > I just second this. Only IMO UCS2 (fixed two bytes per character)
> > would be much more appropriate for a modern Unicode system. The
> > variable-length (1 to 4 bytes) UTF-8 encoding can marginally save
> > some space (depending on language) but introduces nasty overhead
> > to character handling. Even the most trivial string functions have
> > to check for character boundaries: detecting the string length
> > itself is not a trivial operation in UTF-8, and with a fixed-length
> > buffer you can never tell in advance how many characters will fit
> > into it (it depends on the language again).
> > 
> > Windows used to have multibyte characters in the past (Win95, 98)
> > but luckily managed to get rid of them with Windows NT and higher,
> > and now both the kernel and userspace are UCS2. Why should Linux
> > again enter the blind alley of Windows 95?
> 
> You should check some current sources.
> UTF-8 is the de facto modern encoding standard.
> Current software libraries such as libc, glib, and Qt provide lots of
> functions that encapsulate all the string handling, so the problems
> you mention don't exist.
> 
> > > -----Original Message-----
> > > From: ROBERTOJIMENOCA@terra.es [mailto:ROBERTOJIMENOCA@terra.es]
> > > Sent: Monday, February 06, 2006 2:40 PM
> > > To: debian-user@lists.debian.org
> > > Subject: Re: moving to unicode
> > > 
> > > Adam James wrote:
> > > > On Mon, 2006-01-23 at 16:04 +0100, Lubos Vrbka wrote:
> > > > > is there any up-to-date document how to move a debian system
> > > > > to utf8 (both console and X)? i found some info on web,
> > > > > however it seems to be quite old (~4 years)... a pointer to a
> > > > > list of what doesn't work with utf8 would be really nice, too...
> > > > 
> > > > I found the following documents helpful with regard to UTF-8:
> > > > 
> > > > http://gentoo-wiki.com/HOWTO_Make_your_system_use_unicode/utf-8
> > > > 
> > > > 
> > > > http://hektor.umcs.lublin.pl/~mikosmul/computing/articles/linux-unicode.html
> > > 
> > > What is the current progress towards moving Debian fully to UTF-8
> > > at installation, as far as possible, to make things easier for
> > > users working in Debian?
