Re: default character encoding for everything in debian

Bernd Eckenfels wrote on Tue 11 Aug 2009 21:40:35 +0200:
> In article <20090811183800.GE5487@const.famille.thibault.fr> you wrote:
> > Not necessarily.  Any sane implementation should just use wchar_t
> Which could be UTF-16 and therefore still has complicated length semantics.


wchar_t may be 32 or 16 bits wide (in the 16-bit case it can't express
Unicode beyond U+FFFF), but either way it's still meant to have the
simple length semantics: one wchar_t per character.
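
A minimal sketch of what I mean, in C. The helper names (wchar_bits,
wchar_covers_unicode, char_count) are made up for illustration; only
wcslen(), WCHAR_MAX and CHAR_BIT come from the standard library. It
just shows that the length is a plain element count whatever the width
of wchar_t happens to be on the platform.

```c
#include <limits.h>
#include <stddef.h>
#include <wchar.h>

/* Width of wchar_t in bits on this platform (hypothetical helper). */
int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}

/* Nonzero if one wchar_t can hold every code point up to U+10FFFF.
 * With a 16-bit wchar_t this is 0: nothing beyond U+FFFF fits. */
int wchar_covers_unicode(void)
{
    return WCHAR_MAX >= 0x10FFFF;
}

/* Length in characters, i.e. in wchar_t elements; this is just wcslen().
 * No byte-sequence decoding is involved, unlike with a UTF-8 char string. */
size_t char_count(const wchar_t *s)
{
    return wcslen(s);
}
```

For instance char_count(L"na\u00efve") is 5, whereas the same word
encoded in UTF-8 takes 6 bytes.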

> And even with UTF32 there are combining characters.

Each of which counts as one character. Then there is the problem of
rendering width of course, but as I said it's there anyway as soon as
you have a font with varying letter widths; string manipulation itself
doesn't pose any problem.

> But the length could be defined in code units - it's just a question
> how useful it is.

Of course.  It's rarely useful to take into account character width
yourself, unless you are rendering on a tty, but then speed usually
doesn't matter and you can afford calling wcswidth() on your string
as late as possible.
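
A rough sketch of that, in C. The helper names (display_columns,
padding_for) are hypothetical; wcswidth() and wcslen() are the real
calls. I'm assuming the program has called setlocale(LC_ALL, "") once
at startup, since the widths reported for non-ASCII characters depend
on the LC_CTYPE locale. All string handling stays in terms of wcslen(),
and the width is only looked up at the point where the string gets
padded for output.

```c
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700   /* asks glibc to declare wcswidth() */
#endif
#include <stddef.h>
#include <wchar.h>

/*
 * Number of terminal columns needed for s, or -1 if s contains a
 * non-printable wide character (that is what wcswidth() reports).
 * Hypothetical helper; the result depends on the current locale.
 */
int display_columns(const wchar_t *s)
{
    return wcswidth(s, wcslen(s));
}

/*
 * Number of spaces to append so that s fills a field of `field` columns.
 * Returns 0 if s is already too wide or its width cannot be determined.
 */
size_t padding_for(const wchar_t *s, size_t field)
{
    int w = display_columns(s);

    if (w < 0 || (size_t)w >= field)
        return 0;
    return field - (size_t)w;
}
```

With L"e\u0301" (e followed by a combining acute accent), wcslen()
gives 2, while under a UTF-8 locale wcswidth() should typically report
a single column. The two numbers answer different questions, and only
the tty output path needs the second one.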


