Re: default character encoding for everything in debian
Bernd Eckenfels, on Tue 11 Aug 2009 21:40:35 +0200, wrote:
> In article <20090811183800.GE5487@const.famille.thibault.fr> you wrote:
> > Not necessarily. Any sane implementation should just use wchar_t
>
> Which could be UTF-16 and therefore still has complicated length semantics.
??
wchar_t may be 32 or 16 bits wide (in the 16-bit case it can't express
Unicode beyond U+FFFF), but it's still meant to have simple length
semantics: one wchar_t per character.
> And even with UTF32 there are combining characters.
Which each count as one character. Rendering width is a separate
problem, of course, but as I said, it's there anyway as soon as you have
a font with varying letter widths; string manipulation doesn't pose any
problem.
> But the length could be defined in code units - it's just a question
> how useful it is.
Of course. It's rarely useful to take into account character width
yourself, unless you are rendering on a tty, but then speed usually
doesn't matter and you can afford calling wcswidth() on your string
as late as possible.
Samuel