Re: default character encoding for everything in debian
On Wed, Aug 12, 2009 at 07:54:33AM +0200, Giacomo A. Catenazzi wrote:
> Samuel Thibault wrote:
> > Gunnar Wolf, le Tue 11 Aug 2009 13:28:08 -0500, a écrit :
> >> while length(str) in any language up to the 1990s was a mere
> >> subtraction, now we must go through the string checking each byte to
> >> see if it is a Unicode marker and subtract the appropriate number of
> >> bytes.
> >
> > Not necessarily. Any sane implementation should just use wchar_t, and
> > length becomes a plain subtraction again.
>
> An implementation that uses wchar_t is usually not sane, and usually
> it is (also) buggy. It is very difficult (AFAIK not impossible,
> but I'm not so sure) to write portable programs (the POSIX way, so
> with changing locales) using wchar_t.

Do you have any concrete examples to back up these assertions?
The wchar_t functions worked perfectly well for me last time I
checked. There were bugs in the distant past, but I don't see any
issues with current GCC/libc.

BTW, since POSIX/SUS are a superset of the standard C library, they
contain all of the same wide character handling functionality. I'm
not sure what you're getting at with "changing locales"; SUS locale
functionality like setlocale() comes directly from C with no changes.
Regards,
Roger
--
.''`. Roger Leigh
: :' : Debian GNU/Linux http://people.debian.org/~rleigh/
`. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
`- GPG Public Key: 0x25BFB848 Please GPG sign your mail.