
Re: default character encoding for everything in debian

On Wed, Aug 12, 2009 at 07:54:33AM +0200, Giacomo A. Catenazzi wrote:
> Samuel Thibault wrote:
> > Gunnar Wolf, le Tue 11 Aug 2009 13:28:08 -0500, a écrit :
> >> while length(str) in any language up to the 1990s was a mere
> >> subtraction, now we must go through the string checking each byte to
> >> see if it is a Unicode marker and subtract the appropriate number of
> >> bytes.
> > 
> > Not necessarily.  Any sane implementation should just use wchar_t and
> > subtraction works again.
> An implementation that uses wchar_t is usually not sane, but usually
> it is (also) buggy. It is very difficult (AFAIK not impossible,
> but I'm not so sure) to write portable (POSIX way, so with changing
> locales) programs using wchar_t.

Do you have any concrete examples to back up these assertions?

They worked perfectly well for me last time I checked.  There were
bugs in the distant past, but I don't see any issues with current
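
To make the length question from the quoted text concrete, here is a
minimal sketch in C.  It is not taken from any code discussed in this
thread; the helper names utf8_length() and wide_length() are made up
for illustration, and the first one assumes well-formed UTF-8 input.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: count the code points in a well-formed UTF-8
 * string.  Continuation bytes have the bit pattern 10xxxxxx, so every
 * byte that does NOT match that pattern starts a new code point. */
size_t utf8_length(const char *s)
{
    size_t n = 0;
    for (; *s != '\0'; ++s) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++n;
    }
    return n;
}

/* Hypothetical helper: with wchar_t, each array element holds one
 * character (assuming a wchar_t wide enough for the characters in use,
 * e.g. the 32-bit wchar_t of glibc), so the element count is the
 * character count and no per-byte inspection is needed. */
size_t wide_length(const wchar_t *ws)
{
    return wcslen(ws);
}
```

For the word "écrit", strlen() on the UTF-8 bytes "\xc3\xa9" "crit"
reports 6, while utf8_length() on the same bytes and wide_length() on
L"\u00e9crit" should both report 5.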

BTW, since POSIX/SUS are a superset of the standard C library, they
contain all of the same wide character handling functionality.  I'm
not sure what you're getting at with the "changing locales"; SUS
locale functionality like setlocale() comes directly from C with no
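
As a rough illustration of how the locale and wide character pieces
fit together, the sketch below uses only standard C calls
(mbsrtowcs() and, in the usage note, setlocale()).  The helper name
mb_char_count() is hypothetical, and the result depends on whichever
LC_CTYPE locale is in effect when it is called.

```c
#include <stddef.h>
#include <string.h>
#include <locale.h>
#include <wchar.h>

/* Hypothetical helper: return how many wide characters the multibyte
 * string `s` decodes to under the current LC_CTYPE locale, or
 * (size_t)-1 if `s` is not valid in that locale's encoding. */
size_t mb_char_count(const char *s)
{
    mbstate_t state;
    memset(&state, 0, sizeof state);   /* start in the initial shift state */

    /* With a null destination, mbsrtowcs() stores nothing and only
     * returns the number of wide characters required. */
    return mbsrtowcs(NULL, &s, 0, &state);
}
```

A program would typically call setlocale(LC_ALL, "") once at startup
so that the user's locale is picked up, and then call mb_char_count()
on its input; changing the locale later changes how the same bytes are
interpreted.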


  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.
