
Re: default character encoding for everything in debian

Samuel Thibault wrote:
> Gunnar Wolf, le Tue 11 Aug 2009 13:28:08 -0500, a écrit :
>> while length(str) in any language up to the 1990s was a mere
>> subtraction, now we must go through the string checking each byte to
>> see if it is a Unicode marker and subtract the appropriate number of
>> bytes.
> Not necessarily.  Any sane implementation should just use wchar_t and
> subtraction comes back.

An implementation that uses wchar_t is usually not sane, and it is
usually buggy as well. It is very difficult (AFAIK not impossible,
but I'm not so sure) to write programs with wchar_t that are portable
in the POSIX sense, that is, with changing locales.

The only way I know to use wchar_t sanely is to stick to what the plain
C standard requires: a single runtime environment and a single locale.
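
To make the two positions in this thread concrete, here is a minimal
sketch in C, not taken from any Debian package. The function names
utf8_length and locale_length are hypothetical. The first one assumes
the input is valid UTF-8 and scans every byte, as Gunnar describes. The
second one uses the standard mbsrtowcs() with a NULL destination, so
its answer depends on whichever LC_CTYPE locale is active at the time
of the call, which is why it only behaves predictably when the program
keeps a single locale.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Count code points in a UTF-8 string by skipping continuation bytes
 * (those of the form 10xxxxxx). Assumption: the input is valid UTF-8. */
size_t utf8_length(const char *s)
{
    assert(s != NULL);
    size_t n = 0;
    for (; *s != '\0'; s++) {
        if (((unsigned char)*s & 0xC0) != 0x80)
            n++;
    }
    return n;
}

/* Count characters as decoded by the current LC_CTYPE locale.
 * Returns (size_t)-1 if the bytes are not valid in that locale. */
size_t locale_length(const char *s)
{
    assert(s != NULL);
    mbstate_t state;
    memset(&state, 0, sizeof state);   /* start in the initial shift state */
    const char *p = s;
    /* With a NULL destination, mbsrtowcs only counts the wide characters. */
    return mbsrtowcs(NULL, &p, 0, &state);
}
```

For the UTF-8 bytes of "a écrit", strlen() reports 8 while utf8_length()
reports 7. What locale_length() reports for the same bytes depends on the
locale selected with setlocale(), and it can change if the locale is
switched between calls.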

PS: note that the binary encoding depends on the compiler environment
(but this information is not exported).

