Re: default character encoding for everything in debian

To: debian-devel@lists.debian.org
Subject: Re: default character encoding for everything in debian
From: Samuel Thibault <sthibault@debian.org>
Date: Wed, 12 Aug 2009 09:56:49 +0200
Message-id: <20090812075649.GQ5487@const.famille.thibault.fr>
Mail-followup-to: debian-devel@lists.debian.org
In-reply-to: <4A825B32.2000009@debian.org>
References: <20090811183800.GE5487@const.famille.thibault.fr> <200908111940.n7BJeZQO067901@neskaya.eckenfels.net> <20090811202423.GA31394@wavehammer.waldi.eu.org> <4A825B32.2000009@debian.org>

Giacomo A. Catenazzi wrote on Wed 12 Aug 2009 08:03:30 +0200:
> Bastian Blank wrote:
> > On Tue, Aug 11, 2009 at 09:40:35PM +0200, Bernd Eckenfels wrote:
> >> In article <20090811183800.GE5487@const.famille.thibault.fr> you wrote:
> >>> Not necessarily.  Any sane implementation should just use wchar_t
> >> Which could be UTF16 and therefore still has complicated length semantics.
> > 
> > No, wchar_t is UCS-4 (or UCS-2 in esoteric implementations like
> > Windows).
> 
> No, wchar_t is locale dependent (per POSIX).

What do you mean?  The width and endianness of wchar_t have to be fixed
at compile time, since the compiler can't know the locale in advance.
The value stored for a given character might depend on the locale, yes,
but that's not a problem as long as you convert into UTF-8 before
communicating with other applications.

On sane systems (which Debian systems are), it's just always UCS-4.
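
To illustrate the "convert into UTF-8 before communicating" step, here
is a minimal sketch in C.  It assumes each wchar_t element holds exactly
one UCS-4 code point (true with glibc, not guaranteed by the C
standard).  The function names ucs4_to_utf8 and wcs_to_utf8 are made up
for this example; real code would more likely call wcstombs() or
iconv() under a UTF-8 locale.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <wchar.h>

/* Encode one code point as UTF-8 into out (room for 4 bytes needed).
 * Returns the number of bytes written (1 to 4), or 0 if cp is a
 * surrogate or lies beyond U+10FFFF. */
size_t ucs4_to_utf8(uint32_t cp, unsigned char *out)
{
    assert(out != NULL);
    if (cp < 0x80) {
        out[0] = (unsigned char)cp;
        return 1;
    }
    if (cp < 0x800) {
        out[0] = (unsigned char)(0xC0 | (cp >> 6));
        out[1] = (unsigned char)(0x80 | (cp & 0x3F));
        return 2;
    }
    if (cp >= 0xD800 && cp <= 0xDFFF)
        return 0;                       /* surrogates are not characters */
    if (cp < 0x10000) {
        out[0] = (unsigned char)(0xE0 | (cp >> 12));
        out[1] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[2] = (unsigned char)(0x80 | (cp & 0x3F));
        return 3;
    }
    if (cp <= 0x10FFFF) {
        out[0] = (unsigned char)(0xF0 | (cp >> 18));
        out[1] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
        out[2] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
        out[3] = (unsigned char)(0x80 | (cp & 0x3F));
        return 4;
    }
    return 0;
}

/* Convert a NUL-terminated wide string to NUL-terminated UTF-8.
 * ASSUMPTION: one wchar_t element == one UCS-4 code point.
 * Returns the UTF-8 length without the NUL, or (size_t)-1 if the
 * input is invalid or buf is too small. */
size_t wcs_to_utf8(const wchar_t *ws, char *buf, size_t bufsize)
{
    size_t n = 0;

    assert(ws != NULL && buf != NULL);
    for (; *ws != L'\0'; ws++) {
        unsigned char tmp[4];
        size_t k = ucs4_to_utf8((uint32_t)*ws, tmp);

        if (k == 0 || n + k >= bufsize)  /* keep one byte for the NUL */
            return (size_t)-1;
        memcpy(buf + n, tmp, k);
        n += k;
    }
    if (n >= bufsize)
        return (size_t)-1;
    buf[n] = '\0';
    return n;
}
```

Whatever the in-memory representation is, only the resulting UTF-8
bytes leave the process, so other applications never see the width or
endianness of wchar_t.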

> BTW on gcc:
> 
> -fwide-exec-charset=charset
>     Set the wide execution character set, used for wide string and
> character constants.

It hurts when I shoot myself in the foot.

> The default is UTF-32 or UTF-16, whichever corresponds to the width of
> wchar_t.

This documentation is bogus BTW.  It should read "UCS-4 or UCS-2".
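
The distinction matters for the "length semantics" point raised earlier
in the thread: UCS-2 and UCS-4 store one code point per element, whereas
UTF-16 may need two code units (a surrogate pair) for one code point.  A
rough sketch of the difference, with function names invented for this
example:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Count code points in a buffer of n UTF-16 code units.  A high
 * surrogate immediately followed by a low surrogate counts once; an
 * unpaired surrogate is counted as one (malformed) element. */
size_t utf16_code_points(const uint16_t *u, size_t n)
{
    size_t count = 0;

    assert(u != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (u[i] >= 0xD800 && u[i] <= 0xDBFF && i + 1 < n &&
            u[i + 1] >= 0xDC00 && u[i + 1] <= 0xDFFF)
            i++;                        /* skip the low surrogate */
        count++;
    }
    return count;
}

/* With one code point per element (UCS-4, or UCS-2 limited to the
 * BMP), the number of code points is just the number of elements. */
size_t ucs4_code_points(const uint32_t *u, size_t n)
{
    assert(u != NULL || n == 0);
    return n;
}
```

For "A" followed by U+1F600, the UTF-16 buffer has three code units but
two code points, while the UCS-4 buffer has two elements for the same
two code points.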

> Note that default encoding is UTF-8, thus giving a UTF-32 wchar_t
> in most developer machines.

I don't understand this sentence.

Samuel