Subject: Re: Bug#522776: debian-policy: mandate existence of a standardised UTF-8 locale
Giacomo A. Catenazzi writes:
> [Andrew McMillan probably]
> I think nobody should use "C" or "C.UTF-8" as user encoding.
> And I really hope that Debian will try to convince users to
> use a proper locale.
Debian doesn't ship a proper locale. I want sorting according
to the raw Unicode values. I want iswprint() to return non-zero
for a Cyrillic character, a Korean character, etc.
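To make those two requirements concrete, here is a minimal C sketch. The
helper names printable_in() and codepoint_order() are made up for
illustration. Whether a non-ASCII character counts as printable under a
given locale name depends on the C library and on which locales are
installed, so no particular result is claimed here for the "C" locale.

```c
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: returns 1 if wc is printable under the named
 * locale, 0 if it is not, and -1 if the locale cannot be selected. */
int printable_in(const char *locale_name, wchar_t wc)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return -1;
    return iswprint((wint_t)wc) != 0;
}

/* Hypothetical helper: orders two wide strings by their raw wchar_t
 * values, ignoring LC_COLLATE entirely. Where wchar_t holds a full
 * code point (as with glibc on Linux), this is raw Unicode order. */
int codepoint_order(const wchar_t *a, const wchar_t *b)
{
    return wcscmp(a, b);
}
```

For instance, printable_in("C", L'\u0416') asks whether CYRILLIC CAPITAL
LETTER ZHE is printable under the plain "C" locale, and
codepoint_order(L"Z", L"a") is negative because U+005A sorts before U+0061.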
Debian shouldn't be setting locale-related environment variables
unless the user specifically chooses. The implementation-specific
defaults, applied in the absence of any environment variables,
should support Unicode.
>> * All ISO8859 locales are moved to a new locales-legacy-encodings
>> package.
>
> This encoding is also used on CDs, floppies, remote filesystems,
> USB pens, on a lot of internet pages, etc.
Nope.
It's actually UTF-16 in VFAT, Joliet, CIFS, and so on. Linux has
mount options to control how that gets made POSIX-compatible.
You can choose UTF-8. (This should be Debian's default.)
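As a rough illustration of what choosing UTF-8 means for such names, here
is a sketch of the UTF-16 to UTF-8 conversion itself. This is not kernel
code; utf16_to_utf8() is a hypothetical helper written for this example,
and it assumes the input is already an array of native-endian UTF-16
code units.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: converts n UTF-16 code units to UTF-8.
 * Returns the number of bytes written, or (size_t)-1 on an unpaired
 * surrogate or when the output buffer is too small. */
size_t utf16_to_utf8(const uint16_t *in, size_t n,
                     unsigned char *out, size_t cap)
{
    size_t o = 0;
    for (size_t i = 0; i < n; i++) {
        uint32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF) {
            /* High surrogate: must be followed by a low surrogate. */
            if (i + 1 >= n || in[i + 1] < 0xDC00 || in[i + 1] > 0xDFFF)
                return (size_t)-1;
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            i++;
        } else if (cp >= 0xDC00 && cp <= 0xDFFF) {
            return (size_t)-1;  /* stray low surrogate */
        }

        size_t len = cp < 0x80 ? 1 : cp < 0x800 ? 2 : cp < 0x10000 ? 3 : 4;
        if (o + len > cap)
            return (size_t)-1;

        if (len == 1) {
            out[o++] = (unsigned char)cp;
        } else if (len == 2) {
            out[o++] = (unsigned char)(0xC0 | (cp >> 6));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else if (len == 3) {
            out[o++] = (unsigned char)(0xE0 | (cp >> 12));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        } else {
            out[o++] = (unsigned char)(0xF0 | (cp >> 18));
            out[o++] = (unsigned char)(0x80 | ((cp >> 12) & 0x3F));
            out[o++] = (unsigned char)(0x80 | ((cp >> 6) & 0x3F));
            out[o++] = (unsigned char)(0x80 | (cp & 0x3F));
        }
    }
    return o;
}
```

The single code unit 0x0416 comes out as the two bytes D0 96, which any
byte-transparent POSIX tool can then carry around unchanged.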
> But an ASCII7 "C" encoding allows you to do the same things. It doesn't
> forbid 8 bit characters (thus UTF-8). Unix is transparent on characters
> (i.e. binary and text are the same, you can grep binaries, ...).
>
> So scripts should use LANG=C in most cases.
That leaves iswprint() and towupper() broken. (not that it must)
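A script's helper program can check this for itself. The sketch below
wraps towupper() so that the case mapping can be queried under any locale
name; upper_in() is a hypothetical name, and what comes back for
non-ASCII input under "C" depends on the C library in use.

```c
#include <locale.h>
#include <wchar.h>
#include <wctype.h>

/* Hypothetical helper: uppercases wc under the named locale. Returns
 * wc unchanged if the locale cannot be selected, or if the locale has
 * no uppercase mapping for wc. */
wint_t upper_in(const char *locale_name, wint_t wc)
{
    if (setlocale(LC_CTYPE, locale_name) == NULL)
        return wc;
    return towupper(wc);
}
```

Comparing upper_in("C", L'\u0436') against L'\u0416' shows whether a given
system maps CYRILLIC SMALL LETTER ZHE to its capital under LANG=C.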