[please cc me, as indicated in headers]

Many Debian users living outside of the US use locales, i.e. they set the environment variable LANG, or LC_TIME, LC_NUMERIC, etc. This post is about the values one should set these variables to.

While the locale interface itself is well standardized, there seems to be no semblance of a standard for these values. SUSv2 calls them "implementation dependent". The current LSB draft does not mention locales. The glibc manual avoids naming names. The only concrete documentation I could find was the setlocale(3) manpage (see below).

What's the problem? Well, if you give a value not understood by the program you use, it is generally ignored. There are at least two prominent libraries (glibc and libX11) interpreting the value, and their sets of acceptable values are not equal. Since there is no standard, every implementation can claim to be right. Since there is almost no documentation, users are left to trial and error until they get it right.

libc example (only works when the locale is present, see /etc/locale.gen):

  $ for LANG in de de_DE de_DE.foo de_DE.ASCII de_de.iso-8859-1 \
      de_DE.iso-8859-1 de_DE.ISO-8859-1 de_DE.ISO8859-1 de_DE.ISO-88591 \
      de_DE.ISO88591; do printf "%-20s " $LANG; date +"%A"; done
  de                   Tuesday
  de_DE                Dienstag
  de_DE.foo            Tuesday
  de_DE.ASCII          Tuesday
  de_de.iso-8859-1     Tuesday
  de_DE.iso-8859-1     Dienstag
  de_DE.ISO-8859-1     Dienstag
  de_DE.ISO8859-1      Dienstag
  de_DE.ISO-88591      Tuesday
  de_DE.ISO88591       Dienstag

Wow, so for strftime to work correctly, language and territory have to be there (case is significant); the charset must be absent or known, where case is irrelevant this time, and some dashes may be omitted, but not every time. Eeek.

It gets hairier with libX11. Let's use a gtk program, because they issue nice warnings when X11 does not recognize the locale. Of the above variants, only de_DE and de_DE.ISO8859-1 succeed without warning, meaning they are recognized by glibc /and/ xlib.
The others give the warning, and show problems with latin1 characters (see bug#100970).

OTOH, glibc seems to suggest that "de_DE.ISO-8859-1" is the standard value, see /etc/locale.alias. "locale -a" reports deutsch, german, de_DE, de_DE@euro. None of these are standard according to xlib.

Rather than populating /usr/X11R6/lib/X11/locale/locale.alias with a gazillion alternatives as suggested in bugs 84735, 86903, and 99350, why not standardize on one format and *document* that? I.e. (adapted from setlocale(3)):

  A locale name is either a convenience alias (see below), or of the form ll_TT[.codeset][@modifier], where ll is a two-letter ISO 639 language code in lower case and TT is a two-letter ISO 3166 country code in upper case.

  The optional codeset fragment is a character set or encoding identifier. Currently defined values are ISO8859-1, ISO8859-2 [more here]. Variant spellings like iso-8859-1, while accepted by some applications for compatibility, are deprecated. The default codeset is [what?]

  The optional modifier can select variants of the locale. The only currently defined value is euro (to select the Euro as currency).

  Convenience aliases are intended to be understandable to users of the locale without need for further documentation. They must not have an underscore as their third character. Systems should provide the description of each locale in English and in the locale's language as aliases for this locale.

-- Robbe
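
The name format proposed above could be checked mechanically. What follows is only a minimal sketch in POSIX shell, not anything glibc or xlib actually does: is_proposed_locale_name and is_alias_candidate are hypothetical helpers made up for illustration. They check the shape of a name only, so a deprecated codeset spelling or a nonexistent country code would still pass.

```shell
#!/bin/sh
# Hypothetical helper: does $1 have the proposed ll_TT[.codeset][@modifier]
# shape? Only the shape is checked; whether the language, country, or
# codeset actually exists is not.
is_proposed_locale_name() {
    # LC_ALL=C keeps the bracket ranges limited to ASCII letters.
    printf '%s\n' "$1" |
        LC_ALL=C grep -Eq '^[a-z]{2}_[A-Z]{2}(\.[A-Za-z0-9-]+)?(@[A-Za-z0-9]+)?$'
}

# Hypothetical helper: a convenience alias must not have an underscore
# as its third character (and must not be empty).
is_alias_candidate() {
    case $1 in
        '' | ??_*) return 1 ;;
        *) return 0 ;;
    esac
}

for name in de de_DE de_de.iso-8859-1 de_DE.ISO8859-1 de_DE@euro german; do
    if is_proposed_locale_name "$name"; then
        echo "$name: locale name"
    elif is_alias_candidate "$name"; then
        echo "$name: alias candidate"
    else
        echo "$name: neither"
    fi
done
```

Under these assumptions, de_de.iso-8859-1 is rejected as both a locale name (territory not in upper case) and an alias (underscore as third character), while a bare "de" would only pass as an alias.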