Bug#311259: error and omission in documentation of LANGUAGE
On Sat, Jun 04, 2005 at 05:17:48PM -0400, Daniel Jacobowitz wrote:
> On Sat, Jun 04, 2005 at 10:43:41PM +0200, Denis Barbier wrote:
> > On Fri, Jun 03, 2005 at 04:56:18PM -0400, Daniel Jacobowitz wrote:
> > > Except that the problem is that (if everything but LANGUAGE is unset)
> > > I would have expected LANGUAGE to set LC_MESSAGES, and it doesn't.
> >
> > This situation should not happen, this is a user configuration error.
> > All non-ASCII characters are replaced by question marks if LC_CTYPE
> > is unset, so these settings are not usable.
>
> Huh? I think we're talking past each other. Let me go back to
> examples.
>
> drow@nevyn:~% env - LANG=de_DE cat -h
> cat: Ungültige Option -- h
> ,,cat --help" gibt weitere Informationen.
>
> With just LANG set, LC_MESSAGES and LC_CTYPE are inferred from LANG.
Sure, this is the well-established behavior documented in POSIX.
> drow@nevyn:~% env - LANGUAGE=de_DE cat -h
> cat: invalid option -- h
> Try `cat --help' for more information.
>
> With just LANGUAGE set, LC_MESSAGES and LC_CTYPE are not inferred
> from LANG, because of the choice I think we need to document.
  ^^^^^^^^^ Do you mean: from LANGUAGE?
> They remain as C.
Yes, LANGUAGE is only checked to select message catalogs, nothing else.
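
To make the lookup order concrete, here is a minimal Python sketch of how
I understand it. It is a model, not libc code: the function names are
made up for illustration, every named locale is assumed to be installed,
and the rule that LANGUAGE is ignored when LC_MESSAGES resolves to "C" is
taken from my reading of the glibc/gettext behavior shown in the
transcripts.

```python
def effective_locale(env, category):
    """Resolve one locale category with the POSIX precedence:
    LC_ALL, then the category's own variable, then LANG, else "C".
    Empty values count as unset."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def message_languages(env):
    """Return the ordered list of languages tried for message catalogs.

    Assumption: LANGUAGE is a colon-separated priority list that is
    consulted only when the LC_MESSAGES locale is not "C"/"POSIX".
    It never changes LC_MESSAGES, LC_CTYPE or any other category.
    """
    locale_name = effective_locale(env, "LC_MESSAGES")
    if locale_name in ("C", "POSIX"):
        return ["C"]
    preferred = [lang for lang in env.get("LANGUAGE", "").split(":") if lang]
    return preferred or [locale_name]


# The three environments from the transcripts in this thread.
for env in (
    {"LANG": "de_DE"},
    {"LANGUAGE": "de_DE"},
    {"LANG": "ja_JP", "LANGUAGE": "de_DE"},
):
    print(env,
          "LC_CTYPE =", effective_locale(env, "LC_CTYPE"),
          "LC_MESSAGES =", effective_locale(env, "LC_MESSAGES"),
          "catalogs =", message_languages(env))
```

In this model the second environment keeps both categories at "C" and
never looks at LANGUAGE, while the third takes LC_CTYPE from LANG and
only the catalog list from LANGUAGE.
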
> drow@nevyn:~% env - LANG=ja_JP LANGUAGE=de_DE cat -h
> cat: Ung«ältige Option -- h
> ,,cat --help¡È gibt weitere Informationen.
>
> With LANG set to anything other than C, LANGUAGE set, and nothing else,
> LC_MESSAGES is inferred from C.
                          ^^^^^^ Should be: from LANGUAGE?
> Interestingly I appear to get LC_CTYPE from LANG, not from LANGUAGE.
> This also is not clear from the documentation.
You get the same result with:
$ env - LANG=ja_JP LC_MESSAGES=de_DE cat -h
POSIX states that libc behavior is unspecified if LC_MESSAGES and
LC_CTYPE do not share the same encoding, which is why this situation
is a user error.
Your example is quite similar: libc functions try to convert the German
strings from UTF-8 to EUC-JP, which is not possible.
In order to have a working environment, one has to set LC_MESSAGES
to a locale that uses the same encoding as LC_CTYPE; LANGUAGE can
then be used to give a list of message catalogs in that encoding.
Of course, setting LANG to a UTF-8 locale is the only choice when
encodings are not compatible, as with your example:
$ env - LANG=ja_JP.UTF-8 LANGUAGE=de_DE cat -h
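
As a rough way to spot such a mismatch before running anything, the
following Python sketch compares the codesets that LC_CTYPE and
LC_MESSAGES resolve to. The helper names are hypothetical, and the
codeset of a name without an explicit one (de_DE, ja_JP) is guessed from
Python's built-in alias table via locale.normalize(), which only
approximates what the system's own locale definitions say.

```python
import locale


def effective_locale(env, category):
    """POSIX precedence: LC_ALL, then the category variable, then LANG,
    else "C". Empty values count as unset."""
    for name in ("LC_ALL", category, "LANG"):
        value = env.get(name)
        if value:
            return value
    return "C"


def codeset(locale_name):
    """Best-effort codeset of a locale name, upper-cased for comparison.

    Assumption: names without an explicit codeset are expanded with
    Python's alias table; names that still carry none (such as "C")
    are treated as ASCII.
    """
    normalized = locale.normalize(locale_name)
    if "." not in normalized:
        return "ASCII"
    after_dot = normalized.partition(".")[2]
    return after_dot.split("@")[0].upper()  # drop any @modifier


def encodings_agree(env):
    """True when LC_CTYPE and LC_MESSAGES resolve to the same codeset."""
    ctype = codeset(effective_locale(env, "LC_CTYPE"))
    messages = codeset(effective_locale(env, "LC_MESSAGES"))
    return ctype == messages


for env in (
    {"LANG": "ja_JP", "LC_MESSAGES": "de_DE"},
    {"LANG": "ja_JP.UTF-8", "LANGUAGE": "de_DE"},
):
    print(env, "encodings agree:", encodings_agree(env))
```

LANGUAGE is deliberately left out of the comparison: in this model it
only selects catalogs, so what matters is that both categories share an
encoding able to represent the translated strings.
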
Denis