
Bug#311259: error and omission in documentation of LANGUAGE



On Sat, Jun 04, 2005 at 05:17:48PM -0400, Daniel Jacobowitz wrote:
> On Sat, Jun 04, 2005 at 10:43:41PM +0200, Denis Barbier wrote:
> > On Fri, Jun 03, 2005 at 04:56:18PM -0400, Daniel Jacobowitz wrote:
> > > Except that the problem is that (if everything but LANGUAGE is unset)
> > > I would have expected LANGUAGE to set LC_MESSAGES, and it doesn't. 
> > 
> > This situation should not happen, this is a user configuration error.
> > All non-ASCII characters are replaced by question marks if LC_CTYPE
> > is unset, so these settings are not usable.
> 
> Huh?  I think we're talking past each other.  Let me go back to
> examples.
> 
> drow@nevyn:~% env - LANG=de_DE cat -h 
> cat: Ungültige Option -- h
> ,,cat --help" gibt weitere Informationen.
> 
> With just LANG set, LC_MESSAGES and LC_CTYPE are inferred from LANG.

Sure, this is the well-established behavior documented in POSIX.

> drow@nevyn:~% env - LANGUAGE=de_DE cat -h
> cat: invalid option -- h
> Try `cat --help' for more information.
> 
> With just LANGUAGE set, LC_MESSAGES and LC_CTYPE are not inferred
> from LANG, because of the choice I think we need to document.
  ^^^^^^^^^  Do you mean: from LANGUAGE?
> They remain as C.

Yes, LANGUAGE is only checked to select message catalogs, nothing else.
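
A rough shell sketch of that rule may help here. It is only a model, not
libc code: it assumes the precedence described in the GNU gettext
documentation (LC_ALL, then LC_MESSAGES, then LANG decide the message
locale, and LANGUAGE is consulted only when that locale is not C), and it
ignores whether the locale is actually installed. The function name
message_locales and its argument order are made up for illustration.

```shell
# Hypothetical helper (not part of libc or gettext): prints the locale
# name(s) that would be searched for message catalogs.
# Usage: message_locales LANGUAGE LC_ALL LC_MESSAGES LANG
message_locales() (
    language=$1
    # Message locale: first non-empty of LC_ALL, LC_MESSAGES, LANG; else C.
    locale=${2:-${3:-${4:-C}}}
    case $locale in
        C|POSIX)
            # In the C locale LANGUAGE is not consulted at all.
            printf '%s\n' C ;;
        *)
            # Otherwise LANGUAGE, if set, replaces the locale name for
            # catalog lookup only; LC_CTYPE is never affected by it.
            printf '%s\n' "${language:-$locale}" ;;
    esac
)

message_locales de_DE "" "" ""      # only LANGUAGE set
message_locales ""    "" "" de_DE   # only LANG set
message_locales de_DE "" "" ja_JP   # LANG=ja_JP LANGUAGE=de_DE
```

The three calls correspond to the three cat -h runs quoted in this
message: the first stays in C, the other two select German catalogs.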

> drow@nevyn:~% env - LANG=ja_JP LANGUAGE=de_DE cat -h
> cat: Ung«ältige Option -- h
> ,,cat --help¡È gibt weitere Informationen.
> 
> With LANG set to anything other than C, LANGUAGE set, and nothing else,
> LC_MESSAGES is inferred from C.
                          ^^^^^^  Should be: from LANGUAGE?
> Interestingly I appear to get LC_CTYPE from LANG, not from LANGUAGE.
> This also is not clear from the documentation.

You get the same result with
  $ env - LANG=ja_JP LC_MESSAGES=de_DE cat -h
POSIX states that libc behavior is unspecified if LC_MESSAGES and
LC_CTYPE do not share the same encoding, which is why this situation
is a user error.
Your example is quite similar: libc functions try to convert German
strings from UTF-8 to EUC-JP, which is not possible.

In order to have a working environment, one has to set LC_MESSAGES
in accordance with LC_CTYPE, and LANGUAGE can then be used to
give a list of message catalogs using the same encoding.
Of course, setting LANG to a UTF-8 locale is the only choice when
encodings are not compatible, as with your example:
  $ env - LANG=ja_JP.UTF-8  LANGUAGE=de_DE cat -h

Denis


