
Re: non-ASCII characters in /etc/locales.alias ?



Hi,

At 17 Jan 2002 10:21:20 +0000,
Alastair McKinstry wrote:

> (1) in /etc/locale.alias, we start with the line
> 
> # -*- coding: iso-8859-1 -* 
> # Locale database
> 
> to signal that the file is encoded in ISO-8859-1. If the encoding is
> changed, this line should be changed.
> This should be documented in the locale.alias manpage (currently in
> BTS).

I propose that this be included in the
/usr/share/doc/locales/examples/ directory.  The default
/etc/locale.alias should be ASCII only, though I agree that some
people will need compatibility with non-internationalized systems
(i.e., ISO-8859-1 names in /etc/locale.alias).  I think such
people will need to read some documentation (manpages,
README.Debian, or anything else will do) and set up their
/etc/locale.alias themselves.
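
As a rough illustration of what "ASCII only" means for the file, here
is a minimal Python sketch that reports the lines containing any byte
outside the ASCII range.  The function name and the sample alias lines
are hypothetical and only serve the example; they are not taken from
the real /etc/locale.alias.

```python
def non_ascii_lines(data):
    """Return the 1-based numbers of lines that contain any byte >= 0x80."""
    return [
        number
        for number, line in enumerate(data.splitlines(), start=1)
        if any(byte >= 0x80 for byte in line)
    ]


# Hypothetical file contents: the third line holds an ISO-8859-1 byte (0xE5).
sample = (
    b"# Locale database\n"
    b"bokmal nb_NO.ISO-8859-1\n"
    b"bokm\xe5l nb_NO.ISO-8859-1\n"
)
print(non_ascii_lines(sample))
```

Working on raw bytes rather than decoded text avoids having to guess
the file's encoding first.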


> (2) We recommend (again in the manpage) that locales be ASCII-only,
> because of the can't-easily-enter-alias-while-in-conflicting-locale
> problem. 

I see.   (Not "can't easily", but "theoretically impossible".)
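
A minimal Python sketch of why this goes beyond inconvenience.  It
assumes that alias lookup is a plain byte comparison, which is an
assumption made for the example and has not been checked against the
libc source; the matches() helper is hypothetical.

```python
# The same name, as stored in an ISO-8859-1 alias file and as typed
# in a UTF-8 session.
alias_in_file = "bokm\u00e5l".encode("iso-8859-1")
typed_in_utf8 = "bokm\u00e5l".encode("utf-8")


def matches(stored, requested):
    """Hypothetical lookup: compare the alias byte for byte."""
    return stored == requested


print(alias_in_file)   # one byte (0xE5) for the non-ASCII letter
print(typed_in_utf8)   # two bytes (0xC3 0xA5) for the same letter
print(matches(alias_in_file, typed_in_utf8))
```

Under that assumption, a user in a UTF-8 session cannot produce the
byte sequence stored in the file by typing the name, whereas an
ASCII-only alias has the same bytes in both encodings.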


> However, because of backward compatibility we will support the existing
> aliases: people with LC_ALL=bokmål, for example, will want their
> systems to continue working. We can't easily upgrade out of this
> problem; users telnetting or sshing in from other Linux boxes (or HPUX,
> where these locale aliases started) will not want their displays
> broken.

I propose that an ISO-8859-1 version of /etc/locale.alias be
provided in the /usr/share/doc/locales/examples/ directory.  The
manpage can include instructions on how to use the file.  The
default /etc/locale.alias should not contain ISO-8859-1 locale
names.

The reason is that, if ISO-8859-1 locale names are available in
the default settings, new users (who don't have to care about
compatibility with old systems) may want to use them.
This should be avoided because (1) it is simply the wrong thing
to do, (2) they will come to depend on a setup that cannot
coexist with international users, and will come to see i18n as
something annoying, and (3) it will be one more obstacle for them
when migrating to UTF-8.  Use of ISO-8859-1 locale names should
be limited to people who _really_ need compatibility with old
systems, who read the instructions and notices, and who know what
they are doing.


> (3) 'locale' gets changed to support the coding tag. This fixes the bug
> where
> $ unicode_start ; export LC_ALL=en_US.utf8
> $ locale -a
> lists 'bokml' not 'bokmål', for example.

I think this is not needed, because "français" and "bokmål" are
exceptions: illegal makeshifts kept only for compatibility with
old non-internationalized systems.

I don't hold this opinion strongly.  If someone wants to develop
this, I won't stop him/her.  However, please note that it is more
work than many people imagine.  For example, what would the
"legal" encoding names be?  GNU libc names, GNU Emacs names, and
MIME names all differ.  Not only the names but the encodings
themselves differ.  The Li18nux people are currently trying to
establish standard names for encodings.  It would not be too late
to wait until that standard is released.

I just don't think such work is worth doing for the sake of
compatibility with two dirty locale names, "bokmål" and "français".

If this "improvement" to "locale" enabled us to use _any_
mixture of encodings we like, it would be nice and might be worth
doing.  However, the "improvement" will enable us to use only
_one_ encoding at a time, whichever it may be.
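
To make the "only one" point concrete, here is a minimal Python
sketch of how a reader of the file might honour the proposed coding
tag.  The function name, the regular expression, and the sample alias
line are assumptions made for the example; this is not how the real
"locale" program works.

```python
import re

# Matches the encoding name in a first-line tag such as
# "# -*- coding: iso-8859-1 -*" (the form quoted in this thread).
TAG = re.compile(rb"coding:\s*([-\w.]+)")


def decode_alias_file(data, default="ascii"):
    """Decode the whole file with the single encoding named on line 1."""
    first_line = data.split(b"\n", 1)[0]
    match = TAG.search(first_line)
    encoding = match.group(1).decode("ascii") if match else default
    return data.decode(encoding)


# Hypothetical file contents with one ISO-8859-1 alias.
sample = b"# -*- coding: iso-8859-1 -*\nbokm\xe5l no_NO.ISO-8859-1\n"
print(decode_alias_file(sample))
```

Whatever the tag says, it applies to every line of the file, so
aliases in two different encodings cannot live side by side.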

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


