Re: non-ASCII characters in /etc/locales.alias ?
On Thu, Jan 17, 2002 at 12:39:00AM +0900, Tomohiro KUBOTA wrote:
> > I've been looking at /etc/locales.alias and the possibility of
> > auto-generating it from locale-gen; and noticed that it has non-ASCII
> > characters in it: in particular in
> >
> > bokmål no_NO.ISO-8859-1
> > français fr_FR.ISO-8859-1
> >
> > I think using non-ASCII characters in /etc/locales.alias is dodgy; it
> > would break in non- ISO-8859-1 environments. Should this be supported?
> > Should /etc/locales.alias have a tag describing its encoding?
> > (e.g. an emacs-type tag)
>
> I agree with your opinion. Since the definition of non-ASCII characters
> is done by the locale, non-ASCII characters cannot be used before the
> user specifies the locale. Before the user specifies the locale, >0x80
> characters are "undefined characters".
> ISO-8859-1 is a local encoding, just like EUC-JP is a local encoding for
> Japanese. In particular, it cannot co-exist with multibyte encodings.
That applies to all system textfiles (/etc, /usr/include).
If native-language tags for existing locales are wanted here,
then making an exception for locale.alias is arguable, but it should
probably be UTF-8, not ISO-8859-1. (Either way, programs using it will
need to know to convert it to the current locale's charset.)
日本語 ja_JP.ISO-2022-JP
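To illustrate the conversion step, here is a minimal Python sketch. It
assumes the alias file is stored as UTF-8 and uses a plain "alias locale"
layout with '#' comments. The names parse_alias_lines and alias_in_charset
are made up for this example; they are not part of glibc or locale-gen.

```python
def parse_alias_lines(raw: bytes, file_encoding: str = "utf-8"):
    """Yield (alias, locale) pairs from the raw bytes of an alias file.

    Assumption: one "alias locale" pair per line, '#' starts a comment.
    """
    for line in raw.decode(file_encoding).splitlines():
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        parts = line.split()
        if len(parts) != 2:
            continue  # skip anything that is not a simple pair
        yield parts[0], parts[1]


def alias_in_charset(alias: str, charset: str):
    """Return the alias encoded in `charset`, or None if it cannot be
    represented there (e.g. a Japanese alias in an ISO-8859-1 locale)."""
    try:
        return alias.encode(charset)
    except UnicodeEncodeError:
        return None


# Hypothetical file contents, stored as UTF-8.
SAMPLE = (
    "# native-language aliases\n"
    "français fr_FR.ISO-8859-1\n"
    "日本語 ja_JP.ISO-2022-JP\n"
).encode("utf-8")

pairs = list(parse_alias_lines(SAMPLE))
```

A real caller would presumably take the charset from the running locale
(in Python, locale.nl_langinfo(locale.CODESET) on Unix) and simply skip
aliases for which alias_in_charset returns None.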
> (If you edit /etc/locale.alias with a multibyte-capable editor in
> multibyte locales, the 8bit "undefined" characters will probably be
> broken. I felt this difficulty of editing when I translated Debian
> webpage templates with "slices". To avoid destroying the Debian web
> files, I have to use a non-locale-supporting, 8bit editor. However,
> to edit Japanese, I have to use an 8bit-clean and multibyte-clean editor.)
Why do you need an 8-bit-clean editor to edit Japanese? If you're
editing textfiles, in most locales, you're almost never going to be
8-bit-clean (i.e., I wouldn't expect an editor in UTF-8 to maintain
invalid UTF-8 sequences.)
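
A small Python sketch of what I mean, using Python's codec error handlers
as a stand-in for an editor's load/save cycle (an assumption for
illustration; real editors differ in how they treat invalid input):

```python
# "français" as it would appear in an ISO-8859-1 encoded file.
latin1_bytes = "français".encode("iso-8859-1")  # b'fran\xe7ais'

# A strict UTF-8 reader rejects the lone 0xE7 byte outright.
try:
    latin1_bytes.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# A lossy reader substitutes U+FFFD; saving no longer yields the
# original bytes, so the file is silently altered.
lossy_roundtrip = latin1_bytes.decode("utf-8", errors="replace").encode("utf-8")

# A byte-preserving reader has to opt in explicitly, here via
# Python's "surrogateescape" error handler.
preserved_roundtrip = latin1_bytes.decode(
    "utf-8", errors="surrogateescape"
).encode("utf-8", errors="surrogateescape")
```

Only the last variant hands back the original bytes; the default paths
either refuse the file or change it.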
--
Glenn Maynard