
Re: non-ASCII characters in /etc/locales.alias ?



Hi,

At Wed, 16 Jan 2002 22:22:18 -0500,
Glenn Maynard wrote:

> > Usage of UTF-8 will be adopted when all of us will use UTF-8 in future.
> > (However, not now.)
> If a file in /etc needs to have a non-ASCII charset, I'm suggesting it
> be UTF-8.  If it's not really needed, it should be avoided.

Right.

> 
> > > 日本語		ja_JP.ISO-2022-JP
> > 
> > Hmm, I think it is not useful because we cannot input Japanese character
> > unless configuring Japanese locale.  It is just "the key for this locked
> > box is inside the box" situation.  (Usage of ISO-8859-1 character before
> > definition of locale is also this situation.)
> 
> If you're in en_US.UTF-8, and an application wants to display a list of
> available character sets, you can display it.  I'm not sure what the
> main goal of this list is; for selecting locales (eg. LC_CTYPE), you're
> right.

Locale names should be locale-independent.  If a locale name is not
ASCII (for example, if it is written in EUC-JP), then I have to set the
locale to EUC-JP before I can even type the name.  That makes the name
useless.


> You can have multiple aliases for a single locale, as is already done
> for bokmal and french.  This seems to mess up Alastair's idea a little,
> since a list will have duplicates, but that's not a big deal.

Having aliases is not a bad idea, but that is not what I am talking about.
Imagine how I would use LANG=français.  I would have to set
LANG=*.ISO-8859-1 (or some other suitable locale) before I could write
LANG=français.
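
To illustrate the point, here is a minimal sketch in Python.  The alias
table, the lookup() helper, and the locale value fr_FR.ISO-8859-1 are
hypothetical and only stand in for a byte-wise match against a file on
disk; this is not how the C library actually reads the alias file.

```python
# Hypothetical alias table keyed by raw bytes, as a file on disk would be.
# Assumption: the non-ASCII alias was written to the file in ISO-8859-1.
alias_table = {
    "français".encode("iso-8859-1"): "fr_FR.ISO-8859-1",
    b"french": "fr_FR.ISO-8859-1",
}


def lookup(typed, terminal_encoding):
    """Return the locale for an alias as typed in a terminal that uses
    terminal_encoding, or None when the bytes do not match the file."""
    return alias_table.get(typed.encode(terminal_encoding))


# The non-ASCII alias matches only if the terminal already uses the
# same encoding as the file.
non_ascii_latin1 = lookup("français", "iso-8859-1")
non_ascii_utf8 = lookup("français", "utf-8")

# The ASCII alias has the same bytes in both encodings, so it always matches.
ascii_latin1 = lookup("french", "iso-8859-1")
ascii_utf8 = lookup("french", "utf-8")
```

Under these assumptions, the non-ASCII alias is found only when the
terminal encoding already matches the file, while the ASCII alias is
found in either case.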

> In any event, it *is* probably best to stick with plain old 7-bit ASCII
> for this file.

Right.

> > We need 8-bit-clean editor because EUC-JP is 8bit encoding.
> 
> So you need an EUC-JP-capable editor.  Why does that imply you have to
> use an editor that doesn't support locales ("non-locale-supporting ...
> editors") to edit Debian files?  If you're editing a file that's not
> EUC-JP, just tell your editor that.  Much easier than using different
> editors (which seems to be what you said.)

Please note that the file I am talking about is a mixture of
various encodings.  Though such a file is invalid, I need to edit it
because I don't have time to rebuild the whole translation
system of the Debian web pages.

If I use an EUC-JP-capable editor to edit the file, the parts that are
neither ASCII nor EUC-JP will be broken, because the editor will
regard them as invalid EUC-JP sequences.  (How such invalid sequences
are handled depends on the editor.)
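
A rough sketch of the problem in Python follows.  The sample bytes are
made up for illustration and are not taken from the real wml file, and
roundtrip() is a hypothetical stand-in for an editor that decodes on
load and re-encodes on save, replacing whatever it cannot decode.  Real
editors may behave differently.

```python
# Made-up sample: one line in EUC-JP followed by one line in ISO-8859-1.
euc_line = "日本語\n".encode("euc_jp")
latin_line = "français\n".encode("iso-8859-1")
mixed = euc_line + latin_line


def roundtrip(data, encoding):
    """Simulate load-then-save in an editor bound to a single encoding.
    Assumption: bytes that cannot be decoded are replaced, not kept."""
    text = data.decode(encoding, errors="replace")
    return text.encode(encoding, errors="replace")


# An EUC-JP-aware editor keeps the EUC-JP line but damages the
# ISO-8859-1 line, because byte 0xE7 followed by "a" is not valid EUC-JP.
euc_saved = roundtrip(mixed, "euc_jp")

# An 8-bit-clean editor is modelled here as ISO-8859-1, which maps every
# byte value to one character, so all bytes survive unchanged.
clean_saved = roundtrip(mixed, "iso-8859-1")
```

In this sketch only the 8-bit-clean round trip returns the file
byte for byte, which is why I need such an editor for this file.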

> I wonder if there's any way to get Vim to look for
> <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=Shift-JIS">
> lines in HTML documents and change fileencoding accordingly.

Not HTML.  I am talking about the "wml" source of the Debian web pages.
For example,

http://cvs.debian.org/webwml/english/template/debian/common_translation.wml?cvsroot=webwml

Though I have not looked into it yet, I have heard that the translation
files for GNOME have a similar invalid structure, i.e., a mixture
of encodings in one file.

---
Tomohiro KUBOTA <kubota@debian.org>
http://www.debian.or.jp/~kubota/
"Introduction to I18N"  http://www.debian.org/doc/manuals/intro-i18n/


