
Re: non-ASCII characters in /etc/locales.alias ?



On Thu, Jan 17, 2002 at 09:29:55AM +0900, Tomohiro KUBOTA wrote:
> Usage of UTF-8 will be adopted when all of us will use UTF-8 in future.
> (However, not now.)

If a file in /etc needs to have a non-ASCII charset, I'm suggesting it
be UTF-8.  If it's not really needed, it should be avoided.

> > 日本語		ja_JP.ISO-2022-JP
> 
> Hmm, I think it is not useful because we cannot input Japanese character
> unless configuring Japanese locale.  It is just "the key for this locked
> box is inside the box" situation.  (Usage of ISO-8859-1 character before
> definition of locale is also this situation.)

If you're in en_US.UTF-8, and an application wants to display a list of
available character sets, you can display it.  I'm not sure what the
main goal of this list is; for selecting locales (eg. LC_CTYPE), you're
right.

You can have multiple aliases for a single locale, as is already done
for bokmal and french.  This seems to mess up Alastair's idea a little,
since a list built from the file will have duplicates, but that's not a
big deal.
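
To illustrate the duplicates point, here's a rough Python sketch that
groups aliases by the locale they point to.  The sample entries and the
parse_aliases() helper are made up for illustration (they are not copied
from any real alias file); the only assumption about the format is
"alias, whitespace, locale" with "#" starting a comment.

```python
from collections import defaultdict

# Hypothetical sample in the "alias <whitespace> locale" layout.
SAMPLE = """\
# alias        locale
bokmal         nb_NO.ISO-8859-1
bokmål         nb_NO.ISO-8859-1
french         fr_FR.ISO-8859-1
français       fr_FR.ISO-8859-1
"""

def parse_aliases(text):
    """Map each locale name to the list of aliases that refer to it."""
    by_locale = defaultdict(list)
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blanks
        if not line:
            continue
        fields = line.split()
        if len(fields) != 2:
            continue  # skip anything that isn't "alias locale"
        alias, locale = fields
        by_locale[locale].append(alias)
    return dict(by_locale)

if __name__ == "__main__":
    for locale, aliases in parse_aliases(SAMPLE).items():
        print(locale, "<-", ", ".join(aliases))
```

A program listing locales would show one entry per key, so the extra
aliases only matter if it lists the alias names themselves.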

In any event, it *is* probably best to stick with plain old 7-bit ASCII
for this file.

> One addition, we usually use ja_JP.eucJP locale.  ja_JP.ISO-2022-JP
> is not supported by GNU libc 2.2 because ISO-2022-JP is "stateful".

Sure; it's just an example.

> > > (If you edit /etc/locale.alias with multibyte-capable editor in
> > > multibyte locales, the 8bit "undefined" characters will be probably
> > > broken.  I feel this difficulty of editing when I translated Debian
> > > webpage templates with "slices".  To avoid destroying the Debian web
> > > files, I have to use non-locale-supporting and 8bit editors.  However,
> > > to edit Japanese, I have to use 8bit-clean and multibyte-clean editor.)
> > 
> > Why do you need an 8-bit-clean editor to edit Japanese?  If you're
> > editing textfiles, in most locales, you're almost never going to be
> > 8-bit-clean (ie. I wouldn't expect an editor in UTF-8 to maintain
> > invalid UTF-8 sequences.)
> 
> We need 8-bit-clean editor because EUC-JP is 8bit encoding.

So you need an EUC-JP-capable editor.  Why does that imply you have to
use an editor that doesn't support locales ("non-locale-supporting ...
editors") to edit Debian files?  If you're editing a file that's not
EUC-JP, just tell your editor that.  Much easier than using different
editors (which seems to be what you said.)
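
A small Python sketch of what I mean, under the assumption that the
damage comes from an editor decoding the file with the wrong charset
and writing it back.  resave() is a made-up stand-in for "open and save
in an editor", not any real editor's behaviour.

```python
def resave(raw: bytes, encoding: str) -> bytes:
    """Simulate opening and saving a file while assuming `encoding`.

    Bytes that are invalid in that encoding are replaced on decode,
    which is one way an editor can fail to preserve them.
    """
    text = raw.decode(encoding, errors="replace")
    return text.encode(encoding, errors="replace")

# The same Japanese word, stored as EUC-JP.
euc_bytes = "日本語".encode("euc_jp")

# Told the right charset, the round trip keeps every byte.
kept = resave(euc_bytes, "euc_jp")

# Assuming UTF-8 instead, the EUC-JP bytes are invalid and get replaced.
damaged = resave(euc_bytes, "utf-8")

if __name__ == "__main__":
    print("right charset preserved:", kept == euc_bytes)
    print("wrong charset preserved:", damaged == euc_bytes)
```

The point is only that the charset is a per-file setting, not a reason
to switch programs.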

I wonder if there's any way to get Vim to look for
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=Shift-JIS">
lines in HTML documents and change fileencoding accordingly.
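
The detection half is simple enough; a rough sketch in Python (not Vim
script, and not anything Vim does that I know of).  sniff_charset(),
the regular expression and the "latin-1" fallback are all my own
assumptions for illustration.

```python
import codecs
import re

# Look for charset=... inside a <meta ...> tag, quoted or not.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_.:-]+)',
    re.IGNORECASE,
)

def sniff_charset(head: bytes, default: str = "latin-1") -> str:
    """Return the codec name declared in an HTML META tag, else `default`.

    `head` is the first chunk of the file, read as raw bytes.
    """
    match = META_CHARSET.search(head)
    if match is None:
        return default
    name = match.group(1).decode("ascii")
    try:
        # Normalise labels such as "Shift-JIS" to a codec Python knows.
        return codecs.lookup(name).name
    except LookupError:
        return default  # unknown label: fall back rather than guess

if __name__ == "__main__":
    line = (b'<META HTTP-EQUIV="Content-Type" '
            b'CONTENT="text/html; charset=Shift-JIS">')
    print(sniff_charset(line))
```

An editor hook would run something like this over the first few lines
of the buffer and then set its own encoding option from the result.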

-- 
Glenn Maynard


