
Re: Language Environment Chinese not support, w LANG=zh_CN.GB2312



I wrote:
>  (defvar auto-language-alist
>    '(("^ja" . "Japanese")
> -    ("^zh" . "Chinese")
> +    ("^zh_CN" . "Chinese-GB")
> +    ("^zh_TW" . "Chinese-BIG5")
>      ("^ko" . "Korean"))
>    "Alist of LANG patterns vs. corresponding language environment.
>  Each element looks like (REGEXP . LANGUAGE-ENVIRONMENT).

>>>>> "zw" == zw  <zw@zhaoway.com> writes:

    zw> i think this is quite reasonable.  zh_CN and zh_CN.GB2312 to
    zw> Chinese-GB, indeed a de facto already zh_TW.Big5 (not sure
    zw> about the case) to Chinese-BIG5

OK.  I use Debian (unstable) myself, so I know that Debian supplies
those locales.  However, as an XEmacs developer, I have to worry about
non-Debian systems.  I suppose that these are standard in glibc, but I
had hoped that you might be better informed. ;-)  I also have to worry
about non-Linux systems, for that matter.

On this basis I will submit a patch to 21.1; it will probably appear
in release 21.1.13.  It will take a few days to be reviewed.

By the way, what version of XEmacs are you using?  (I don't use the
.deb, for the obvious reason.)

    zw> what about zh_CN.GBK? then...  and there is a Big5p ???

The regexps given above are anchored at the beginning, but not the
end.  This means that ALL [mainland] Chinese locales use the
Chinese-GB environment, and ALL Taiwanese locales use Chinese-BIG5.  I
specified it that way on purpose.
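
To make the anchoring concrete, here is a minimal sketch of the lookup
in Emacs Lisp.  The names my-lang-to-environment and
my-auto-language-alist are made up for this illustration; this is not
the code XEmacs actually uses to consult auto-language-alist, only the
same first-match idea, assuming entries are tried in order with
string-match.

```elisp
;; Hypothetical copy of the patched alist, for illustration only.
(defvar my-auto-language-alist
  '(("^ja" . "Japanese")
    ("^zh_CN" . "Chinese-GB")
    ("^zh_TW" . "Chinese-BIG5")
    ("^ko" . "Korean")))

;; Hypothetical helper (an assumption, not the real XEmacs function):
;; return the language environment of the first entry whose REGEXP
;; matches LANG, or nil if no entry matches.
(defun my-lang-to-environment (lang alist)
  (let ((result nil))
    (while (and alist (not result))
      (if (string-match (car (car alist)) lang)
          (setq result (cdr (car alist))))
      (setq alist (cdr alist)))
    result))

;; "^zh_CN" is anchored only at the start, so any suffix is accepted.
(my-lang-to-environment "zh_CN.GB2312" my-auto-language-alist) ; "Chinese-GB"
(my-lang-to-environment "zh_CN.GBK" my-auto-language-alist)    ; "Chinese-GB"
(my-lang-to-environment "zh_TW.Big5" my-auto-language-alist)   ; "Chinese-BIG5"
(my-lang-to-environment "zh_HK" my-auto-language-alist)        ; nil
```

Note the last line: with "^zh" replaced by the two narrower patterns, a
locale such as zh_HK no longer matches any entry in this sketch.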

I do not fully understand the implications of this for "GBK" or
"Big5p".  I can guess that since these character sets are not yet
implemented in Mule (AFAIK, neither ETL Mule == Emacs 20.7 nor XEmacs
Mule has them), their users will have to limp along without them.

If someone can explain to me what they are and provide pointers to
(English language) standards references, I can probably have them
incorporated into the next generation of XEmacs Mule.  This means the
current 21.2 development branch, not a near-future 21.1.x upgrade.
(Current Mule is running out of space for charsets; there would be
strong opposition to adding these.)  I suppose the ETL people are
already thinking about them; they have far more resources for adding
charsets than we do.

Especially useful would be (1) proposed or adopted Unicode mapping
tables (we will probably be using Unicode as an internal encoding in
the future), and (2) fonts for testing purposes (they don't need to be
pretty, just free).  Failing an actual font, the XLFD registry for
them would be useful in coming up with tentative language environment
definitions.

I would also appreciate being told why these character sets are
important, how important they are to users, and which users care about
them.  All of this will help justify the work involved.

-- 
University of Tsukuba                Tennodai 1-1-1 Tsukuba 305-8573 JAPAN
Institute of Policy and Planning Sciences       Tel/fax: +81 (298) 53-5091
_________________  _________________  _________________  _________________
What are those straight lines for?  "XEmacs rules."

-- 
| This message was re-posted from debian-chinese-gb@lists.debian.org
| and converted from gb2312 to big5 by an automatic gateway.


