Chinese language environment not supported with LANG=zh_CN.GB2312


i've completely reworked this aspect of mule initialization as part of my
pending mule patches.  auto-language-alist has been eliminated, and a better
system is in place instead [the locale specifiers now live in the language
environment itself, and there is a more powerful system for determining them].
what would help is a list of all the chinese locales on all the systems you
use, along with any information you know about which charsets they use.



"Stephen J. Turnbull" wrote:
> I wrote:
> >  (defvar auto-language-alist
> >    '(("^ja" . "Japanese")
> > -    ("^zh" . "Chinese")
> > +    ("^zh_CN" . "Chinese-GB")
> > +    ("^zh_TW" . "Chinese-BIG5")
> >      ("^ko" . "Korean"))
> >    "Alist of LANG patterns vs. corresponding language environment.
> >  Each element looks like (REGEXP . LANGUAGE-ENVIRONMENT).
>
> >>>>> "zw" == zw  <zw@zhaoway.com> writes:
>     zw> i think this is quite reasonable.  zh_CN and zh_CN.GB2312 map to
>     zw> Chinese-GB, which is already the de facto convention, and zh_TW.Big5
>     zw> (not sure about the capitalization) maps to Chinese-BIG5.
>
> OK.  I use Debian (unstable) myself, so I know that Debian supplies
> those locales.  However, as an XEmacs developer, I have to worry about
> non-Debian systems.  I suppose that these are standard in glibc, but I
> hoped that you might be better informed. ;-)  I also have to worry about
> non-Linux systems, for that matter.
>
> On this basis I will submit a patch to 21.1; it will probably appear
> in release 21.1.13.  It will take a few days to be reviewed.
>
> By the way, what version of XEmacs are you using?  (I don't use the
> .deb, for the obvious reason.)
>
>     zw> what about zh_CN.GBK, then?  and there is a Big5p ???
>
> The regexps given above are anchored at the beginning, but not the
> end.  This means that ALL [mainland] Chinese locales use the
> Chinese-GB environment, and ALL Taiwanese locales use Chinese-BIG5.  I
> specified it that way on purpose.
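
The effect of anchoring the patterns at the start but not the end can be
sketched in a few lines of Emacs Lisp.  This is only an illustration, not the
actual XEmacs lookup code: the names my-auto-language-alist and
my-language-environment-for-locale are hypothetical, and the alist simply
copies the patched entries from the diff quoted above.

```elisp
;; Hypothetical copy of the patched alist from the quoted diff.
(defvar my-auto-language-alist
  '(("^ja" . "Japanese")
    ("^zh_CN" . "Chinese-GB")
    ("^zh_TW" . "Chinese-BIG5")
    ("^ko" . "Korean"))
  "Alist of LANG patterns vs. corresponding language environment.")

;; Hypothetical helper: return the environment of the first matching entry.
(defun my-language-environment-for-locale (locale)
  "Return the language environment whose pattern matches LOCALE, or nil."
  (let ((alist my-auto-language-alist)
        (result nil))
    (while (and alist (not result))
      ;; Each pattern starts with ^ and has no trailing $, so anything
      ;; after the matched prefix (".GB2312", ".GBK", ...) is ignored.
      (when (string-match (car (car alist)) locale)
        (setq result (cdr (car alist))))
      (setq alist (cdr alist)))
    result))

(my-language-environment-for-locale "zh_CN.GB2312") ; => "Chinese-GB"
(my-language-environment-for-locale "zh_CN.GBK")    ; => "Chinese-GB"
(my-language-environment-for-locale "zh_TW.Big5")   ; => "Chinese-BIG5"
(my-language-environment-for-locale "en_US")        ; => nil
```

Under this sketch a zh_CN.GBK locale lands in Chinese-GB purely because of
its prefix; the codeset suffix plays no part in the choice.
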
> I do not fully understand the implications of this for "GBK" or
> "Big5p".  I can guess that since these character sets are not yet
> implemented in Mule (AFAIK, neither ETL Mule == Emacs 20.7 nor XEmacs
> Mule has them), their users will have to limp along without them.
>
> If someone can explain to me what they are and provide pointers to
> (English language) standards references, I can probably have them
> incorporated into the next generation of XEmacs Mule.  This means the
> current 21.2 development branch, not a near-future 21.1.x upgrade.
> (Current Mule is running out of space for charsets; there would be
> strong opposition to adding these.)  I suppose the ETL people are
> already thinking about them; they have far more resources for adding
> charsets than we do.
>
> Especially useful would be (1) proposed or adopted Unicode mapping
> tables (we will probably be using Unicode as an internal encoding in
> the future), and (2) fonts for testing purposes (they don't need to be
> pretty, just free).  Failing an actual font, the XLFD registry for
> them would be useful in coming up with tentative language environment
> definitions.
>
> I would also appreciate being told why these character sets are
> important, how important they are to users, and which users care about
> them.  All of this will help justify the work involved.
