
Re: Is there anybody interested in supporting GB18030 in debian?



<suzhe@gnuchina.org> writes:

> On 22 Jan 2001, zhaoway wrote:
> 
> > It's not just locale. Say, if I want to read Chinese and Japanese at the
> > same time in the same XTerm, UTF-8 will do it, GB18030 won't. Glibc
> > can of course even support GB2312, but if XTerm uses a GB2312 locale, it
> > won't be able to read Japanese (whatever encoding) then. So you will
> > have to use UTF-8. Then GB18030 has its quirks.
>
> GB18030 of course can do it! I can edit Chinese, Japanese, Korean,
> Russian and many other languages in a gedit window with a GB18030 locale!

Oh, yeah. To sum up: your better bet is that the rest of the world
will most probably prefer UTF-8 to GB18030.

Back to the problem. If every good country comes up with its own
standard to do what UTF-8 is supposed to do, then suppose I download
an article describing MULE technicalities, with a Chinese filename
encoded in, unfortunately, the _good_country_abcde_'s STD-12345.
What will a GB18030 locale do to help?

If you're using wget to get that article, your command line with that
filename will probably be a horror, and even worse, the name of the
file you've just downloaded onto your filesystem won't be in a good
state either. And worst of all, how will you input that filename,
encoded in the _good_country_abcde_'s STD-12345, from your GB18030
locale? So you know it is Chinese, you can read it, you recognize
the nice glyphs, and you can't even get at it! 8-P
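
To make the filename problem concrete, here is a minimal Python sketch.
It rests on assumptions of mine: the filename is made up, and Big5
merely stands in for the imaginary STD-12345. It shows what a GB18030
locale makes of bytes that were written in another encoding.

```python
# Hypothetical filename; Big5 stands in for the foreign "STD-12345".
name = "中文文件.txt"

# The bytes as the remote side stored them.
raw = name.encode("big5")

# What a GB18030 locale makes of those bytes (undecodable bytes, if any,
# are replaced instead of raising an error).
seen_in_gb18030 = raw.decode("gb18030", errors="replace")

# What the bytes really mean, if you happen to know the source encoding.
seen_correctly = raw.decode("big5")

# If both sides agree on UTF-8, there is nothing to guess.
utf8_roundtrip = name.encode("utf-8").decode("utf-8")

print(repr(seen_in_gb18030))
print("GB18030 view matches:", seen_in_gb18030 == name)
print("Big5 view matches:   ", seen_correctly == name)
print("UTF-8 round trip:    ", utf8_roundtrip == name)
```

The bytes themselves are fine. What goes wrong is the guess about
which encoding they are in, and one more national standard only adds
another possible wrong guess.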

So, again, my question is: what does GB18030 give us that cannot be
had with UTF-8, or Unicode surrogates? (The current version of
Unicode is not perfect, I agree. But fortunately there are no
permanent obstacles there. And I agree that the pressure from China
on Unicode is both good and necessary.)
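
For anyone who wants to check the coverage side of this question, a
small Python sketch follows. The sample string is my own pick, and I
am relying on Python's bundled "utf-8", "gb18030" and "gb2312" codecs.
It only tests whether a mixed-script string survives a round trip in
each encoding; it says nothing about locale or font support.

```python
# Sample text is an arbitrary pick; the last character is outside the
# BMP, so UTF-16 would need a surrogate pair for it.
sample = "中文 日本語 한국어 Русский \U00020000"

def roundtrips(text, codec):
    """Return True if text survives encode + decode in the given codec."""
    try:
        return text.encode(codec).decode(codec) == text
    except UnicodeError:
        return False

codecs_to_try = ("utf-8", "gb18030", "gb2312")
results = {codec: roundtrips(sample, codec) for codec in codecs_to_try}

# Encoded sizes, only for the codecs that can hold the whole string.
sizes = {codec: len(sample.encode(codec)) for codec in codecs_to_try if results[codec]}

print(results)
print(sizes)
```

As far as these codecs go, UTF-8 and GB18030 both carry the whole
string and GB2312 does not, so for coverage alone the two are on equal
footing.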

> > Oh, man, FontSet is cool, but even cooler is UTF-8 locale and iso10646
> > fonts. FontSet can, cough, _not_ support GB2312 and Big5 in the same
> > IRC window. If XChat uses UTF-8, and we all use UTF-8, then we can
> > (people from HK, TW, CN) chat at the same time in #debian-zh. Man,
> > it's not locale here that matters. It's the distinction of characters
> > here I'm talking about.
> 
>   I use an ISO10646-1 font under the GB18030 locale, but the problem is that
> very few complete ISO10646-1 fonts are available. For a UTF-8 locale too, we
> should use a fontset instead of an incomplete ISO10646-1 font.
>   And not all of us can use a UTF-8 locale on our systems.

As far as I can tell, the _on-the-fly_re-encoding_ layer of the new
XFree86 4 has a far better chance of winning out. That means you can
(will?) use -gb2312.1980-0 + -big5-0 + -iso8859-1 + whatever as an
on-the-fly -iso10646-? font, which eliminates any need for the
FontSet. And with this technique introduced into XFree86 (IMHO,
_this_ is the state of the art of XFree86, BTW), you will be able to
use -iso10646-? fonts in a GB2312 locale too.
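
The idea can be sketched roughly in Python. This is only a toy model
under my own assumptions, not XFree86's actual code: the table
LEGACY_FONTS and the function pick_font are hypothetical names, and
Python codecs stand in for the font encodings. Each Unicode character
is handed to the first legacy font whose charset can represent it.

```python
# Hypothetical table: legacy font charset suffix -> Python codec standing
# in for that font's encoding. The order is the lookup priority.
LEGACY_FONTS = [
    ("-iso8859-1", "latin-1"),
    ("-gb2312.1980-0", "gb2312"),
    ("-big5-0", "big5"),
]

def pick_font(ch):
    """Return (font suffix, encoded bytes) for one character.

    Returns (None, b"") when no legacy font in the table covers it.
    The bytes are the codec's output, not necessarily the glyph index
    a real X font would expect.
    """
    for font, codec in LEGACY_FONTS:
        try:
            return font, ch.encode(codec)
        except UnicodeEncodeError:
            continue
    return None, b""

# Latin, simplified Chinese, traditional Chinese and Korean samples.
for ch in "a中國한":
    print(repr(ch), pick_font(ch)[0])
```

A character that none of the listed fonts covers comes back with None,
which is where the "+ whatever" part of the list would come in.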

So, with all of these questions in mind, if anyone could tell me the
benefits of using GB18030 over GB2312, UTF-8, etc., I would be very
grateful.

-- 
zhaoway


