
Re: Is there anybody interested in supporting GB18030 in debian?



<suzhe@gnuchina.org> writes:

>  GB18030 is in the same situation as UTF-8, because GB18030 covers
>  all of the code points in ISO10646. For example, you can convert a
>  BIG5 text to UCS4 and then convert to GB18030 without losing any
>  information. But you cannot make such a conversion between GB2312 and
>  Big5.

Oh, man. Suppose I convert Big5 into UTF-8 and I've got zh_CN.UTF-8:
bingo, I can read it now. Why should I convert it again to
another encoding? For fun? The same goes for GB2312 too.
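
To make the conversion claims concrete, here is a minimal sketch using
Python's built-in codecs. The helper name `reencode` and the sample
string are made up for illustration; only the codec names ("big5",
"gb2312", "gb18030", "utf-8") come with Python itself. It checks that
a Big5 text survives a trip through either UTF-8 or GB18030, and that
the same text does not fit into GB2312.

```python
def reencode(data: bytes, src: str, dst: str) -> bytes:
    """Hypothetical helper: decode bytes from one codec, encode to another."""
    return data.decode(src).encode(dst)


# Sample traditional-Chinese text; the second character has no GB2312 code.
big5_bytes = "繁體中文".encode("big5")

# Big5 -> UTF-8 -> Big5: nothing is lost.
via_utf8 = reencode(big5_bytes, "big5", "utf-8")
utf8_round_trip = reencode(via_utf8, "utf-8", "big5") == big5_bytes

# Big5 -> GB18030 -> Big5: nothing is lost either.
via_gb18030 = reencode(big5_bytes, "big5", "gb18030")
gb18030_round_trip = reencode(via_gb18030, "gb18030", "big5") == big5_bytes

# Big5 -> GB2312 fails, because GB2312 lacks the traditional form.
try:
    reencode(big5_bytes, "big5", "gb2312")
    fits_gb2312 = True
except UnicodeEncodeError:
    fits_gb2312 = False
```

So both UTF-8 and GB18030 can carry the Big5 text; the sketch only shows
that neither one is lossy here, not which one you should pick.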

All characters present in GB2312 and Big5 will be distinct in
UTF-8 and will be displayed with a glyph from an -iso10646-? 
font. Would you tell me what is still unsatisfactory?

And all the new information, either from zh_TW.UTF-8 or from
zh_CN.UTF-8, will be presented to you in good form. Why bother with
GB18030 then? If you do, you will have to do an _extra_ conversion from
zh_TW.UTF-8 or from zh_TW.Big5-compatible-new-standard to
zh_CN.GB18030. Wouldn't _this_ be tiresome?

> > Locale is not perfect here. Yes, ``ls'' and ``sh'' can understand
> > GB18030 if you've got your locale straight. But think what will happen
> > if you join an IRC channel with people from TW and CN together? The
> > man on the other side of the Internet doesn't seem to understand your
> > LC_ALL settings. 8-P
>   If the other side of the internet does not use a UTF-8 locale,
> he (she) cannot understand you either.

Chances are better that they will know UTF-8 than
GB18030. Worse, their own good government may also come out
with its own whatever FORCEFUL-STD-12345 which will, interestingly,
encode Chinese in yet another funny way because they also want to cover
ISO-10646 and more. Heh, Tower of Babel, then. Will you be happy
with this? (And that is the Tower of Babel _after_ we've got
Unicode. So, Unicode/ISO-10646 will only give us a _bigger_ Tower of
Babel. Heh..)

>   If you have lots of archives encoded in GB2312 or GBK, how much
> time will you spend to convert them into UTF-8?

That is what computers are for, I guess. First of all, converting from
GB2312 into UTF-8 loses no information. Second, this task is computer
do-able.
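
A rough sketch of what "computer do-able" means here, again with
Python's built-in codecs. The function name `gb2312_to_utf8` and the
sample string are assumptions for illustration, not anybody's actual
converter. It converts GB2312 bytes to UTF-8 and checks that converting
back yields the original bytes.

```python
def gb2312_to_utf8(data: bytes) -> bytes:
    """Hypothetical converter: re-encode GB2312 bytes as UTF-8."""
    return data.decode("gb2312").encode("utf-8")


# Stand-in for one archived message; all four characters are in GB2312.
gb_bytes = "中文测试".encode("gb2312")

utf8_bytes = gb2312_to_utf8(gb_bytes)

# Going back to GB2312 reproduces the original bytes exactly.
lossless = utf8_bytes.decode("utf-8").encode("gb2312") == gb_bytes
```

For a real archive you would run something like this over each file,
reading and writing in binary mode; GBK archives would need the "gbk"
codec instead.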

With GB18030, it seems the only benefit (I may be wrong) is that you
have no need to convert your GB2312 archive, so you save your CPU cycles
to run converters from UTF-8 and Big5 into GB18030 instead, I guess.
Fun! Fun! Fun! 8-)

And remember we will still have debian-zh-gb@... and
debian-zh-big5@...  Oh, yes, they have a big firewall, so they're not
really interested in this communication stuff... 8-)

-- 
zhaoway


