
Re: Is there anybody interested in supporting GB18030 in debian?



Yes, UTF-8/UCS4 is the ultimate solution for all of us. But why are most
of us not using it? Why are most Chinese users still using
GB2312/Big5?

The most important benefit of GB18030 is that GB2312/GBK users do not need
to convert their data/archives from GB2312/GBK to another encoding in order
to process many different languages simultaneously, because GB18030 is
backward compatible with both. And they can convert their data from GB18030
to UTF-8 without losing anything once UTF-8/UCS4 is widely used.
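
Both points can be checked with a few lines of code. This is only a minimal
sketch, assuming Python 3 and its built-in gb2312, gb18030 and utf-8 codecs;
the helper names fits() and is_lossless_via_utf8() are made up for
illustration.

```python
def fits(text: str, encoding: str) -> bool:
    """Return True if every character of text can be encoded in encoding."""
    try:
        text.encode(encoding)
    except UnicodeEncodeError:
        return False
    return True


def is_lossless_via_utf8(text: str) -> bool:
    """Return True if text survives GB18030 -> UTF-8 -> GB18030 unchanged."""
    gb = text.encode("gb18030")
    utf8 = gb.decode("gb18030").encode("utf-8")
    return utf8.decode("utf-8").encode("gb18030") == gb


# Bytes written long ago as GB2312 need no conversion to be read as GB18030.
legacy = "中文".encode("gb2312")
same_text = legacy.decode("gb18030")

# Chinese, Korean and a CJK Extension B character in one string:
# too much for GB2312, but representable in GB18030.
mixed = "中文 한국어 \U00020000"

print(same_text == "中文")
print(fits(mixed, "gb2312"), fits(mixed, "gb18030"))
print(is_lossless_via_utf8(mixed))
```

The sample strings are arbitrary; the check on one string is an illustration,
not a proof for the whole code space.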

I think converting their data from GB2312/GBK to UTF-8 will waste not only
some CPU cycles but also a lot of money and time for most Chinese users!
It is hard work indeed! We cannot expect most GB2312/GBK users to switch
to UTF-8 overnight, or even within a year!
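
For reference, the purely mechanical part of such a migration might look
like the sketch below. It assumes Python 3, plain-text archives ending in
.txt, and a hypothetical helper named convert_tree(); it covers only the
re-encoding of file contents, which is the cheap part of the work.

```python
from pathlib import Path
from typing import List


def convert_tree(src_root: Path, dst_root: Path,
                 src_encoding: str = "gb18030") -> List[Path]:
    """Re-encode every *.txt file under src_root as UTF-8 under dst_root.

    Decoding is strict, so a file with invalid bytes is skipped and
    reported instead of being silently damaged.
    Returns the list of source files that could not be decoded.
    """
    failed = []
    for src in sorted(src_root.rglob("*.txt")):
        try:
            text = src.read_bytes().decode(src_encoding)
        except UnicodeDecodeError:
            failed.append(src)
            continue
        dst = dst_root / src.relative_to(src_root)
        dst.parent.mkdir(parents=True, exist_ok=True)
        dst.write_bytes(text.encode("utf-8"))
    return failed
```

Decoding as gb18030 also accepts files that were written as GB2312, so one
setting covers both kinds of old archive in this sketch.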

Regards,
James Su

On 23 Jan 2001, zhaoway wrote:

> <suzhe@gnuchina.org> writes:
> 
> >  GB18030 is in the same situation as UTF-8, because GB18030 covers
> >  all of the code points in ISO10646. For example, you can convert a
> >  BIG5 text to UCS4 and then convert it to GB18030 without losing any
> >  information. But you cannot make such a conversion between GB2312 and
> >  Big5.
> 
> Oh, man, I convert Big5 into UTF-8, so suppose I've got zh_CN.UTF-8,
> then, Bingo! I can read it now, why should I convert it again to
> another encoding? For fun? The same is for GB2312 too.
> 
> All characters present in GB2312 and Big5 are distinct in
> UTF-8 and will be displayed with a glyph from an -iso10646-? 
> font. Would you tell me what is left unsatisfying?
> 
> And all the new information, either from zh_TW.UTF-8 or from
> zh_CN.UTF-8, will be presented to you in good form. Why bother with
> GB18030 then? If you do, you will have to do an _extra_ conversion from
> zh_TW.UTF-8 or from zh_TW.Big5-compatible-new-standard to
> zh_CN.GB18030. Wouldn't _this_ be tiresome?
> 
> > > Locale is not perfect here. Yes, ``ls'' and ``sh'' can understand
> > > GB18030 if you've got your locale straight. But think what will happen
> > > if you join an IRC channel with people from TW and CN together? The
> > > man on the other side of the Internet doesn't seem to understand your
> > > LC_ALL settings. 8-P
> >   If the other side of the internet does not use a UTF-8 locale,
> > he (she) cannot understand you either.
> 
> The better chance is that they will know UTF-8 better than
> GB18030. The worse is that their own good government will also come out
> with its own whatever FORCEFUL-STD-12345 which will, interestingly,
> encode Chinese in another funny way because they also want to cover
> ISO-10646 and more, heh, a Tower of Babel, then. Will you be happy
> with this? (And that is the Tower of Babel _after_ we've got
> Unicode. So, Unicode/ISO-10646 will only make us a _bigger_ Tower of
> Babel. Heh..)
> 
> >   If you have lots of archives encoded in GB2312 or GBK, How much
> > time will you spend to convert them into UTF-8?
> 
> That is what computers are for, I guess. First of all, when you convert
> from GB2312 into UTF-8, you will lose no information. Second, this task
> is something a computer can do.
> 
> With GB18030, it seems the only benefit (I may be wrong) is that you have
> no need to convert your GB2312 archive, so you save your CPU cycles to run
> converters from UTF-8 and Big5 into GB18030 instead, I guess. Fun! Fun!
> Fun! 8-)
> 
> And remember we will still have debian-zh-gb@... and
> debian-zh-big5@...  Oh, yes they have a big firewall, so they're not
> really interested in this communication stuff... 8-)
> 
> -- 
> zhaoway


