add UTF-8 -> GB2312 with traditional/simplified mapping



hi guys,
   
   I would like to hear your opinion on adding a mapping from
traditional characters in UTF-8 to simplified characters in GB2312 to
iconv. It seems that currently the UTF-8 -> GB2312 conversion ignores
code points for traditional characters that exist only in simplified
form in GB2312. This is natural, since Unicode encodes the form of a
character, not its literal meaning. It might not be 'strictly correct'
to convert a traditional Chinese character in Unicode to its simplified
counterpart in GB2312. However, without this kind of mapping, users
lose characters when we do a big5->utf-8->gb2312 conversion.

    Since more and more applications are starting to use Unicode (in
whatever encoding) as their internal representation and iconv for the
I/O conversion, the big5->utf-8->gb2312 route happens more and more often.
If the traditional characters in big5 can be mapped to their simplified
versions at the utf-8->gb2312 stage, it will make the end-user experience
more pleasant.
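
    To make the idea concrete, here is a minimal Python sketch of the
proposed fallback. It is not iconv code: it uses Python's built-in big5
and gb2312 codecs, the table TRAD_TO_SIMP and the function encode_gb2312
are hypothetical names made up for illustration, and unmappable
characters are shown as '?' only to make the loss visible. A real table
would need thousands of entries and would have to deal with traditional
characters that have more than one simplified candidate.

```python
# Hypothetical, tiny traditional -> simplified table (illustration only).
TRAD_TO_SIMP = {"國": "国", "語": "语", "漢": "汉"}


def encode_gb2312(text: str, fallback: bool) -> bytes:
    """Encode text as GB2312, optionally falling back to simplified forms."""
    out = bytearray()
    for ch in text:
        try:
            # Characters that GB2312 already contains pass through unchanged.
            out += ch.encode("gb2312")
        except UnicodeEncodeError:
            # Only characters missing from GB2312 are looked up in the table.
            simp = TRAD_TO_SIMP.get(ch) if fallback else None
            # No mapping available: emit '?' to mark the lost character.
            out += simp.encode("gb2312") if simp else b"?"
    return bytes(out)


# The big5 -> Unicode -> gb2312 route described above.
big5_bytes = "漢語".encode("big5")
text = big5_bytes.decode("big5")

lossy = encode_gb2312(text, fallback=False)   # both characters are lost
mapped = encode_gb2312(text, fallback=True)   # simplified forms survive
```

Because the table is consulted only after a direct GB2312 encoding
fails, text that already converts cleanly is left untouched.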

    The problem will disappear when we switch to GB18030. But there
will still be people using GB2312 for a long time.

    What do you think? Is it a good idea, and what disadvantages
would it bring us?

-- 
Best regards
hashao

-- 
| This message was re-posted from debian-chinese-gb@lists.debian.org
| and converted from gb2312 to big5 by an automatic gateway.


