add UTF-8 -> GB2312 with traditional/simplified mapping
Hi guys,
I would like to hear your opinions on adding a conversion from
traditional characters in UTF-8 to simplified characters in GB2312 to
iconv. Currently, it seems that the UTF-8 -> GB2312 conversion ignores
code points whose characters exist in GB2312 only in simplified form.
This is natural, since Unicode assigns code points to the written forms,
not to the literal meaning of the characters, so a traditional character
and its simplified counterpart are distinct code points. It might not be
'strictly correct' to convert a traditional Chinese character in Unicode
to its simplified counterpart in GB2312. However, without this kind of
mapping, users lose characters whenever text goes through a
big5->utf-8->gb2312 conversion.
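To make the loss concrete, here is a minimal sketch in Python. It uses
Python's built-in codecs rather than iconv itself, so it only approximates
what iconv does when asked to skip unconvertible characters; the function
name big5_to_gb2312_lossy is made up for this illustration.

```python
def big5_to_gb2312_lossy(data: bytes) -> bytes:
    """Convert big5 -> utf-8 -> gb2312, silently dropping what GB2312 lacks."""
    utf8 = data.decode("big5").encode("utf-8")    # big5 -> utf-8
    text = utf8.decode("utf-8")
    # errors="ignore" drops every character that has no GB2312 code,
    # which is the behaviour described above (an approximation of iconv).
    return text.encode("gb2312", errors="ignore")  # utf-8 -> gb2312


original = "中國"                       # second character is traditional
converted = big5_to_gb2312_lossy(original.encode("big5"))
print(converted.decode("gb2312"))       # the traditional character is gone
```

Only the first character survives, because the traditional form of the
second one has no code in GB2312 even though its simplified form does.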
Since more and more applications use Unicode (in whatever encoding) as
their internal representation and use iconv for I/O conversion, the
big5->utf-8->gb2312 route occurs more and more often. If the traditional
characters from big5 could be mapped to their simplified versions at the
utf-8->gb2312 stage, the end-user experience would be more pleasant.
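The proposed behaviour could look roughly like the sketch below, again in
Python rather than in iconv's own code. The table TRAD_TO_SIMP and the
function encode_gb2312_with_fallback are hypothetical names, and the three
entries are only examples; a real table would need to be far larger.

```python
# Hypothetical, deliberately tiny traditional -> simplified table.
TRAD_TO_SIMP = {"國": "国", "學": "学", "體": "体"}


def encode_gb2312_with_fallback(text: str, table=TRAD_TO_SIMP) -> bytes:
    """Encode to GB2312, falling back to the simplified form when needed."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode("gb2312")          # character exists in GB2312
        except UnicodeEncodeError:
            simplified = table.get(ch)
            if simplified is None:
                out += b"?"                     # no mapping known: mark the loss
            else:
                out += simplified.encode("gb2312")
    return bytes(out)


print(encode_gb2312_with_fallback("中國").decode("gb2312"))
```

Characters that GB2312 can already represent pass through unchanged, so
the fallback only affects text that would otherwise be lost.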
The problem will disappear when we switch to GB18030, but people will
still be using GB2312 for a long time.
What do you think? Is it a good idea, and what disadvantages would it
bring?
--
Best regards,
hashao
--
| This message was re-posted from debian-chinese-gb@lists.debian.org
| and converted from gb2312 to big5 by an automatic gateway.