
Bug#471021: locales: EastAsianAmbiguous character width is always 1 in UTF-8



On Sun, Jan 11, 2009 at 11:17:48AM +0900, Masanori Goto wrote:
> wcwidth() is a legacy function, so it cannot handle wide, RTL and
> combining characters correctly.  An environment variable to select its
> behavior is one way, but it's just a hack and it's hard to specify in libc.
> 
> So, according to the UAX#11 definition, we should return
> 1 for EastAsianAmbiguous characters unless a rigid signal
> (like "language tag, script identification, associated font, source of data")
> is available in UTF-8.  Certainly we could introduce this kind of change
> for SJIS/EUC-JP, but it's hard to decide for ja_JP.UTF-8.
> 
> Overall, we have no way to extend wcwidth() correctly,
> so I think each application should handle the actual font size of characters
> instead of using wcwidth().

Thank you for your explanation.

I understand that wcwidth() cannot be extended
and that each application should be modified instead.

But at present each application implements its own approach:
for example, a home-grown function, or one of the various versions of
Markus Kuhn's wcwidth.
As a layman's idea, could libc offer a common method for this?
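
To make the question concrete, here is a rough sketch of the kind of helper
applications currently write for themselves. It is only an illustration, not
an existing libc interface: the names is_ambiguous() and
ambiguous_aware_width() are hypothetical, and the range table is a small
hand-picked subset of the characters marked "A" in UAX#11's
EastAsianWidth.txt, not the full list. The caller passes in the width it
already got from wcwidth() (or any replacement) and a flag saying whether
ambiguous characters should be treated as wide.

```c
#include <stddef.h>
#include <stdint.h>

/* Inclusive range of Unicode code points. */
struct cp_range { uint32_t first, last; };

/*
 * ILLUSTRATIVE SUBSET ONLY (assumption): a few ranges that UAX#11 classifies
 * as East Asian Ambiguous. A real table would be generated from
 * EastAsianWidth.txt. Must stay sorted and non-overlapping for the search.
 */
static const struct cp_range ambiguous_subset[] = {
    { 0x00A7, 0x00A8 },   /* section sign, diaeresis */
    { 0x00B0, 0x00B4 },   /* degree sign .. acute accent */
    { 0x0391, 0x03A1 },   /* Greek capital letters */
    { 0x03A3, 0x03A9 },
    { 0x03B1, 0x03C1 },   /* Greek small letters */
    { 0x03C3, 0x03C9 },
    { 0x0410, 0x044F },   /* Cyrillic letters */
    { 0x2460, 0x24E9 },   /* enclosed alphanumerics */
    { 0x2500, 0x254B },   /* box drawing */
};

/* Binary search: returns 1 if cp falls in the table above, else 0. */
int is_ambiguous(uint32_t cp)
{
    size_t lo = 0;
    size_t hi = sizeof ambiguous_subset / sizeof ambiguous_subset[0];

    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (cp < ambiguous_subset[mid].first)
            hi = mid;
        else if (cp > ambiguous_subset[mid].last)
            lo = mid + 1;
        else
            return 1;
    }
    return 0;
}

/*
 * base_width:     what the caller's existing wcwidth() returned for cp.
 * ambiguous_wide: non-zero if the application has decided (from its font,
 *                 locale, or a user setting) that ambiguous characters
 *                 occupy two columns.
 * Only characters reported as width 1 are ever widened; everything else,
 * including -1 for non-printable characters, is passed through unchanged.
 */
int ambiguous_aware_width(uint32_t cp, int base_width, int ambiguous_wide)
{
    if (base_width == 1 && ambiguous_wide && is_ambiguous(cp))
        return 2;
    return base_width;
}
```

An application would call it as ambiguous_aware_width(wc, wcwidth(wc), flag),
assuming wchar_t holds Unicode code points as it does with glibc. The decision
of how to set the flag is exactly the part that is hard to specify in libc.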

Regards,
	dai
-- 


