
Bug#471021: locales: EastAsianAmbiguous character width is always 1 in UTF-8



It would be great if you could propose a good alternative way to do so.

2009/1/11  <d+deb@vdr.jp>:
> On Sun, Jan 11, 2009 at 11:17:48AM +0900, Masanori Goto wrote:
>> wcwidth() is a legacy function, so it cannot handle wide, RTL and
>> combining characters correctly.  An environment variable to select its
>> behavior is one way, but it's just a hack and it's hard to specify in libc.
>>
>> So, according to the UAX#11 definition, we should return
>> 1 for EastAsianAmbiguous characters unless a rigid signal
>> (like "language tag, script identification, associated font, source of data")
>> is available in UTF-8.  We can certainly introduce that kind of change
>> for SJIS/EUC-JP, but it's hard to decide for ja_JP.UTF-8.
>>
>> Overall, we have no way to extend wcwidth() correctly,
>> so I think each application should handle the actual font size of characters
>> itself instead of using wcwidth().
>
> Thank you for your explanation.
>
> I understand that wcwidth() cannot be extended
> and that each application should be modified.
>
> But each application currently implements its own approach:
> for example, its own code, or various versions of Markus Kuhn's wcwidth.
> As a layman's idea, could libc offer a common method for it?
>
> Regards,
>        dai
> --
>
>
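To make the application-side approach discussed above concrete, here is a
minimal sketch in C. It is only an illustration under stated assumptions,
not a proposal for libc: the names app_wcwidth() and is_ambiguous_subset()
and the ambiguous_wide flag are hypothetical, and the table is a small
hand-picked subset of the UAX#11 Ambiguous class, not the complete list.
It also assumes wchar_t holds Unicode code points, as it does with glibc.

```c
#define _XOPEN_SOURCE 700   /* exposes wcwidth() in <wchar.h> */
#include <stddef.h>
#include <wchar.h>

struct interval { wchar_t first, last; };

/* Illustrative subset ONLY (assumption: a real application would
 * generate the full table from the Unicode EastAsianWidth data). */
static const struct interval ambiguous_subset[] = {
    { 0x00A1, 0x00A1 },   /* inverted exclamation mark */
    { 0x00B1, 0x00B1 },   /* plus-minus sign */
    { 0x00D7, 0x00D7 },   /* multiplication sign */
    { 0x0391, 0x03A1 },   /* Greek capital letters (part) */
    { 0x03B1, 0x03C1 },   /* Greek small letters (part) */
    { 0x0410, 0x044F },   /* basic Cyrillic letters */
    { 0x2460, 0x24E9 },   /* enclosed alphanumerics */
    { 0x2500, 0x254B },   /* box drawing (part) */
};

/* Return 1 if wc falls in the illustrative ambiguous table, else 0. */
static int is_ambiguous_subset(wchar_t wc)
{
    size_t n = sizeof ambiguous_subset / sizeof ambiguous_subset[0];
    for (size_t i = 0; i < n; i++) {
        if (wc >= ambiguous_subset[i].first && wc <= ambiguous_subset[i].last)
            return 1;
    }
    return 0;
}

/* Hypothetical application-level width function.
 * ambiguous_wide is decided by the application itself (for example from
 * its own configuration or the font it uses), not by libc.
 * Everything outside the table falls back to the libc wcwidth(). */
static int app_wcwidth(wchar_t wc, int ambiguous_wide)
{
    if (is_ambiguous_subset(wc))
        return ambiguous_wide ? 2 : 1;
    return wcwidth(wc);
}
```

An application drawing with a CJK font would call app_wcwidth(wc, 1), and
one drawing with a Western font would call app_wcwidth(wc, 0); the decision
stays in the application, where the font is actually known.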


