Bug#471021: locales: EastAsianAmbiguous character width is always 1 in UTF-8

To: d+deb@vdr.jp, 471021@bugs.debian.org
Subject: Bug#471021: locales: EastAsianAmbiguous character width is always 1 in UTF-8
From: GOTO Masanori <gotom@sanori.org>
Date: Fri, 09 Jan 2009 01:56:20 +0900
Message-id: <814p09ag97.wl%gotom@sanori.org>
Reply-to: GOTO Masanori <gotom@sanori.org>, 471021@bugs.debian.org

I don't agree with the concept of "UTF-8-CJK" because it is
exaggerated.  Is this a locale-dependent issue or a character-encoding
issue?

According to UAX#11, your point doesn't hold, because the passage you
reference only discusses character mapping.  Instead, the "When
processing or displaying data" section says:

"Ambiguous characters behave like wide or narrow characters depending
on the context (language tag, script identification, associated font,
source of data, or explicit markup; all can provide the context). If
the context cannot be established reliably, they should be treated as
narrow characters by default."
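
To make that default concrete, here is a rough Python sketch of the
rule as I read it. It is only an illustration, not how glibc's
wcwidth() is implemented: the function name char_width and its
ambiguous_wide parameter are made up for this example, and combining
and control characters are ignored for brevity.

```python
import unicodedata


def char_width(ch, ambiguous_wide=False):
    """Column width of one character under a simplified UAX#11 reading.

    ambiguous_wide is a hypothetical flag standing in for "the context
    says East Asian"; when it is False, Ambiguous characters fall back
    to narrow, as UAX#11 recommends.
    """
    eaw = unicodedata.east_asian_width(ch)
    if eaw in ("W", "F"):      # Wide and Fullwidth always take 2 columns
        return 2
    if eaw == "A":             # Ambiguous: decided by context
        return 2 if ambiguous_wide else 1
    return 1                   # Na, H, N: treated as narrow here
```

With no context, char_width("\u03b1") (GREEK SMALL LETTER ALPHA, an
Ambiguous character) gives 1; with ambiguous_wide=True it gives 2.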

If all legacy applications call wcwidth() on the assumption that
ambiguous characters have width 2, it would be OK to introduce your
idea, but I'm not sure whether that is true.

A font-rendering application should basically take the width of the
font's glyphs into account.  Why doesn't rxvt consider the rendered
font width?  Or should we introduce a special environment variable or
locale tag to decide what wcwidth() returns for ambiguous characters?
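
For discussion, the environment-variable option could look roughly
like the Python sketch below. The variable name EAW_AMBIGUOUS_WIDTH is
purely hypothetical (no such variable exists in glibc or rxvt as far
as I know), and the width logic is simplified: it ignores combining
and control characters.

```python
import os
import unicodedata


def ambiguous_width(environ=os.environ):
    # EAW_AMBIGUOUS_WIDTH is a made-up name for illustration only.
    # Anything other than "2" keeps the UAX#11 default of narrow.
    return 2 if environ.get("EAW_AMBIGUOUS_WIDTH") == "2" else 1


def string_width(text, environ=os.environ):
    """Sum of column widths, with Ambiguous characters set by the env."""
    amb = ambiguous_width(environ)
    total = 0
    for ch in text:
        eaw = unicodedata.east_asian_width(ch)
        if eaw in ("W", "F"):
            total += 2
        elif eaw == "A":
            total += amb
        else:
            total += 1
    return total
```

Passing a plain dict as environ makes it easy to compare both
settings for the same string without touching the real environment.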
