Re: Make Unicode bugs release critical?

To: debian-devel@lists.debian.org
Subject: Re: Make Unicode bugs release critical?
From: Vincent Lefevre <vincent@vinc17.net>
Date: Fri, 11 Feb 2011 15:35:11 +0100
Message-id: <20110211143511.GJ15920@prunille.vinc17.org>
Mail-followup-to: debian-devel@lists.debian.org
In-reply-to: <20110211140202.GB2053@angband.pl>
References: <1297375750-sup-7355@gillespie.rupamsunyata.org> <20110211000216.GG8747@onerussian.com> <20110211084733.GA30787@angband.pl> <1297414335.13596.67.camel@meh> <4D54FBEF.7060107@debian.org> <1297417074.3105.6.camel@havelock.lan> <20110211101442.GA29817@pharaoh.inf.upol.cz> <20110211103349.GG3548@belkar.wrar.name> <20110211133024.GI15920@prunille.vinc17.org> <20110211140202.GB2053@angband.pl>

On 2011-02-11 15:02:02 +0100, Adam Borowski wrote:
> On Fri, Feb 11, 2011 at 02:30:24PM +0100, Vincent Lefevre wrote:
> > On 2011-02-11 15:33:49 +0500, Andrey Rahmatullin wrote:
> > > On Fri, Feb 11, 2011 at 11:14:42AM +0100, Miroslav Kure wrote:
> > > > > However, I'm curious: is there a lot of software that is broken with
> > > > > Unicode, particularly with the UTF-8 encoding? I can't remember anything
> > > > > much in recent times.
> > 
> > "less" has problems with new Unicode characters (bug 597918).
> 
> Unicode 6.0 came out in October 2010,

The character mentioned in my bug report (U+1E9F LATIN SMALL LETTER DELTA)
appeared in Unicode 5.1.0 (March 2008).

> well after Squeeze's freeze, so you can't expect support for new
> characters already.

Well, March 2008 was more than 1 year before Squeeze's freeze.

> They are in no fonts shipped with squeeze, so not recognizing the
> characters as valid is not a big problem.

Fonts containing the character in question are shipped with Squeeze:
the character appears correctly in xterm.

> Less shouldn't maintain a private copy of character properties if
> all that data is already present in libc

I agree.

> -- but guess what, wcwidth(0x1F4A9) and iswprint() don't know them
> either.

No problems with U+1E9F:

Property alnum : yes
Property alpha : yes
Property cntrl : no
Property digit : no
Property graph : yes
Property lower : yes
Property print : yes
Property punct : no
Property space : no
Property upper : no
Property xdigit: no
wcwidth = 1

So, if "less" were using libc, it wouldn't have any problem with
this character.
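
For reference, a listing of this kind can be produced with a few lines
of C on top of the standard wctype()/iswctype() and wcwidth()
interfaces. The following is only a sketch, not necessarily the program
used for the output above: the helper names has_property and
print_properties are made up for illustration, and the results depend
on the locale in effect.

```c
/* wcwidth() is an XSI function, so ask for it before any system header. */
#ifndef _XOPEN_SOURCE
#define _XOPEN_SOURCE 700
#endif

#include <stdio.h>
#include <wchar.h>
#include <wctype.h>

/* Fallback prototype in case the feature-test macro above came too late
 * (for example when this code is pasted after other includes). */
int wcwidth(wchar_t wc);

/* The twelve standard classes minus "blank", in the order listed above. */
static const char *const prop_names[] = {
    "alnum", "alpha", "cntrl", "digit", "graph", "lower",
    "print", "punct", "space", "upper", "xdigit"
};

/* Hypothetical helper: returns 1 if wc belongs to the named class in the
 * current locale, 0 otherwise (including when the class name is unknown). */
int has_property(wint_t wc, const char *name)
{
    wctype_t t = wctype(name);
    return t != (wctype_t)0 && iswctype(wc, t) != 0;
}

/* Hypothetical helper: prints one line per class, then the column width
 * reported by libc. */
void print_properties(wchar_t wc)
{
    size_t i;

    for (i = 0; i < sizeof prop_names / sizeof prop_names[0]; i++)
        printf("Property %-6s: %s\n", prop_names[i],
               has_property((wint_t)wc, prop_names[i]) ? "yes" : "no");
    printf("wcwidth = %d\n", wcwidth(wc));
}
```

Called from a main() that first does setlocale(LC_ALL, "") (declared in
<locale.h>) under a UTF-8 locale, print_properties(0x1E9F) prints one
line per class followed by the width. Without the setlocale() call the
program stays in the "C" locale, where non-ASCII characters are
generally not classified.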

-- 
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / Arénaire project (LIP, ENS-Lyon)
