
Re: Bits from the Release Team - Kicking off Wheezy



On Wed, Mar 30, 2011 at 04:39:30PM -0500, Peter Samuelson wrote:
> 
> [Roger Leigh]
> >   As a followup, I would like to get the UTF-8 codeset and collation
> >   hardcoded in libc6 directly and sharable by all UTF-8 locales to
> >   reduce startup time and needless duplication
> 
> Collation is not just a function of character set, it's quite
> locale-dependent.  Not sure if the character class tables (<ctype.h>
> functions, and the [:foo:] posix regex classes) could be shared across
> UTF-8 locales.  I rather suspect not.

Maybe I'm just thinking of ctype.  I thought that (possibly due to
having __STDC_ISO_10646__) the character classes were identical across
all locales.  Collation is probably different.

> When you take out collation and possibly character classes, I'm not
> sure whether there's anything in the UTF-8 locales left to hardcode
> into libc.

There's the actual charmap (localedata/charmaps/UTF-8), which is
big and well worth sharing between locales irrespective of
hardcoding.  Looking at it again, I only see the C ctype hardcoded,
not the charmap, so maybe the charmap is autogenerated or not even
hardcoded (since it's a 1:1 ASCII:UCS mapping for C).  It would be
easier to grok what's going on if it weren't so hideously complex
and undocumented!
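
To illustrate why the charmap looks like a candidate for sharing or
generating rather than storing per locale: the UTF-8 byte sequence is
a pure function of the UCS code point, so the mapping itself carries
no locale-specific information.  The sketch below is only my
illustration, not how glibc does it.  The names utf8_bytes and
charmap_entry are hypothetical, and the output line only loosely
imitates the symbol/encoding columns of a charmap entry (the
character-name column is left out, and the 8-digit symbol form above
U+FFFF is an assumption).

```python
def utf8_bytes(code_point: int) -> bytes:
    """Encode one UCS code point as UTF-8 using bit arithmetic only."""
    if not 0 <= code_point <= 0x10FFFF:
        raise ValueError("code point outside the UCS range")
    if 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("surrogates have no UTF-8 encoding")
    if code_point < 0x80:
        # ASCII range: one byte, identical to the code point.
        return bytes([code_point])
    if code_point < 0x800:
        # Two bytes: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Three bytes: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # Four bytes: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


def charmap_entry(code_point: int) -> str:
    """Hypothetical helper: a line loosely in charmap style, name omitted."""
    # Assumption: 4 hex digits up to U+FFFF, 8 hex digits above that.
    width = 4 if code_point <= 0xFFFF else 8
    symbol = f"<U{code_point:0{width}X}>"
    encoded = "".join(f"/x{b:02x}" for b in utf8_bytes(code_point))
    return f"{symbol} {encoded}"


if __name__ == "__main__":
    for cp in (0x41, 0xE9, 0x20AC, 0x1F600):
        print(charmap_entry(cp))
```

Nothing in either function consults a locale, which is the point: any
table built this way would be the same for every UTF-8 locale.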


Regards,
Roger

-- 
  .''`.  Roger Leigh
 : :' :  Debian GNU/Linux             http://people.debian.org/~rleigh/
 `. `'   Printing on GNU/Linux?       http://gutenprint.sourceforge.net/
   `-    GPG Public Key: 0x25BFB848   Please GPG sign your mail.
