
Bug#432563: locales: norwegian locale has started treating aa as å in regex



Package: locales
Version: 2.5-9
Severity: normal
Tags: l10n

After upgrading exim4 to 4.67-5, I saw strange regex behavior in sed.
It seems that aa is now treated as å in character class matching, so
the negated bracket expression below replaces aa with a single _:
$ locale
LANG=nb_NO.UTF-8
LANGUAGE=en_US:en_GB:en
LC_CTYPE="nb_NO.UTF-8"
LC_NUMERIC=en_US.UTF-8
LC_TIME="nb_NO.UTF-8"
LC_COLLATE="nb_NO.UTF-8"
LC_MONETARY="nb_NO.UTF-8"
LC_MESSAGES=en_US.UTF-8
LC_PAPER="nb_NO.UTF-8"
LC_NAME="nb_NO.UTF-8"
LC_ADDRESS="nb_NO.UTF-8"
LC_TELEPHONE="nb_NO.UTF-8"
LC_MEASUREMENT="nb_NO.UTF-8"
LC_IDENTIFICATION="nb_NO.UTF-8"
LC_ALL=
$ echo "petrus.haavard.name" | sed 's/[^-0-9a-zA-Z\/\.!*@_~:;< ]/_/g'
petrus.h_vard.name
$ export LC_ALL=C
$ echo "petrus.haavard.name" | sed 's/[^-0-9a-zA-Z\/\.!*@_~:;< ]/_/g'
petrus.haavard.name

Treating aa as å is correct when sorting, but applying the same rule to
character class matching in a regex seems wrong.
See also bug 430391.
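
As a possible workaround until the locale is fixed, the C locale can be
forced for the single sed invocation instead of exporting LC_ALL for the
whole session. This is only a sketch: the function name sanitize is made
up for illustration, and the regex is the same one as in the transcript
above.

```shell
#!/bin/sh
# Hypothetical helper: replace every character outside the allowed set
# with "_". LC_ALL=C is set for this one sed process only, so ranges such
# as a-z are matched byte by byte and not through nb_NO collation rules.
sanitize() {
    printf '%s\n' "$1" | LC_ALL=C sed 's/[^-0-9a-zA-Z\/\.!*@_~:;< ]/_/g'
}

sanitize "petrus.haavard.name"
```

With the C locale forced this way, the aa in the host name should pass
through unchanged regardless of what LANG and LC_COLLATE are set to.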

-- System Information:
Debian Release: lenny/sid
  APT prefers testing
  APT policy: (990, 'testing'), (500, 'stable'), (200, 'unstable')
Architecture: i386 (i686)

Kernel: Linux 2.6.21-2-k7 (SMP w/1 CPU core)
Locale: LANG=nb_NO.UTF-8, LC_CTYPE=nb_NO.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages locales depends on:
ii  debconf [debconf-2.0]         1.5.13     Debian configuration management sy
ii  libc6 [glibc-2.5-1]           2.5-9+b1   GNU C Library: Shared libraries

locales recommends no packages.

-- debconf information:
* locales/default_environment_locale: nb_NO.UTF-8
* locales/locales_to_be_generated: en_US ISO-8859-1, en_US.UTF-8 UTF-8, nb_NO ISO-8859-1, nb_NO.UTF-8 UTF-8, no_NO ISO-8859-1, no_NO.UTF-8 UTF-8



