
Bug#415961: locales: Sorting with pt_BR ignoring spaces - it shouldn't



Package: locales
Version: 2.3.6.ds1-13
Severity: important
Tags: patch l10n


When sorting data, the pt_BR collation order ignores spaces, which makes it very annoying to use with a database such as PostgreSQL.

An example follows:

$ cat lista.txt # A list of random names
Adriano José
Adriana da Silva
Adrian Kuerten

The strange behavior:

$ cat lista.txt | sort
Adriana da Silva
Adrian Kuerten
Adriano José
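
For comparison, here is a minimal sketch of the same sort with byte-order collation forced through LC_ALL=C. It is not part of the proposed fix; it only shows the order I would expect once the space is no longer ignored. The file name and its contents are recreated from the listing above, so treat them as an assumption about the test data.

```shell
# Recreate the sample list from this report (file name assumed to be lista.txt).
printf '%s\n' 'Adriano José' 'Adriana da Silva' 'Adrian Kuerten' > lista.txt

# LC_ALL=C makes sort compare raw bytes, so the space (0x20) sorts
# before every letter and "Adrian Kuerten" comes first.
sorted=$(LC_ALL=C sort lista.txt)
printf '%s\n' "$sorted"
```

Byte order is only a reference point here: it handles the space as expected, but it does not collate accented letters the way a Portuguese user would want.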

After changing the LC_COLLATE section of /usr/share/i18n/locales/pt_BR to:

LC_COLLATE
copy "iso14651_t1"
reorder-after <U00A0>
<U0020><CAP>;<CAP>;<CAP>;<U0020>
reorder-end
END LC_COLLATE

I get the correct behavior:

$ cat lista.txt | sort
Adrian Kuerten
Adriana da Silva
Adriano José

There is a related report at http://sourceware.org/bugzilla/show_bug.cgi?id=3405, but the pt_BR file proposed there does not work well with characters such as 'a', 'á', 'ã', and so on.

I think this could be a problem for other languages too, but I am not sure.

-- System Information:
Debian Release: 4.0
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-4-686
Locale: LANG=pt_BR, LC_CTYPE=pt_BR (charmap=ISO-8859-1)

Versions of packages locales depends on:
ii  debconf [debconf-2.0]       1.5.11       Debian configuration management sy
ii  libc6 [glibc-2.3.6.ds1-1]   2.3.6.ds1-13 GNU C Library: Shared libraries

locales recommends no packages.

-- debconf information:
* locales/default_environment_locale: pt_BR.UTF-8
* locales/locales_to_be_generated: pt_BR ISO-8859-1, pt_BR.UTF-8 UTF-8


