Bug#357390: Collating is broken in Czech locales
Package: locales
Version: 2.3.6-3
Severity: normal
Collation is seriously broken under the Czech locales: calls to strcoll(3)
return wrong results. The following simple testcase compares a string with
itself, so strcoll() should always return 0:
----------------------------------------------------------------------
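#include <string.h>
#include <stdio.h>
#include <locale.h>

const char foo[] = "filename.ext";

int main(void)
{
    int r;

    /* Pick up the locale from the environment (LANG, LC_COLLATE, ...). */
    setlocale(LC_ALL, "");
    /* A string compared with itself must collate as equal, i.e. r == 0. */
    r = strcoll(foo, foo);
    printf("Comparing string %s to %s returned %d\n", foo, foo, r);
    /* Exit status 0 on success, 1 if strcoll() reported a difference. */
    return !!r;
}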
----------------------------------------------------------------------
Here are the results:
michich@hammerfall:~/c$ ./strcoll-test-simple
Comparing string filename.ext to filename.ext returned -10
michich@hammerfall:~/c$ LC_COLLATE=cs_CZ.ISO-8859-2 ./strcoll-test-simple
Comparing string filename.ext to filename.ext returned -16
michich@hammerfall:~/c$ LC_COLLATE=en_US.UTF-8 ./strcoll-test-simple
Comparing string filename.ext to filename.ext returned 0
michich@hammerfall:~/c$ LC_COLLATE=C ./strcoll-test-simple
Comparing string filename.ext to filename.ext returned 0
michich@hammerfall:~/c$
As you can see, I get a wrong result with either of the two Czech locales
(the first run uses my default locale, cs_CZ.UTF-8). Other locales work fine.
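To run the same check against several locales from one process, something
like the following sketch might help. The helper names self_collate() and
report_self_collate() are made up for illustration and are not part of glibc
or of the testcase above; only setlocale(3) and strcoll(3) are real library
calls. It assumes the locales of interest have already been generated.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper (not part of glibc or of the original testcase):
 * switch LC_COLLATE to `locale_name`, compare `s` with itself, then
 * restore the previous collation locale.
 * Returns 0 and stores the strcoll() result in *result on success,
 * or -1 if the locale cannot be selected (e.g. it was never generated).
 */
int self_collate(const char *locale_name, const char *s, int *result)
{
    char saved[256];
    const char *current = setlocale(LC_COLLATE, NULL);

    /* Copy the name: later setlocale() calls may overwrite the buffer. */
    snprintf(saved, sizeof saved, "%s", current ? current : "C");

    if (setlocale(LC_COLLATE, locale_name) == NULL)
        return -1;

    *result = strcoll(s, s);
    setlocale(LC_COLLATE, saved);
    return 0;
}

/* Print one line per locale so that broken ones stand out. */
void report_self_collate(const char *const *locales, size_t count,
                         const char *s)
{
    for (size_t i = 0; i < count; i++) {
        int r;

        if (self_collate(locales[i], s, &r) != 0)
            printf("%-20s not available\n", locales[i]);
        else
            printf("%-20s strcoll(s, s) = %d%s\n", locales[i], r,
                   r == 0 ? "" : "  <-- should be 0");
    }
}
```

Calling report_self_collate() with an array such as { "C", "en_US.UTF-8",
"cs_CZ.UTF-8", "cs_CZ.ISO-8859-2" } and the string "filename.ext" would print
one line per locale; any non-zero value points at a locale affected by this
bug.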
I first noticed the bug with a recursive diff of two directories.
Recursive diff uses strcoll to sort directory entries, so the inconsistent
comparison results leave it seriously confused.
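To illustrate why a misbehaving strcoll() matters for sorting, here is a
rough sketch of a strcoll-based qsort(3) comparator together with a sanity
check on the comparison results. This is not diff's actual code; the names
compare_names(), sort_names() and collation_is_consistent() are invented for
this example. A comparator that reports a string as different from itself, or
that is not antisymmetric, violates what qsort() expects, and the resulting
order is then unspecified.

```c
#include <stdlib.h>
#include <string.h>

/* qsort() comparator ordering C strings by the current LC_COLLATE rules. */
int compare_names(const void *a, const void *b)
{
    const char *const *pa = a;
    const char *const *pb = b;

    return strcoll(*pa, *pb);
}

/* Reduce a comparison result to -1, 0 or 1. */
static int sign(int v)
{
    return (v > 0) - (v < 0);
}

/*
 * Hypothetical sanity check: returns 1 if strcoll() is reflexive and
 * antisymmetric over every pair in `names`, 0 otherwise.
 */
int collation_is_consistent(const char *const *names, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        if (strcoll(names[i], names[i]) != 0)
            return 0;
        for (size_t j = i + 1; j < count; j++) {
            if (sign(strcoll(names[i], names[j])) !=
                -sign(strcoll(names[j], names[i])))
                return 0;
        }
    }
    return 1;
}

/* Sort an array of name pointers in place, as a directory walker might. */
void sort_names(const char **names, size_t count)
{
    qsort(names, count, sizeof names[0], compare_names);
}
```

Running collation_is_consistent() over a handful of file names after
setlocale(LC_COLLATE, "cs_CZ.UTF-8") should return 0 on an affected system,
given that the reflexivity check alone already fails in the testcase above.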
I think I'm also seeing another symptom of this bug: inkscape takes
ages to start and eats lots of memory when run with LC_COLLATE set to
"cs_CZ" or "cs_CZ.UTF-8". This doesn't happen with any other locale I've
tried.
Michal
-- System Information:
Debian Release: testing/unstable
APT prefers unstable
APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Shell: /bin/sh linked to /bin/dash
Kernel: Linux 2.6.16-rc5
Locale: LANG=cs_CZ.UTF-8, LC_CTYPE=cs_CZ.UTF-8 (charmap=UTF-8)
Versions of packages locales depends on:
ii debconf [debconf-2.0] 1.4.72 Debian configuration management sy
ii libc6 [glibc-2.3.6-2] 2.3.6-3 GNU C Library: Shared libraries an
locales recommends no packages.
-- debconf information:
* locales/default_environment_locale: cs_CZ.UTF-8
* locales/locales_to_be_generated: cs_CZ ISO-8859-2, cs_CZ.UTF-8 UTF-8, da_DK.UTF-8 UTF-8, en_US ISO-8859-1, en_US.UTF-8 UTF-8, tr_TR.UTF-8 UTF-8