
Re: error with locales in glibc 2.2



Tomas Berndtsson <tomas@nocrew.org> writes:

> bugs.debian.org seems to be down, so I couldn't see if this has been
> reported as a bug yet. Locales in glibc 2.2 doesn't seem to work
> properly. I have this small test program:

Ok, now that I got this problem solved, on to the next problem with
locales. I now have this test program:

------
#include <stdio.h>
#include <string.h>
#include <locale.h>

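int main(int argc, char *argv[])
{
  int i;
  char *locale_set;

  /* Both strings are required; argv[1] and argv[2] are used below */
  if (argc != 3) {
    fprintf(stderr, "usage: %s string1 string2\n", argv[0]);
    return 1;
  }

  /* Take the collation locale from the environment (LC_ALL, LC_COLLATE, LANG) */
  locale_set = setlocale(LC_COLLATE, "");
  printf("locale set: %s\n", locale_set ? locale_set : "(setlocale failed)");

  /* strcoll() compares according to the LC_COLLATE locale set above */
  i = strcoll(argv[1], argv[2]);
  printf("%d\n", i);

  return 0;
}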
------


It compares two strings and prints the value of the comparison.

Now, study this:

tomas@penne:~/src$ ./localetest "ab, c" "a, bc"
locale set: C
54
tomas@penne:~/src$ ./localetest "ab, c" "a, c"
locale set: C
54
tomas@penne:~/src$ LANG=en_US ./localetest "ab, c" "a, bc"
locale set: en_US
1
tomas@penne:~/src$ LANG=en_US ./localetest "ab, c" "a, c"
locale set: en_US
-1

What happens here is that, with any locale other than C (I've
tried en_US, en_GB and sv_SE), the strcoll() call seems to skip over
the comma and the space when comparing the two strings. This means
that "a, bc" is sorted before "ab, c", but "a, c" is sorted after
"ab, c". The three strings would be sorted like this:

a, bc
ab, c
a, c

I have never seen a list of strings sorted in this
manner. PostgreSQL seems to use strcoll() when ordering query
results, and it therefore gives this weird sort order.
I cannot use C as the locale for PostgreSQL, because I need the Swedish
sort order of åäö, which comes out as äåö in plain iso-8859-1 byte order.

This does not happen with libc6 2.1.3, which seems to handle locales
differently.

I suppose this could be a more generic glibc problem, which is not at
all Debian-specific, but maybe you can help anyway?



Greetings,

Tomas


