Bug#512525: regexp: missing support for non-localized but utf8 environment
Package: libc6
Version: 2.7-18
Severity: normal
Hello,
My goal is to grep for ranges of Unicode characters in UTF-8 files.
Character ranges depend on the locale's collation order, so I have to set
LC_COLLATE to C (here via LANG=C). But doing only that makes grep not know
that my files are UTF-8, so I also set LC_CTYPE to a UTF-8 locale. That fails:
$ LANG=C LC_CTYPE=fr_FR.UTF-8 grep '[é-ë]' test.txt
grep: Invalid collation character
which comes from libc's re_compile_pattern() function.
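
A minimal C sketch for reproducing this outside grep, under some assumptions: it goes through the POSIX regcomp() interface rather than calling re_compile_pattern() directly, on the assumption that both reach the same pattern compiler inside glibc, and the helper name compile_in_locale() is made up for illustration.

```c
#include <locale.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper, not part of the original report.
 * Sets LC_COLLATE and LC_CTYPE independently, then compiles `pattern`
 * as a POSIX basic regular expression (the flavour grep uses by default).
 * Returns 0 on success; otherwise returns nonzero and stores a
 * human-readable reason in `msg`. */
int compile_in_locale(const char *collate, const char *ctype,
                      const char *pattern, char *msg, size_t msglen)
{
    regex_t re;
    int rc;

    /* Mirror "LANG=C LC_CTYPE=..." by setting the two categories apart. */
    if (setlocale(LC_COLLATE, collate) == NULL ||
        setlocale(LC_CTYPE, ctype) == NULL) {
        snprintf(msg, msglen, "locale not available");
        return 1;
    }

    rc = regcomp(&re, pattern, REG_NOSUB);
    if (rc != 0) {
        /* Translate the error code into the library's message text. */
        regerror(rc, &re, msg, msglen);
        return rc;
    }
    regfree(&re);
    return 0;
}
```

Calling compile_in_locale("C", "fr_FR.UTF-8", "[é-ë]", msg, sizeof msg) from a small main() should mirror the grep command above, provided the source file is saved as UTF-8 and the fr_FR.UTF-8 locale is installed. Whether it fails the same way will depend on the libc version.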
Samuel
-- System Information:
Debian Release: 5.0
APT prefers testing
APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.28 (SMP w/2 CPU cores)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages libc6 depends on:
ii libgcc1 1:4.3.2-1.1 GCC support library
libc6 recommends no packages.
Versions of packages libc6 suggests:
ii glibc-doc 2.7-18 GNU C Library: Documentation
ii locales 2.7-18 GNU C Library: National Language (
-- debconf information excluded
--
Samuel
We are Pentium of Borg. Division is futile. You will be approximated.
(seen in someone's .signature)