Bug#512525: marked as done (regexp: missing support for non-localized but utf8 environment)
Your message dated Sun, 11 Oct 2020 01:56:57 +0200
with message-id <20201010235657.z7kjagdr6yhq5bal@function>
and subject line Re: Bug#512525: regexp: missing support for non-localized but utf8 environment
has caused the Debian Bug report #512525,
regarding regexp: missing support for non-localized but utf8 environment
to be marked as done.
This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.
(NB: If you are a system administrator and have no idea what this
message is talking about, this may indicate a serious mail system
misconfiguration somewhere. Please contact owner@bugs.debian.org
immediately.)
--
512525: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=512525
Debian Bug Tracking System
Contact owner@bugs.debian.org with problems
--- Begin Message ---
- To: Debian Bug Tracking System <submit@bugs.debian.org>
- Subject: regexp: missing support for non-localized but utf8 environment
- From: Samuel Thibault <samuel.thibault@ens-lyon.org>
- Date: Wed, 21 Jan 2009 14:27:19 +0100
- Message-id: <20090121132719.GA22759@const.bordeaux.inria.fr>
Package: libc6
Version: 2.7-18
Severity: normal
Hello,
My goal is to grep for ranges of Unicode characters in UTF-8 files.
Character ranges depend on the locale, so I have to set LC_COLLATE to C.
Doing so, however, makes grep not know that my files are UTF-8, so I
also set LC_CTYPE to a UTF-8 locale. That fails:
$ LANG=C LC_CTYPE=fr_FR.UTF-8 grep '[é-ë]' test.txt
grep: Invalid collation character
which comes from libc's re_compile_pattern() function.
Samuel
-- System Information:
Debian Release: 5.0
APT prefers testing
APT policy: (990, 'testing'), (500, 'unstable'), (500, 'stable'), (1, 'experimental')
Architecture: amd64 (x86_64)
Kernel: Linux 2.6.28 (SMP w/2 CPU cores)
Locale: LANG=fr_FR.UTF-8, LC_CTYPE=fr_FR.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Versions of packages libc6 depends on:
ii libgcc1 1:4.3.2-1.1 GCC support library
libc6 recommends no packages.
Versions of packages libc6 suggests:
ii glibc-doc 2.7-18 GNU C Library: Documentation
ii locales 2.7-18 GNU C Library: National Language (
-- debconf information excluded
--
Samuel
We are Pentium of Borg. Division is futile. You will be approximated.
(seen in someone's .signature)
--- End Message ---
--- Begin Message ---
- To: John Scott <jscott@posteo.net>, 512525-done@bugs.debian.org
- Subject: Re: Bug#512525: regexp: missing support for non-localized but utf8 environment
- From: Samuel Thibault <sthibault@debian.org>
- Date: Sun, 11 Oct 2020 01:56:57 +0200
- Message-id: <20201010235657.z7kjagdr6yhq5bal@function>
- In-reply-to: <3349557.FH3fGckNC8@t450>
- References: <20090121132719.GA22759@const.bordeaux.inria.fr> <20090121132719.GA22759@const.bordeaux.inria.fr> <3349557.FH3fGckNC8@t450>
Hello,
John Scott, on Sat. 10 Oct. 2020 09:29:09 -0400, wrote:
> On Wednesday, January 21, 2009 8:27:19 AM EDT Samuel Thibault wrote:
> > My goal is to grep for ranges of Unicode characters in UTF-8 files.
> > Character ranges depend on the locale, so I have to set LC_COLLATE to C.
> > Doing so, however, makes grep not know that my files are UTF-8, so I
> > also set LC_CTYPE to a UTF-8 locale. That fails:
> >
> > $ LANG=C LC_CTYPE=fr_FR.UTF-8 grep '[é-ë]' test.txt
> > grep: Invalid collation character
> >
> > which comes from libc's re_compile_pattern() function.
> For this you could try the C.UTF-8 locale.
Ah, that appeared in the meantime indeed :)
Samuel
--- End Message ---
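The C.UTF-8 suggestion from the thread can be sketched as follows. This is a minimal illustration, not a verified fix for every libc version: it assumes the C.UTF-8 locale is available on the system (something like `locale -a | grep -i '^C\.utf'` should list it), and the sample input lines are made up for the example. The characters are written as octal-escaped UTF-8 bytes so the script itself stays pure ASCII.

```shell
#!/bin/sh
# Sketch: match a range of UTF-8 characters by code point using C.UTF-8.
# Assumption: the C.UTF-8 locale exists on this system.

# Build the bracket expression [é-ë], i.e. U+00E9 through U+00EB.
# \303\251 is é and \303\253 is ë in UTF-8.
range=$(printf '[\303\251-\303\253]')

# Hypothetical sample input: "café", "noël", "plain".
# LC_ALL=C.UTF-8 gives UTF-8 character handling (LC_CTYPE) without a
# language-specific collation order (LC_COLLATE) in one setting.
printf 'caf\303\251\nno\303\253l\nplain\n' | LC_ALL=C.UTF-8 grep -c "$range"
```

If C.UTF-8 is available, this should count the two lines containing é and ë. Setting LC_ALL replaces the LANG=C plus LC_CTYPE=fr_FR.UTF-8 combination from the original report with a single locale.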