Bug#139861: LC_CTYPE with UTF-8 doesn't work correctly
At Thu, 27 Feb 2003 20:01:25 +0100,
Torsten Hilbrich wrote:
>
> GOTO Masanori <gotom@debian.or.jp> writes:
>
> [Problems with [:lower:] and [:upper:] in UTF-8 locale]
>
> > It should be fixed in sid glibc 2.3.1, please check.
>
> I have now installed the latest versions of these programs:
>
> ii bash 2.05b-3 The GNU Bourne Again SHell
> ii coreutils 4.5.7-1 The GNU core utilities
> ii grep 2.5.1-2 GNU grep, egrep and fgrep
> ii locales 2.3.1-14 GNU C Library: National Language (locale) da
> ii libc6 2.3.1-14 GNU C Library: Shared libraries and Timezone
>
> The following statements work as expected:
>
> $ grep [[:lower:]]
> $ grep [[:upper:]]
> $ case ... in [[:lower:]]) ... esac # bash
> $ case ... in [[:upper:]]) ... esac # bash
>
> The following doesn't work with non-ASCII characters when LC_CTYPE is
> set to de_DE.UTF8:
>
> $ tr [:lower:] [:upper:]
>
> Using "tr [:alpha:] '-'" I found out that non-ASCII letters (valid
> letters in the de_DE locale) are not even recognized. In the
> de_DE.ISO-8859-1 locale both statements work correctly.
>
> I don't know if this is related to this single program or can be
> caused by problems in libc6 or locales data. Please tell me if you
> think that I should report to coreutils instead.
>
> So half the bug report is resolved,
Coreutils uses an old regex engine, so tr is not ready for UTF-8.
I think this is a TODO item for coreutils/textutils.
I am reassigning this bug to coreutils.
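
A minimal sketch of why a tool that is not multibyte-aware misses such
letters, assuming tr handles its input one byte at a time (that is an
assumption for illustration, not something checked against the coreutils
source). The helper names byte_count and upper_bytes are made up for this
example. The letter "ä" (U+00E4) is written as octal escapes so the
result does not depend on the terminal encoding, and LC_ALL=C is used so
no de_DE locale needs to be installed:

```shell
# "ä" (U+00E4) is encoded in UTF-8 as the two bytes 0xC3 0xA4.
# Count the bytes in that encoding.
byte_count() {
    printf '\303\244' | wc -c | tr -d ' '
}

# Feed the same two bytes through tr in the C locale and dump the result
# as hex. Neither byte is an ASCII lowercase letter, so a byte-oriented
# tr has nothing to convert.
upper_bytes() {
    printf '\303\244' | LC_ALL=C tr '[:lower:]' '[:upper:]' | od -An -tx1 | tr -d ' \n'
}

byte_count
upper_bytes
```

In de_DE.ISO-8859-1 the same letter is a single byte, which would explain
why both tr statements work there.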
-- gotom