
Bug#139861: LC_CTYPE with UTF-8 doesn't work correctly



At Thu, 27 Feb 2003 20:01:25 +0100,
Torsten Hilbrich wrote:
> 
> GOTO Masanori <gotom@debian.or.jp> writes:
> 
> [Problems with [:lower:] and [:upper:] in UTF-8 locale]
> 
> > It should be fixed in sid glibc 2.3.1, please check.
> 
> I have now installed the latest versions of these programs:
> 
> ii  bash           2.05b-3        The GNU Bourne Again SHell
> ii  coreutils      4.5.7-1        The GNU core utilities
> ii  grep           2.5.1-2        GNU grep, egrep and fgrep
> ii  locales        2.3.1-14       GNU C Library: National Language (locale) da
> ii  libc6          2.3.1-14       GNU C Library: Shared libraries and Timezone
> 
> The following statements work as expected:
> 
> $ grep [[:lower:]]                   
> $ grep [[:upper:]]                   
> $ case ... in [[:lower:]]) ... esac  # bash
> $ case ... in [[:upper:]]) ... esac  # bash
> 
> The following does not work with non-ASCII characters when LC_CTYPE is
> set to de_DE.UTF8:
> 
> $ tr [:lower:] [:upper:]
> 
> Using "tr [:alpha:] '-'" I found out that non-ASCII letters (valid
> letters in the de_DE locale) are not even recognized.  In the
> de_DE.ISO-8859-1 locale both statements work correctly.
> 
> I don't know if this is related to this single program or can be
> caused by problems in libc6 or locales data.  Please tell me if you
> think that I should report to coreutils instead.
> 
> So half of the bug report is resolved.

Coreutils uses an old regex engine, so tr is not ready for UTF-8.
I think this is a TODO item for coreutils/textutils.
I am reassigning this bug to coreutils.
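
For illustration only, here is a minimal sketch of why a tr that looks at
one byte at a time cannot match a non-ASCII letter in UTF-8.  It assumes
byte-wise handling and uses LC_ALL=C to force byte-wise classification
so the result does not depend on which locales are installed; it is not
taken from the coreutils source.  The variable names are made up for
this example.

```shell
# 'ä' (U+00E4) is two bytes in UTF-8: 0xC3 0xA4 (octal 303 244).
a_umlaut=$(printf '\303\244')

# wc -c counts bytes, not characters, in any locale.
byte_count=$(printf '%s' "$a_umlaut" | wc -c | tr -d ' ')

# Hex dump of the encoded letter.
hex=$(printf '%s' "$a_umlaut" | od -An -tx1 | tr -d ' \n')

# Classified byte by byte, neither 0xC3 nor 0xA4 is in [:alpha:],
# so only the ASCII letters x and y are replaced by '-'.
out_hex=$(printf 'x%sy' "$a_umlaut" | LC_ALL=C tr '[:alpha:]' '-' \
          | od -An -tx1 | tr -d ' \n')

echo "bytes=$byte_count hex=$hex out=$out_hex"
# prints: bytes=2 hex=c3a4 out=2dc3a42d
```

In the de_DE.ISO-8859-1 locale the same letter is the single byte 0xE4,
which is why byte-wise classification works there.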

-- gotom


