Re: egrep oddity
On Mon, 06 Feb 2012 00:15:14 +0100, Tomas Volka wrote:
> On Ne 05-02-12 | 23:04, Sven Joachim wrote:
>> > The "^[A-Z]" range will never match line beginning with a, since the
>> > range matches only uppercase characters.
>> Not quite true, this very much depends on the locale.
> Tried this under cs_CZ.UTF-8 and C locales and it behaves as i outlined.
> I'm curious under which locale is the result different, as i've never
> experienced such behavior.
"man egrep" (Character Classes and Bracket Expressions) seems to agree
with Sven's assertion, although it does not spell out how specific
locales differ:
    For example, in the default C locale, [a-d] is equivalent to
    [abcd]. Many locales sort characters in dictionary order, and in
    these locales [a-d] is typically not equivalent to [abcd]; it might be
    equivalent to [aBbCcDd], for example. To obtain the traditional
    interpretation of bracket expressions, you can use the C locale
    by setting the LC_ALL environment variable to the value C.
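A quick way to check this on your own system is sketched below. It is only a rough test, not anything taken from the man page: the helper name match_range and the sample words are made up for illustration, and en_US.UTF-8 is just an example of a locale that may (or may not) be installed and may (or may not) collate in dictionary order with your libc version.

```shell
#!/bin/sh
# Hypothetical helper: run the man page's [a-d] range under the locale
# given as $1, against one lowercase and one uppercase sample line.
match_range() {
    printf 'apple\nBob\n' | LC_ALL="$1" grep -E '^[a-d]'
}

# C locale: the range is the traditional one (a, b, c, d), so only
# "apple" can match.
echo "C:"
match_range C

# Example non-C locale (an assumption; substitute one listed by
# `locale -a`). Whether "Bob" also matches here depends on the locale
# data and the libc/grep versions in use.
echo "en_US.UTF-8:"
match_range en_US.UTF-8
```

If the second run prints "Bob" as well, that locale is treating [a-d] along the lines of [aBbCcDd]; if both runs print only "apple", the locale in question does not show the effect (or is not installed, in which case grep falls back to C behaviour).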