Re: egrep oddity
On 2012-02-06 11:50:16 -0700, Bob Proulx wrote:
> Vincent Lefevre wrote:
> > But the grep man page still says:
> >
> > Within a bracket expression, a range expression consists of two
> > characters separated by a hyphen. It matches any single character that
> > sorts between the two characters, inclusive, using the locale's
> > collating sequence and character set. For example, in the default C
> > locale, [a-d] is equivalent to [abcd]. Many locales sort characters in
> > dictionary order, and in these locales [a-d] is typically not
> > equivalent to [abcd]; it might be equivalent to [aBbCcDd], for example.
> > To obtain the traditional interpretation of bracket expressions, you
> > can use the C locale by setting the LC_ALL environment variable to the
> > value C.
>
> I don't see any problem with that wording. The opening for almost any
> behavior comes from "using the locale's collating sequence and
> character set" which isn't defined by grep but is defined by libc.
> Was there something there in particular that you didn't like?
The problem is precisely that grep no longer follows the locale's
collating sequence. For instance, even though en_US.utf8 uses
dictionary order (as seen with "sort"), [a-d] is equivalent
to [abcd], not to something that would include B and C.
So, where is the range specified?
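
To make the difference concrete, here is a minimal Python sketch of the two
interpretations of [a-d]. It is not grep's implementation. The helper names
(chars_in_range, codepoint_key, dictionary_key) are made up for illustration,
and dictionary_key is only an assumed stand-in for a dictionary-order
collation (lowercase before uppercase); a real locale may order the letters
differently.

```python
import string


def chars_in_range(lo, hi, candidates, key):
    """Return the candidates c with key(lo) <= key(c) <= key(hi), in key order."""
    lo_k, hi_k = key(lo), key(hi)
    selected = [c for c in candidates if lo_k <= key(c) <= hi_k]
    return "".join(sorted(selected, key=key))


def codepoint_key(c):
    # "Traditional" interpretation: compare by character code, as in the C locale.
    return ord(c)


def dictionary_key(c):
    # Hypothetical dictionary-order collation (an assumption, not taken from
    # any real locale): letters compare case-insensitively first, and
    # lowercase sorts before uppercase on ties.
    return (c.lower(), c.isupper())


if __name__ == "__main__":
    letters = string.ascii_letters
    # Range a-d by character code.
    print(chars_in_range("a", "d", letters, codepoint_key))
    # Range a-d by the assumed dictionary order.
    print(chars_in_range("a", "d", letters, dictionary_key))
```

The man page wording describes the second behavior, while what I observe
from grep in en_US.utf8 matches the first. To try this against a real
locale, one could presumably pass locale.strxfrm as the key after calling
locale.setlocale(locale.LC_COLLATE, ...), provided that locale is installed.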
--
Vincent Lefèvre <vincent@vinc17.net> - Web: <http://www.vinc17.net/>
100% accessible validated (X)HTML - Blog: <http://www.vinc17.net/blog/>
Work: CR INRIA - computer arithmetic / AriC project (LIP, ENS-Lyon)