Re: Output from date command defaults to 12-hour in Buster.



Greg Wooledge (12020-04-30):
> For the first part, you want LC_NUMERIC=C.

Damn right I want LC_NUMERIC=C. And I want LC_COLLATE=C, and LC_TIME=C.

And I also want LC_WHATEVER_THE_HECK_THEY_WILL_INVENT_NEXT=C too. I want
LC_EVERYTHING=C except LC_CTYPE.
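
A minimal sketch of that setup for a shell startup file, under some
assumptions: there is no real LC_EVERYTHING variable, and LC_ALL cannot
be used here because it overrides every category, LC_CTYPE included. So
the sketch sets LANG=C as the fallback and overrides LC_CTYPE alone. The
locale name C.UTF-8 is an assumption (Debian ships it; other systems may
need something like en_US.UTF-8), and only the POSIX categories are
listed.

```shell
# LC_ALL takes precedence over everything, so it must not be set.
# Clear the per-category variables too, so they fall back to LANG.
unset LC_ALL LC_COLLATE LC_NUMERIC LC_TIME LC_MONETARY LC_MESSAGES

# Fallback for every category that is not set explicitly.
LANG=C

# Keep a real character encoding for LC_CTYPE only.
# Assumption: the C.UTF-8 locale exists on this system.
LC_CTYPE=C.UTF-8

export LANG LC_CTYPE
```

Running `locale` afterwards should show which value each category ends
up with.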

And everybody who works with command-line tools should want the same,
because these things were ugly mistakes with way more drawbacks than
benefits.

> For the second part, what you're asking for is sometimes called
> "rational ranges", or "rational range interpretation".  This is the
> notion that, for specific range expressions like '[a-z]' within a
> regular expression or glob, the software will assume you want to
> match only '[[:lower:]]', rather than doing what you actually said.
> 
> The idea behind this is based on the (probably accurate) belief that
> most people who write [a-z] or [A-Z] in their scripts wanted the
> LC_COLLATE=C (or 1980s) meaning of the range, not the meaning of the
> range in modern times.  Thus, it's a sort of safety net strung below
> the novice programmer, to catch them when they fall.
> 
> Since this is not how systems currently behave, however, what you need
> to do in your script is write the expression correctly.  For your
> example, I believe that would be [[:xdigit:]].  Or if you really do
> mean to restrict it to lower-case 'a' through 'f', retain what you
> have, but set LC_COLLATE=C first.
> 
> I haven't seen rational range interpretation discussion that covers
> '[a-f]', but I haven't been following it closely.
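
The two options in the quoted advice can be sketched roughly as follows.
The function names is_hex and is_lower_hex are hypothetical, and the
sketch assumes single-line input and uses grep rather than whatever the
original script used.

```shell
# Option 1: say what is meant with a character class.
# [[:xdigit:]] matches 0-9, A-F and a-f.
is_hex() {
    printf '%s\n' "$1" | grep -Eq '^[[:xdigit:]]+$'
}

# Option 2: keep the range, but pin the locale for this one command
# so that [a-f] means exactly the six letters a through f.
is_lower_hex() {
    printf '%s\n' "$1" | LC_ALL=C grep -Eq '^[0-9a-f]+$'
}
```

With these, `is_hex DEADBEEF` succeeds while `is_lower_hex DEADBEEF`
fails, and both fail on an empty string.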

I am very aware of the pros and cons of localized collation. The thing
that was incredibly stupid was to change the semantics of something as
fundamental as regular expressions. The correct way to introduce
localized alphabetical order in regular expressions would have been to
introduce a new notation for it, not to hijack something that was
already used.

Since this ugly mistake has been made, the only sane course of action is
to leave locales at their C setting. Which is exactly what I do, and I am
perfectly fine with it.

And I can laugh to myself every time somebody comes complaining that the
output of a command is ugly because its author did not consider that
localization would break alignment, or that a script is broken because
the matching of a regexp has been altered.
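
For scripts that must run under somebody else's locale, one possible
approach is to pin the locale per command. This is only a sketch; the
function names sort_bytes and timestamp_24h are hypothetical.

```shell
# Sort stdin by raw byte value, whatever the caller's LC_COLLATE is.
sort_bytes() {
    LC_ALL=C sort
}

# Print a timestamp in an explicit 24-hour format instead of relying
# on the locale's default date representation.
timestamp_24h() {
    LC_ALL=C date '+%Y-%m-%d %H:%M:%S'
}
```

For example, `printf 'b\nA\na\nB\n' | sort_bytes` gives A, B, a, b, one
per line, since upper-case letters come before lower-case ones in byte
order.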

Regards,

-- 
  Nicolas George
