
Re: sort (-g) [offtopic]



-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Sun, Feb 18, 2018 at 04:55:28PM +0100, Ionel Mugurel Ciobîcă wrote:
> 
> Anyone care to explain what exactly means the -g option of sort? The
> fine manual only says "general numerical", but I doubt that is true,
> because -g (and all other options I have tried, -n, -M, -h, -V) will
> all put Roman numeral 9 in between 4 and 5. See here:
> 
> # echo "III\nII\nI\nV\nIV\nVII\nVI\nVIII\nX\nIX" | sort -g | nl
> 
> What I expect is to put 9 in between 8 and 10.


The info documentation has more (alternatively, you can look
it up on the web, as the man page itself suggests).

Extracted from the info:

  ‘-g’
  ‘--general-numeric-sort’
  ‘--sort=general-numeric’
     Sort numerically, converting a prefix of each line to a long
     double-precision floating point number.  *Note Floating point::.
     Do not report overflow, underflow, or conversion errors.  Use
     the following collating sequence:

      • Lines that do not start with numbers (all considered to be
        equal).
      • NaNs (“Not a Number” values, in IEEE floating point
        arithmetic) in a consistent but machine-dependent order.
      • Minus infinity.
      • Finite numbers in ascending numeric order (with -0 and +0
        equal).
      • Plus infinity.

     Use this option only if there is no alternative; it is much slower
     than ‘--numeric-sort’ (‘-n’) and it can lose information when
     converting to floating point.

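So '-g' basically means the decimal representation of a floating-point
number, plus infinities and NaNs. Your Roman numerals all fall into the
first class (lines that do not start with a number), so they count as
equal and end up in ordinary string order, which is why IX lands
between IV and V. No Roman numerals, alas...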
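
A quick check of how those rules play out on the Roman numerals,
assuming GNU sort. The LC_ALL=C is my addition, only there to make the
order among lines that -g considers equal predictable:

```shell
# None of these lines starts with a number, so -g treats them all as
# equal; the last-resort comparison of the whole line then decides.
printf '%s\n' III II I V IV VII VI VIII X IX | LC_ALL=C sort -g
# Output, one per line: I II III IV IX V VI VII VIII X
```

The same result comes out without -g at all, which shows that the
option contributes nothing for this input.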

[...]

> How do I sort in a pipe those roman numerals? I have written two bash
> scripts roman_to_arab.sh and arab_to_roman.sh, but I do not know how
> to adapt it to use it in pipes. Also, it may be too cumbersome to make
> the conversion to arab digits, sort with -n and then convert back into
> roman numerals...

I fear sort is out of its depth there. There are libraries for
various languages that do this, e.g. Perl's Roman.pm (in the Debian
package libroman-perl).
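
If you would rather stay in the shell, the usual trick for pipes is
"decorate, sort, undecorate": prefix each line with its decimal value,
sort on that, then cut the prefix off again, so no conversion back to
Roman is needed. A minimal sketch, not a polished tool: the function
names roman_decorate and sort_roman are made up for illustration, and
it assumes one well-formed, uppercase numeral per line (no validation).

```shell
# Hypothetical helper: print "<decimal value><TAB><original line>".
# Assumes each input line is a single well-formed uppercase numeral.
roman_decorate() {
    awk '
    BEGIN {
        v["I"] = 1;   v["V"] = 5;   v["X"] = 10;  v["L"] = 50
        v["C"] = 100; v["D"] = 500; v["M"] = 1000
    }
    {
        total = 0
        n = length($0)
        for (i = 1; i <= n; i++) {
            cur = v[substr($0, i, 1)]
            nxt = (i < n) ? v[substr($0, i + 1, 1)] : 0
            # A smaller digit before a larger one is subtracted
            # (IV = 4, IX = 9); otherwise it is added.
            total += ((cur < nxt) ? -cur : cur)
        }
        printf "%d\t%s\n", total, $0
    }'
}

# Decorate, sort numerically on the first field, drop the decoration.
sort_roman() {
    roman_decorate | sort -k1,1n | cut -f2-
}

printf '%s\n' III II I V IV VII VI VIII X IX | sort_roman
# Output, one per line: I II III IV V VI VII VIII IX X
```

Since sort_roman reads stdin and writes stdout, it drops into a pipe
wherever you had "sort -g" before.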

> Anyone has encounter this issue? Any ideas how to sort out this sort
> issue? Of course, the easier will be if, indeed, the sort -g would
> work as expected, e.g. if "_general_ numeric" will not be particular
> to exclude Roman numerals...

I guess your idea of "general" is just too general to be practical :)

Cheers
- -- tomás
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)

iEYEARECAAYFAlqJrS0ACgkQBcgs9XrR2kZo4ACcDkY4H1RzyWYaQnQF7E/PfLN9
AbsAmgPSPyn7r5kWyTH7CFOir/OMPAwo
=SXF9
-----END PGP SIGNATURE-----

