
Bug#406785: marked as done (sort: incorrect result when sorting on subfields)



Your message dated Mon, 15 Jan 2007 01:04:56 +0100
with message-id <200701150105.25766.elendil@planet.nl>
and subject line Bug#406785: sort: incorrect result when sorting on subfields
has caused the attached Bug report to be marked as done.

This means that you claim that the problem has been dealt with.
If this is not the case it is now your responsibility to reopen the
Bug report if necessary, and/or fix the problem forthwith.

(NB: If you are a system administrator and have no idea what I am
talking about this indicates a serious mail system misconfiguration
somewhere.  Please contact me immediately.)

Debian bug tracking system administrator
(administrator, Debian Bugs database)

--- Begin Message ---
Package: coreutils
Version: 5.97-5

Whilst testing a patch for Busybox sort, I encountered this bug in GNU 
sort.

$ sort -k4.2,4.4 test
999     3       0       algebra
egg     1       2       papyrus
7       3       42      soup
42      1       3       woot
42      1       010     zoology
$ sort -k4.3,4.5 test
egg     1       2       papyrus
999     3       0       algebra
42      1       010     zoology
42      1       3       woot
7       3       42      soup

In the first example, sort should, according to the info page, use the 
second to fourth characters of the fourth field as the key, but instead 
it sorts starting from the first letter.
Only if the character limits are increased by one do I get the correct 
result.

I suspect that a delimiter character (space?) is included at the start of 
the string and that that is counted as the first character.

Cheers,
FJP



--- End Message ---
--- Begin Message ---
Closing as this is not a bug in GNU sort.

The reason this is not a bug is explained in the GNU sort info page for 
the -t option:
`-t SEPARATOR'
`--field-separator=SEPARATOR'
     Use character SEPARATOR as the field separator when finding the
     sort keys in each line.  By default, fields are separated by the
     empty string between a non-blank character and a blank character.
     That is, given the input line ` foo bar', `sort' breaks it into
     fields ` foo' and ` bar'.

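Note the leading spaces in the last line of that excerpt: when no 
separator is specified, each field includes the blanks that precede it, 
so the first character of the fourth field is a blank. Not really 
intuitive, but there it is.
(The -b option makes sort skip those leading blanks, which gives the 
expected result.)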
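
A minimal sketch of the difference, assuming the columns in the test file 
are separated by single tabs (the report does not show the raw bytes) and 
that GNU sort is in use. The variable names are illustrative only, and 
LC_ALL=C is set just to keep the comparison locale-independent:

```shell
# Rebuild the sample data with single tabs between columns (an assumption).
data=$(printf '999\t3\t0\talgebra\negg\t1\t2\tpapyrus\n7\t3\t42\tsoup\n42\t1\t3\twoot\n42\t1\t010\tzoology\n')

# Without -b, field 4 is "<TAB>algebra", so characters 2-4 are "alg".
plain=$(printf '%s\n' "$data" | LC_ALL=C sort -k4.2,4.4 | cut -f4 | paste -s -d ' ' -)

# With -b, leading blanks are skipped before counting, so characters 2-4 are "lge".
skipped=$(printf '%s\n' "$data" | LC_ALL=C sort -b -k4.2,4.4 | cut -f4 | paste -s -d ' ' -)

echo "without -b: $plain"
echo "with -b:    $skipped"
```

Under those assumptions, the first line should list the words in the order 
of the report's first example and the second line in the order of its 
second example.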



--- End Message ---
