[Date Prev][Date Next] [Thread Prev][Thread Next] [Date Index] [Thread Index]

Bug#373704: Patch for sort issues



Package: busybox
Version: 1:1.1.3-3
Followup-For: Bug #373704

Attached is a patch for the sort issues. A few have croped up while
testing this. Hopefully busybox sort now behaves like coreutils sort.

About the patch (line numbers of result):

Line 65   : Fix -b -f -i -d and trailing spaces.

Try any of those flags on its own.

Line 75   : Seperators are not skipped (never where)
Line 76-77: Collaps lines:
	     - the while checks for isspace() already
	     - combine str[end] and !key_separator

No code change, just correcting the comment and simplifying.


Line 78-82: Move 'end++' into the loop into the tests
	    for becomes while

The loop in line 74 is supposed to count 'key->range[2*j]+j'
seperators or whitespaces. The old code would advance end up to the
first seperator and then no more. The same seperator got counted again
and again. Only -k1 and -k2 worked. Moving the 'end++' inside the loop
into the test cases causes them to advance past the seperator so the
next loop starts fresh on the next key again.

",x,,,a" gets split into '','x','','','a' now, into 5 key positions.


Line 88 + : The fix in 78-82 included the last whitespace or
            separator, remove it here.

Without this sort -k2,2 -t, and this input

x,a,a
x,a

would use the following strings for comparisons:

a,
a


Line 88 - : Don't skip the first seperator if a key starts with
            seperators.

Frankly that line didn't make any sense. Consider sort -k2 -t,

,a
,b
,,a
,,b
,,,a
,,,b

That gets split into
a
b
,a
,b
,,a
,,b

And then one leading , got ignored giving

a
b
a
b
,a
,b

and last sorted into

,,,a
,,,b
,,a
,a
,,b
,b

It should either ignore all leading seperators or none, not
one. Coreutils ignores none so I removed that line completly.

MfG
	Goswin

-- System Information:
Debian Release: 4.0
  APT prefers unstable
  APT policy: (500, 'unstable')
Architecture: amd64 (x86_64)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.6.18-frosties
Locale: LANG=C, LC_CTYPE=de_DE (charmap=ISO-8859-1)

Versions of packages busybox depends on:
ii  libc6                        2.3.6.ds1-9 GNU C Library: Shared libraries

busybox recommends no packages.

-- no debconf information
diff -u busybox-1.1.3/coreutils/sort.c busybox-1.1.3/coreutils/sort.c
--- busybox-1.1.3/coreutils/sort.c
+++ busybox-1.1.3/coreutils/sort.c
@@ -62,7 +62,7 @@
 
 	/* Special case whole string, so we don't have to make a copy */
 	if(key->range[0]==1 && !key->range[1] && !key->range[2] && !key->range[3]
-		&& !(flags&(FLAG_b&FLAG_d&FLAG_f&FLAG_i&FLAG_bb))) return str;
+		&& !(flags&(FLAG_b|FLAG_d|FLAG_f|FLAG_i|FLAG_bb))) return str;
 	/* Find start of key on first pass, end on second pass*/
 	len=strlen(str);
 
@@ -72,23 +72,21 @@
 		else {
 			end=0;
 			for(i=1;i<key->range[2*j]+j;i++) {
-				/* Skip leading blanks or first separator */
-				if(str[end]) {
-					if(!key_separator && isspace(str[end]))
-						while(isspace(str[end])) end++;
-				}
-				/* Skip body of key */
-				for(;str[end];end++) {
+				/* Skip leading blanks */
+				if(str[end] && !key_separator)
+				    while(isspace(str[end])) end++;
+				/* Skip body of key and one blank or seperator */
+				while(str[end]) {
 					if(key_separator) {
-						if(str[end]==key_separator) break;
-					} else if(isspace(str[end])) break;
+						if(str[end++]==key_separator) break;
+					} else if(isspace(str[end++])) break;
 				}
 			}
 		}
 		if(!j) start=end;
 	}
-	/* Key with explicit separator starts after separator */
-	if(key_separator && str[start]==key_separator) start++;
+	/* Don't include the last whitespace or separator */
+	if (str[end]) end--;
 	/* Strip leading whitespace if necessary */
 	if(flags&FLAG_b) while(isspace(str[start])) start++;
 	/* Strip trailing whitespace if necessary */

Reply to: