
Bug#426118: glibc: documentation-by-comment of strtok() wrong on final state of first argument



Package: glibc
Version: 2.5-4

Extract from file string/strtok.c:

 /* Parse S into tokens separated by characters in DELIM.
    If S is NULL, the last string strtok() was called with is
    used.  For example:
 	char s[] = "-abc-=-def";
 	x = strtok(s, "-");		// x = "abc"
 	x = strtok(NULL, "-=");		// x = "def"
 	x = strtok(NULL, "=");		// x = NULL
 		// s = "abc\0-def\0"
 */

This, according to my understanding, says that after this code is
executed, the contents of s are "abc\0-def\0" followed by one
unspecified character (to account for the original length of s, which
the value given at the end falls short of by one).

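This is wrong. The contents of s after execution of this code are
"-abc\0=-def\0", as running the attached program will demonstrate:
strtok() only overwrites the delimiter that ends a token with '\0' and
never moves any other byte.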

Note that the behaviour described in the comment would imply that
strtok is Θ(strlen(s)^2) (quadratic time, and no less, in the
worst case), because it would in some cases shift the whole string by
one (or more?) positions to the left, which is a linear operation.

-- System Information:
Debian Release: lenny/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (1, 'experimental')
Architecture: amd64 (x86_64)

Kernel: Linux 2.6.18-3-amd64 (SMP w/2 CPU cores)
Locale: LANG=fr_LU.UTF-8, LC_CTYPE=fr_LU.UTF-8 (charmap=UTF-8)
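Shell: /bin/sh linked to /bin/bash

#include <string.h>
#include <stdio.h>

/* Same buffer as in the comment from string/strtok.c. */
char s[] = "-abc-=-def";
const size_t ls = sizeof s;	/* 11: ten characters plus the terminating NUL */

/* Print every byte of s, including those after embedded NULs.
   Bytes below 32 are shown as octal escapes; NUL comes out as "\0". */
void prints(void) {
  size_t i;
  for(i=0; i < ls; ++i) {
    unsigned char c = (unsigned char) s[i];
    if ( c < 32 )
      printf("\\0%.0o", (unsigned int) c);	/* precision 0 prints no digits for 0 */
    else
      printf("%c", c);
  }
  printf("\n");
}

int main(void) {
  char *x;

  x = strtok(s, "-");		/* x = "abc" */
  x = strtok(NULL, "-=");	/* x = "def" */
  x = strtok(NULL, "=");	/* x = NULL */
  (void) x;			/* only the final contents of s matter here */

  prints();			/* prints -abc\0=-def\0 */

  return 0;
}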
