
Bug#259196: libstdc++5: mixed signed/unsigned char comparison in std::char_traits<char>



Package: libstdc++5
Version: 1:3.3.4-2
Severity: minor

I am reporting this as "minor" because I don't know whether it is really
a bug or standard behaviour that I have misunderstood.

The following program (to be compiled with g++ 3.x)

#include <iostream>
#include <string>
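// Print an expression's source text followed by its value.
#define BLURB(x) #x << "\t== " << (x) << '\n'
int main()
{
    std::basic_string<char> s1, s2;
    s1 = static_cast<char>(0xe0);   // 'à' in ISO 8859-1; negative where char is signed
    s2 = 'a';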
    std::cout << "s1 == '" << s1 << "'\ns2 == '" << s2 << "'\n";
    std::cout << BLURB( (s1 < s2) );
    std::cout << BLURB( (s1[0] < s2[0]) );
    std::cout << BLURB( std::char_traits<char>::lt(s1[0], s2[0]) );
    std::cout << BLURB( std::char_traits<char>::compare(s1.c_str(),
                s2.c_str(), 1) );
    std::cout << BLURB( s1.compare(s2) );
    return 0;
}

produces this output on my x86 Linux PC:

s1 == 'à'
s2 == 'a'
(s1 < s2)       == 0
(s1[0] < s2[0]) == 1
std::char_traits<char>::lt(s1[0], s2[0])        == 1
std::char_traits<char>::compare(s1.c_str(), s2.c_str(), 1)      == 1
s1.compare(s2)  == 1

These results are counter-intuitive, but they agree with the
behaviour of the C standard library (1999 standard):
- strcmp and memcmp compare their arguments as arrays of unsigned char,
  so strcmp(s1.c_str(), s2.c_str()) > 0 (meaning s1 > s2)
- on my platform char is signed, so s1[0] < s2[0] (because s1[0] < 0)
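
A minimal sketch of the two orderings described in the list above. The
helper names (lt_unsigned, lt_plain, check_e0_versus_a) are hypothetical and
exist only for illustration; the plain-char result is written against
std::numeric_limits<char>::is_signed because it depends on the platform.

```cpp
#include <cassert>
#include <cstring>
#include <limits>

// "Less than" as memcmp()/strcmp() see it: both bytes are converted to
// unsigned char before they are compared.
inline bool lt_unsigned(char a, char b)
{
    return static_cast<unsigned char>(a) < static_cast<unsigned char>(b);
}

// "Less than" on plain char; the result depends on whether char is signed.
inline bool lt_plain(char a, char b)
{
    return a < b;
}

// Hypothetical self-check for the byte pair used in this report.
inline void check_e0_versus_a()
{
    const char hi = static_cast<char>(0xe0);
    const char lo = 'a';

    // As unsigned char, 0xe0 (224) sorts after 'a' (97).
    assert(!lt_unsigned(hi, lo));
    assert(std::memcmp(&hi, &lo, 1) > 0);

    // As plain char, 0xe0 sorts before 'a' only where char is signed.
    assert(lt_plain(hi, lo) == std::numeric_limits<char>::is_signed);
}
```

Calling check_e0_versus_a() should pass whether char is signed or unsigned,
since only the plain-char comparison changes between the two cases.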

On the other hand, Stroustrup's TC++PL, section 20.2.1 "Character traits",
states: "The compare() function uses lt() and eq() to compare characters."
So I would expect
    std::char_traits<char>::lt(s1[0], s2[0])
and 
    std::char_traits<char>::compare(s1.c_str(), s2.c_str(), 1)
to return consistent results, which they do not.  As a side effect,
s1.compare(s2) is not consistent with std::char_traits<char>::lt()
either.

As far as I can see, the GNU libstdc++5 implementation of
std::char_traits<char>::compare() uses memcmp() instead of lt(), and so
inherits the unsigned char comparison, while std::char_traits<char>::lt()
simply uses '<' to compare its arguments, keeping them signed.
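
To make the difference concrete, here is a sketch of the two strategies
side by side. It is not the libstdc++ source: compare_with_lt,
compare_with_memcmp and check_lt_variant_is_consistent are hypothetical
names, and the lt()-based loop is only one plausible reading of the
behaviour the book describes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical three-way comparison built only on Traits::lt(), so its
// ordering cannot disagree with lt().
template <typename Traits>
int compare_with_lt(const typename Traits::char_type* a,
                    const typename Traits::char_type* b,
                    std::size_t n)
{
    for (std::size_t i = 0; i < n; ++i) {
        if (Traits::lt(a[i], b[i])) return -1;
        if (Traits::lt(b[i], a[i])) return 1;
    }
    return 0;
}

// The memcmp()-based strategy: bytes are always ordered as unsigned char,
// whatever Traits::lt() does.
inline int compare_with_memcmp(const char* a, const char* b, std::size_t n)
{
    return std::memcmp(a, b, n);
}

// Hypothetical check: on the first character, the lt()-based variant
// reports "less" exactly when lt() does.
inline void check_lt_variant_is_consistent(const char* a, const char* b)
{
    typedef std::char_traits<char> traits;
    assert((compare_with_lt<traits>(a, b, 1) < 0) == traits::lt(a[0], b[0]));
}
```

For the strings in this report, compare_with_memcmp("\xe0", "a", 1) is
positive on any platform. What the lt()-based variant returns for the same
input depends on how the library defines std::char_traits<char>::lt().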

It's quite likely that I am missing something in Stroustrup's book.
What does the standard mandate?

Best regards
giuseppe

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Kernel: Linux 2.4.26-1-686
Locale: LANG=C, LC_CTYPE=C

Versions of packages libstdc++5 depends on:
ii  gcc-3.3-base                1:3.3.4-2    The GNU Compiler Collection (base 
ii  libc6                       2.3.2.ds1-13 GNU C Library: Shared libraries an
ii  libgcc1                     1:3.3.4-2    GCC support library

-- no debconf information


