string comparison: signed or unsigned char?

Hi all.
Maybe this question has been asked many times before, but I was not able to find
any pointers. I apologize in advance, since it refers to a particular standard
library implementation (GNU C++ version 3.x), but perhaps it's more general than
that.

The following program (to be compiled with g++ 3.x; 2.9x is not enough)

#include <iostream>
#include <string>

// Print an expression's source text followed by its value.
#define BLURB(x) #x << "\t== " << (x) << '\n'

int main()
{
    std::basic_string<char> s1, s2;
    s1 = static_cast<char>(0xe0);  // negative where char is signed
    s2 = 'a';
    std::cout << "s1 == '" << s1 << "'\ns2 == '" << s2 << "'\n";
    std::cout << BLURB( (s1 < s2) );
    std::cout << BLURB( (s1[0] < s2[0]) );
    std::cout << BLURB( std::char_traits<char>::lt(s1[0], s2[0]) );
    std::cout << BLURB( std::char_traits<char>::compare(s1.c_str(),
                                                        s2.c_str(), 1) );
    std::cout << BLURB( s1.compare(s2) );
    return 0;
}

produces this output on my x86 linux pc:

s1 == 'à'
s2 == 'a'
(s1 < s2) == 0
(s1[0] < s2[0]) == 1
std::char_traits<char>::lt(s1[0], s2[0]) == 1
std::char_traits<char>::compare(s1.c_str(), s2.c_str(), 1) == 1
s1.compare(s2) == 1

The above results are counter-intuitive, but they agree with the
behaviour of the C standard library (1999 standard):
- strcmp and memcmp compare their arguments as arrays of unsigned char,
so that strcmp(s1.c_str(), s2.c_str()) > 0 (meaning s1 > s2)
- on my platform char is signed, so s1[0] < s2[0] (because s1[0] < 0)

On the other hand, Stroustrup's TC++PL, section 20.2.1 "Character traits",
states "The compare() function uses lt() and eq() to compare characters.",
so I'd expect
std::char_traits<char>::lt(s1[0], s2[0])
and
std::char_traits<char>::compare(s1.c_str(), s2.c_str(), 1)
to return consistent results, which they do not. As a consequence,
s1.compare(s2) is not consistent with std::char_traits<char>::lt()
either.

As far as I can see, the GNU libstdc++5 implementation of
std::char_traits<char>::compare() uses memcmp() instead of lt(), and so
inherits the unsigned char comparison, while std::char_traits<char>::lt()
simply uses '<' to compare its arguments, keeping them signed.

It's quite likely that I am missing something in Stroustrup's book.
Does anybody know what the standard mandates?

giuseppe
 
