Hi all.
Maybe this question has been asked many times before, but I was not able to find
any pointers. I apologize in advance, because it refers to a particular standard
library implementation (GNU C++ version 3.x), but perhaps it's more general than
that.
The following program (it needs g++ 3.x; 2.9x is not enough)

    #include <iostream>
    #include <string>

    // Prints the expression text, a tab, "== ", and the expression's value.
    #define BLURB(x) #x << "\t== " << (x) << '\n'

    int main()
    {
        std::basic_string<char> s1, s2;
        s1 = 0xe0;  // a-grave in ISO-8859-1; negative where char is signed
        s2 = 'a';
        std::cout << "s1 == '" << s1 << "'\ns2 == '" << s2 << "'\n";
        std::cout << BLURB( (s1 < s2) );
        std::cout << BLURB( (s1[0] < s2[0]) );
        std::cout << BLURB( std::char_traits<char>::lt(s1[0], s2[0]) );
        std::cout << BLURB( std::char_traits<char>::compare(s1.c_str(),
                                                            s2.c_str(), 1) );
        std::cout << BLURB( s1.compare(s2) );
        return 0;
    }

produces this output on my x86 Linux PC:

    s1 == 'à'
    s2 == 'a'
    (s1 < s2) == 0
    (s1[0] < s2[0]) == 1
    std::char_traits<char>::lt(s1[0], s2[0]) == 1
    std::char_traits<char>::compare(s1.c_str(), s2.c_str(), 1) == 1
    s1.compare(s2) == 1

The above results are counter-intuitive, but they agree with the
behaviour of the C standard library (1999 standard):
- strcmp and memcmp compare the bytes of their arguments as unsigned char,
  so that strcmp(s1.c_str(), s2.c_str()) > 0 (meaning s1 > s2)
- on my platform char is signed, so s1[0] < s2[0] (because s1[0] < 0)
On the other hand, Stroustrup's TC++PL, section 20.2.1 "Character traits",
states: "The compare() function uses lt() and eq() to compare characters."
So I'd expect

    std::char_traits<char>::lt(s1[0], s2[0])

and

    std::char_traits<char>::compare(s1.c_str(), s2.c_str(), 1)

to return consistent results, which they do not. As a consequence,
s1.compare(s2) is not consistent with std::char_traits<char>::lt()
either.
As far as I can see, the GNU libstdc++5 implementation of
std::char_traits<char>::compare() uses memcmp() instead of lt(), and so
inherits the unsigned char comparison, while std::char_traits<char>::lt()
simply uses '<' to compare its arguments, keeping them signed.
It's quite likely that I am missing something in Stroustrup's book.
Does anybody know what the standard mandates?
giuseppe