Kenneth Brody said:
Even if chars are signed?
On my system, this prints "-96":
#include <stdio.h>
int main(void)
{
    int i = '\xa0';
    printf("%d\n", i);
    return 0;
}
Yeah, mine too.
I didn't read far enough in the standard. C99 6.4.4.4p9 says:
Constraints
9 The value of an octal or hexadecimal escape sequence shall
be in the range of representable values for the type unsigned
char for an integer character constant, or the unsigned type
corresponding to wchar_t for a wide character constant.
I mistakenly took that to be a specification of the value of
a character constant containing an octal or hexadecimal escape
sequence, but it isn't. It's just the value of the escape sequence.
The value of the character constant is defined in paragraph 10,
under Semantics:
If an integer character constant contains a single character
or escape sequence, its value is the one that results when
an object with type char whose value is that of the single
character or escape sequence is converted to type int.
I still find the wording a bit shaky. In '\xa0', the value of the
escape sequence, as defined by paragraph 9, is 160. Given that plain
char is signed and CHAR_BIT==8, an object with type char *cannot*
have the value 160.
The standard seems to be assuming that the values 160 and -96 are
interchangeable when stored in a char object. Either I've missed
something else obvious (which is quite possible), or the standard
is playing fast and loose with signed and unsigned values.
I think the standard's accuracy would be improved by changing many
of its references to character values so they refer to the result
of converting those values to unsigned char. Quite a few passages
would benefit from this change, including the description of strchr().
(Mandating that plain char is unsigned would also simplify things,
but that's probably not feasible.)
I've been posting on this thread asserting that '\xa0' is 160.
I apologize for the unintentional misinformation.