Stanley Rice said:
Oh yes, I got it! Thanks a lot. Now I want to know whether the
underlying representation (bit stream) of -1 in type int and type
unsigned are the same or not, supposing type int is 32 bits on my
machine. I guess the underlying representation of the value -1 is the
same, say,
1111 1111 1111 1111 1111 1111 1111 1111 (FFFFFFFF)
When we treat it as a signed int, it is decoded as a digit as int_val
= -1, and when we treat it as an unsigned int, it is decoded as a
digit as unsigned_val = 4294967295, could you tell me if my guess is
right or not?
The validity of the expression has (almost) nothing to do with how
signed and unsigned integers are represented. That's not to say that
you shouldn't be curious about the representation, but it's not directly
relevant here.
1 is an expression of type int, with the obvious value.
-1 is an expression consisting of the unary "-" operator applied to 1.
It's also of type int, also with the obvious value.
(unsigned)-1 is a cast expression; the "(unsigned)" part is a cast
operator. It takes the value of the expression -1 and converts it
from type int to type unsigned int ("unsigned int" and "unsigned"
are simply two names for the same type). As I mentioned, that
conversion is not defined in terms of how either int or unsigned
int is represented; it's defined entirely in terms of *values*.
The result of this particular conversion is defined in the C
standard, section 6.3.1.3, paragraph 2:
Otherwise, if the new type is unsigned, the value is converted
by repeatedly adding or subtracting one more than the maximum
value that can be represented in the new type until the value
is in the range of the new type.
The "maximum value that can be represented in the new type" is
UINT_MAX. The result of the conversion is computed, in this case,
by adding UINT_MAX+1 to -1, which yields UINT_MAX. In general,
the value -1 converted to any unsigned type yields the maximum
value of that unsigned type.
Then we right-shift the result by 1 bit (>> 1), yielding UINT_MAX/2
(truncated). On most systems, that's going to be the value of
INT_MAX (though, as Eric Sosman points out, it's not of the right
type, nor is it suitable for use as the system's definition of
INT_MAX). It's possible for INT_MAX *not* to equal (((unsigned)-1)>>
1), but that can happen only in the presence of padding bits,
which are rare in modern systems.
(On a typical 32-bit system, UINT_MAX is 2**32-1, or 0xffffffff,
or 4294967295, and INT_MAX is 2**31-1, or 0x7fffffff, or 2147483647;
other values are possible. The "**" denotes exponentiation; C doesn't
actually have a "**" operator.)
Now as it happens, the rule for converting signed int to unsigned int
works out very well for a typical 2's-complement system. The generated
code for the conversion doesn't really work by "repeatedly adding or
subtracting one more than the maximum value that can be represented in
the new type until the value is in the range of the new type". It does
the exact equivalent -- which happens to mean just copying or
reinterpreting the bits that make up the representation. An
implementation for a 1s'-complement or sign-and-magnitude system still
has to follow the same rule for conversion, even though it's less
convenient. The standard's definition of the conversion works entirely
in terms of values; it is defined that way because it works well with
the most common representation scheme. (And that's why the standard's
wording seems clumsy; it's describing something that
was originally about reinterpreting the representation, but doing so in
terms of mathematical values.)
If my assumption is right, then I am confused about the different
result of the following lines:
printf("%u\n", (unsigned)-1 >> 1); // 1
printf("%u\n", (unsigned)(-1 >> 1)); // 2
printf("%u\n", (unsigned)(-1 >> 2)); // 3
printf("%u\n", (unsigned)-1); // 4
Statement 1 prints 2147483647, which is what I want. But statements
2, 3, and 4 all print the same, 4294967295. That is what I am confused
about. Why do statement 1 and statement 2 generate different results? And
why do statements 2, 3, and 4 generate the same result?
If the left operand of a ">>" operator is negative, the result is
implementation-defined (C99 6.5.7p5). Shift operators are primarily
intended for use with unsigned, or at least non-negative, operands.
You could probably figure out why your lines 2 and 3 are producing
the results you're seeing, but I don't think the answer would be
very useful.
Your line 4 is printing the value of UINT_MAX, as explained above.