Balban said:
On my compiler (gcc), if I add an integer value to a void pointer the
integer is interpreted as signed instead of unsigned. Is this expected
behavior?
I don't think that's what's happening.
As has already been mentioned, arithmetic on void* is a gcc-specific
extension; in standard C, it's a constraint violation, requiring a
diagnostic.
But the same thing applies to arithmetic on char*, which is well
defined by the standard.
Adding a pointer and an integer (p + i) yields a new pointer value
that points i elements away from where p points. For example, if p
points to the element 0 of an array, then (p + 3) points to element 3
of the same array. If p points to element 7 of an array, then (p - 2)
points to element 5 of the same array.
It would have been helpful if you had shown us an example of what
you're talking about. But suppose we have:
char arr[10];
char *p = arr + 5;
int i = -1;
unsigned int u = -1;
Let's assume a typical system where int and pointers are 32 bits.
So p points to arr[5]. The expression (p + i) points to arr[4].
But consider (p + u).
Since u is unsigned, it can't actually hold the value -1. During
initialization, that value is implicitly converted from signed
int to unsigned int, and the value stored in u is 4294967295.
In theory, then, (p + u) would point to arr[4294967300], which
obviously doesn't exist. So the behavior is undefined; if you try
to evaluate (p + u), anything can happen.
What probably will happen on typical modern systems is that the
addition will quietly wrap around. Let's assume that pointer values
are represented as 32-bit addresses that look like unsigned integers
(nothing like this is required by the standard, but it's a typical
implementation), and let's say that arr is at address 0x12345678.
Then p points to address 0x1234567d, and (p + 4294967295) would
theoretically point to address 0x11234567c. But this would require 33
bits, and we only have 32-bit addresses. Typically, an overflowing
addition like this will quietly drop the high-order bit(s) yielding an
address of 0x1234567c -- which just happens to be the address of
arr[4].
So you initialized u with the value -1, computed (p + u), and
got the same result you would have gotten for (p + (-1)). But in
the process, you generated an intermediate result that was out of
range, resulting in undefined behavior. (This is really the worst
possible consequence of undefined behavior: having your program
behave exactly as you expected it to. It means your code is buggy,
but it's going to be very difficult to find and correct the problem.)
This kind of thing is very common with 2's-complement systems. The
2's-complement representation is designed in such a way that addition
and subtraction don't have to care whether the operands are signed or
unsigned. But you shouldn't depend on this. The behavior of addition
and subtraction operations, either on integers or on pointers, is well
defined only when the mathematical result is within the required
range. Adding 0xFFFFFFFF to a pointer can appear to work "correctly",
as if you had really added -1, but it's better to just add a signed
value -1 in the first place.
Even if your code never runs on anything other than the system you
wrote it for, an optimizing compiler may assume that no undefined
behavior occurs. For example, if you write (p + u), it can assume
that u is in the range 0 to 5 (the only values for which the addition
is defined, given that p points to arr[5] in a 10-element array), and
perform optimizations that depend on that assumption.