Performance of signed vs unsigned types

K

Keith Thompson

Thomas Scrace said:
Well it turned out that you are correct. I have discussed this in previous
posts, but, briefly, I had a bug that caused the function to give me the
result I was looking for, but using a method I was not expecting.

The reason was that I was initiating a loop by assinging -1 to an unsigned type.
This meant that the variable immediately had the max value, and the loop exited
without looping through all the possible values of the unsigned type (which is
what I thought it was doing).

From my perspective this meant that the program was looping through a large
number of values extremely quickly in comparison with the signed version of the
function, which was extremely slow (it actually *was* looping through all the
values).

I have posted the full version of the program I ended up with in a previous
message if you are interested.

Thanks for the information - it was very interesting.

This program makes some non-portable assumptions, but it should work as
expected on any system you're likely to encounter.

#include <stdio.h>

int main(void)
{
const unsigned char uchar_max = -1;
const signed char schar_max = uchar_max / 2u;
const unsigned short ushrt_max = -1;
const signed short shrt_max = ushrt_max / 2u;
const unsigned int uint_max = -1;
const signed int int_max = uint_max / 2u;
const unsigned long ulong_max = -1;
const signed long long_max = ulong_max / 2ul;
const unsigned long long ullong_max = -1;
const signed long long llong_max = ullong_max / 2ull;

printf("schar_max = %20d\n", schar_max);
printf("uchar_max = %20u\n", (unsigned)uchar_max);
printf("shrt_max = %20d\n", shrt_max);
printf("ushrt_max = %20u\n", (unsigned)ushrt_max);
printf("int_max = %20d\n", int_max);
printf("uint_max = %20u\n", uint_max);
printf("long_max = %20ld\n", long_max);
printf("ulong_max = %20lu\n", ulong_max);
printf("llong_max = %20lld\n", llong_max);
printf("ullong_max = %20llu\n", ullong_max);

return 0;
}

It's reliable for the unsigned types; -1 converted to any unsigned type
always yields the maximum value of that type.

For the signed types, it simply assumes that the maximum value is half
the maximum value of the corresponding signed type (with truncating
division); for example, if USHRT_MAX is 65535, it assumes SHRT_MAX is
32767. This is likely to be true even on systems that don't use
2's-complement. It will break down only if corresponding signed and
unsigned types have different numbers of padding bits. (Most or all
modern systems have no padding bits.)

Computing the minimum value for signed types is less straightforward.
It's likely to be either -FOO_MAX-1 (2's-complement) or -FOO_MAX
(1s'-complement or sign-and-magnitude), and you can (99%-reliably)
determine which representation is used by type-punning (e.g., set
the signed member of a union to -1 and see what value the unsigned
value takes on). This will fail if a system uses 2's-complement
but chooses to make the minimum representable value a trap
representation.

Of course it's easier just to use <limits.h>, but perhaps this kind of
technique could be useful for someone *implementing* <limits.h>.
 
M

Michael Press

James Kuyper said:
That's one example of precisely what I was talking about. As soon as x >
INT_MAX - step, the expression x+step has undefined behavior; as soon as
step > INT_MAX/2, the expression 2*step has undefined behavior. The
above code seems to rely upon the unportable assumption that the actual
behavior in each of those cases will be to produce a result that is
negative. That's very common behavior, but it's not guaranteed by the
standard. That's what I meant by "serious portability issues".

Does not assume it goes negative.
 
J

James Kuyper

James Kuyper said:
Does not assume it goes negative.

By assuming that adding two positive ints, or multiplying a positive int
by 2, will produce a negative result when the operation overflows, I can
see why the above program might be expected to produce the intended
result. It seemed a natural assumption, because it happens to be true on
the implementations I'm familiar with. However, I suppose that's not the
only assumption that would make it work.

However, until an overflow occurs, x keeps increasing by step, and step
keeps doubling, so overflow is guaranteed to occur during the
calculation of y sooner or later. This code is apparently expected to
produce a useful result, which implies that some assumption, of some
kind, was made about the value of either (y=x+step)>x or (y=2*step)>step
when that overflow occurs. No such assumption, regardless of what that
assumption was, can be justified by reference only to the C standard. To
put it another way, a fully conforming implementation of C can violate
the assumption, no matter what that assumption was.
 
J

James Kuyper

James Kuyper said:
No, it does not.

Well, that's what I see on both my home machine (AMD64 Ubuntu Linux) and
my work machine (SGI IRIX) when I instrument the else clause with a
"printf("%d\n", y)". If something else happens on the machines that
you're familiar with, that's perfectly conforming, too - which is
precisely my point.

Your responses are just a little too laconic. Could you expand a little
on what it is you're trying to communicate?
 
M

Michael Press

James Kuyper said:
Well, that's what I see on both my home machine (AMD64 Ubuntu Linux) and
my work machine (SGI IRIX) when I instrument the else clause with a
"printf("%d\n", y)". If something else happens on the machines that
you're familiar with, that's perfectly conforming, too - which is
precisely my point.

Your responses are just a little too laconic. Could you expand a little
on what it is you're trying to communicate?

Not until you prove or retract your claim that I and my code
assume addition or multiplication on int produces negative results.
 
J

James Kuyper

Not until you prove or retract your claim that I and my code
assume addition or multiplication on int produces negative results.

I wouldn't call it a "claim" as such, when it was explicitly qualified
by the phrase "seems to", explicitly acknowledging that what seems to be
the case, might not be. A "guess" might be a more appropriate way of
describing it. I can't imagine how I could possibly prove anything about
what your motivations were while writing that code, and I have no
intention of trying.

It was not my intention to suggest any certainty about why you thought
that code should work. I said as much, in words you chose to clip:
However, I suppose that's not the only assumption that would make it work.

My comment that:
If something else happens on the machines that you're familiar with, that's perfectly conforming, too

was also intended to concede that you might have relied upon some other
assumption. Since you seem offended by the idea that you were relying on
that particular assumption, I'll happily retract my guesses as to what
you were thinking when you wrote that code. However, that leaves me with
no idea why you thought it would work. Would you care to explain what
assumptions you were making about the results of signed integer overflow
when you wrote that code?

It did work, on the machines where I tested it, and it worked precisely
because x+step did produce a negative value.
 
M

Michael Press

James Kuyper said:
I wouldn't call it a "claim" as such,

We disagree.
when it was explicitly qualified
by the phrase "seems to", explicitly acknowledging that what seems to be
the case, might not be. A "guess" might be a more appropriate way of
describing it. I can't imagine how I could possibly prove anything about
what your motivations were while writing that code, and I have no
intention of trying.

It was not my intention to suggest any certainty about why you thought
that code should work. I said as much, in words you chose to clip:

Since I never wrote something such as if( x < 0), there are no grounds
for assuming I wrote about negatives or overflow.
My comment that:

was also intended to concede that you might have relied upon some other
assumption. Since you seem offended by the idea that you were relying on
that particular assumption, I'll happily retract my guesses as to what
you were thinking when you wrote that code. However, that leaves me with
no idea why you thought it would work. Would you care to explain what
assumptions you were making about the results of signed integer overflow
when you wrote that code?

It did work, on the machines where I tested it, and it worked precisely
because x+step did produce a negative value.

Here is a list of assumptions. They are as complete and consistent
as I could make them in the time spent; but very likely reducible.

Assumptions.

* There is a set called int.
* int is finite.
* There are binary operators {+, *} on int.
* int is closed under {+, *}.
* + is commutative
* * is commutative
* Multiplication is distributive over addition.
* There is an additive identity 0
* There is a multiplicative identity 1.
* There is an order relation < on int.
* The relation < is transitive.
* If a,b in int, then exactly one of a < b, b < a, a = b obtains.
* 0 < 1.
* If a < b, then a < a + 1 and there is no x such that a < x < a + 1.
* Addition is repeated addition of 1.
- That is if 0 < b, then there is a positive integer (not int!), n, such that
b = 1 + 1 + ... + 1 (n times).
- If 0 < a + b, then a + b = a + [1 + 1 + 1 + ... + 1 (n times)].
* Multiplication is repeated addition.

I could simply assume there is a maximum element M,
since that is an assumption of the C programming language,
but it is a consequence of the assumptions as well.

I used division in the code, but that can be replaced
by keeping a list of the step sizes used, and reverting
to previously used step sizes.
 
I

Ike Naar

[...] However, that leaves me with
no idea why you thought it would work. Would you care to explain what
assumptions you were making about the results of signed integer overflow
when you wrote that code?

It did work, on the machines where I tested it, and it worked precisely
because x+step did produce a negative value.

Perhaps he's hinting at saturating arithmetic.
 
T

Thomas Scrace

I wouldn't call it a "claim" as such,

We disagree.
when it was explicitly qualified
by the phrase "seems to", explicitly acknowledging that what seems to be
the case, might not be. A "guess" might be a more appropriate way of
describing it. I can't imagine how I could possibly prove anything about
what your motivations were while writing that code, and I have no
intention of trying.

It was not my intention to suggest any certainty about why you thought
that code should work. I said as much, in words you chose to clip:

Since I never wrote something such as if( x < 0), there are no grounds
for assuming I wrote about negatives or overflow.
My comment that:

was also intended to concede that you might have relied upon some other
assumption. Since you seem offended by the idea that you were relying on
that particular assumption, I'll happily retract my guesses as to what
you were thinking when you wrote that code. However, that leaves me with
no idea why you thought it would work. Would you care to explain what
assumptions you were making about the results of signed integer overflow
when you wrote that code?

It did work, on the machines where I tested it, and it worked precisely
because x+step did produce a negative value.

Here is a list of assumptions. They are as complete and consistent
as I could make them in the time spent; but very likely reducible.

Assumptions.

* There is a set called int.
* int is finite.
* There are binary operators {+, *} on int.
* int is closed under {+, *}.
* + is commutative
* * is commutative
* Multiplication is distributive over addition.
* There is an additive identity 0
* There is a multiplicative identity 1.
* There is an order relation < on int.
* The relation < is transitive.
* If a,b in int, then exactly one of a < b, b < a, a = b obtains.
* 0 < 1.
* If a < b, then a < a + 1 and there is no x such that a < x < a + 1.
* Addition is repeated addition of 1.
- That is if 0 < b, then there is a positive integer (not int!), n, such that
b = 1 + 1 + ... + 1 (n times).
- If 0 < a + b, then a + b = a + [1 + 1 + 1 + ... + 1 (n times)].
* Multiplication is repeated addition.

Those axioms seem awfully familiar to me somehow...
 
J

James Kuyper

....

Since I never wrote something such as if( x < 0), there are no grounds
for assuming I wrote about negatives or overflow.

Yes, there were grounds for assuming that you were writing about
overflow, and more than adequate ones: your code inherently produces an
overflow. I assumed that you were aware of that fact and were, in fact
relying upon the behavior that you expected to occur when it overflowed.
If you wrote this code without realizing that overflow would occur, I
apologize for making unwarranted assumptions about your competence.

I never claimed that you wrote anything about x<0. On most
implementations I'm familiar with, x will remain positive throughout the
run of that program. It's y which will end up negative as soon as the
relevant expression overflows on those implementations, though that's
not guaranteed by the standard.

....
Here is a list of assumptions. They are as complete and consistent
as I could make them in the time spent; but very likely reducible.

Assumptions.

The assumptions that you list seem very familiar. About the only things
you've left out are the existence of an inverse element and
associativity for addition.

However, completeness was unnecessary; a more specific answer would have
been a lot more responsive to my question. What I was really looking for
was some idea of what you expections you had about the result of x+step
when the mathematical result of adding those two number would be greater
than INT_MAX, a situation your program is guaranteed to create.

Let me be a lot more specific, let's consider the case where x+step is
about to be calculated with x and step both equal to the largest power
of 2 less than or equal to INT_MAX, a situation your program is
guaranteed to create. Do you think that there are any restrictions, of
any kind, on the value of x+step at that point? If so, what do you think
those restrictions are?

Typical 2's complement implementations of 'int' obey all of those
assumptions, but the C standard does not mandate them. In particular,
none of the following assumptions is required to be true for 'int'
arithmetic :
* + is commutative
* * is commutative
* Multiplication is distributive over addition. ....
* Addition is repeated addition of 1.
- That is if 0 < b, then there is a positive integer (not int!), n, such that
b = 1 + 1 + ... + 1 (n times).
- If 0 < a + b, then a + b = a + [1 + 1 + 1 + ... + 1 (n times)].
* Multiplication is repeated addition.

In every case where a+b or a*b overflows, the behavior is undefined
according to the C standard, which means that the results of those
operations are allowed to violate each and every one of the above
assumptions. C does not guarantee that any of the following are true:

INT_MAX + 1 == 1 + INT_MAX
INT_MIN * 2 == 2 * INT_MAX
INT_MAX*(1+2) == INT_MAX*1 + INT_MAX*2
INT_MAX + 1 + 1 == INT_MAX + (1 + 1)
INT_MIN * 2 == INT_MIN + INT_MIN

I make explicit use of INT_MAX and INT_MIN only in order to portably
create values for which the relevant expressions are guaranteed to
overflow; the key issue is the overflow itself, not the macros I relied
upon to ensure it.

A fully conforming implementation could return, as the result for any
int expression that overflows, the current value stored in register 7,
which might happen to contain the machine address of 'x'. I suppose that
your program could produce the expected result on such an
implementation, but it's pretty unlikely to do so.
 
J

James Kuyper

[...] However, that leaves me with
no idea why you thought it would work. Would you care to explain what
assumptions you were making about the results of signed integer overflow
when you wrote that code?

It did work, on the machines where I tested it, and it worked precisely
because x+step did produce a negative value.

Perhaps he's hinting at saturating arithmetic.

He's stated that he's assuming that "Multiplication is distributive
over addition". Saturating arithmetic violates that assumption:

int x = 2;
int y = INT_MAX - 1;
int z = INT_MIN + 3;

The expression x*(y+z) has the value of 2 * (INT_MAX + INT_MIN + 2), but
under saturating arithmetic, x*y + x*z has the value INT_MAX + INT_MIN,
which is generally not the same value.
 
M

madamemandm

[snip]
Assumptions.

* There is a set called int.
* int is finite.
* There are binary operators {+, *} on int.
* int is closed under {+, *}.

signed ints aren't gaurenteed to be closed under {+, *}.

[snip]

Martin Shobe
 
K

Keith Thompson

Noob said:
What is the formal definition of being closed under {+, *} ?

A set S (in this case, the set of int values) is closed under an
operation "op" if, for every x and y in the set S, the result of
x "op" y is in S.

Since the behavior of int overflow is undefined (i.e., not defined
by the standard), it's possible that evaluating x + y, where the
mathematical sum of x and y is less that INT_MIN or greater than
INT_MAX, does not yield a value of type int. For example, it can
crash the program and therefore not yield any result at all.
 
K

Keith Thompson

Michael Press said:
No, it does not.

It certainly makes *some* assumption(s) not guaranteed by the standard.

Specifically, it appears to assume that, where step is a positive
int value, evaluating (x + step) will yield a value less than x if
and only if the mathematical result of (x + step) exceeds INT_MAX.

Again, as James wrote in text that you snipped, there are probably
other sets of assumptions that would also make the program work.
But there is no set of assumptions that make the program work that
cannot be violated by a conforming implementation. In other words,
there can be conforming C implementation on which your program does
*not* print "Maxmum int = " followed by the correct value of INT_MAX.

For example, if signed integer overflow causes the program to terminate,
then your program won't print anything.

Another example: Suppose a CPU provides 64-bit integer registers,
with addition and comparison working correctly for all 64-bit values,
but with multiplication and division working correctly only for
up to, say, 48-bit values. The C implementation chooses to store
ints in 64 bits, but to make INT_MAX 2**47-1 -- i.e., an int has
16 padding bits. It doesn't generate additional code to check for
overflow, since the behavior is undefined anyway. For operands and
results in the range INT_MIN..INT_MAX, all operations work correctly.
But the conditions
INT_MAX + 1 > INT_MAX
and
INT_MAX + INT_MAX > INT_MAX
will both be true.

The above is based on my very faulty recollection of the Cray T90 and
similar machines. It's probably incorrect in some significant ways,
but it is, I believe, a conforming hypothetical implementation. As I
recall, long was also stored in 64 bits, but without the padding
bits, making multiplication and division more expensive because
the compiler had to generate extra code to get correct results.

In general, without either access to <limits.h> or assumptions
about operations whose behavior is undefined, I don't believe it's
possible to determine the correct value for INT_MAX in portable C.

A contrived example: You could take a conforming implementation
with, say, 32-bit 2's-complement ints with no padding bits or
trap representations, and INT_MAX==2**31-1, and create a second
implementation that's identical *except* that INT_MAX==2**30-1 and
INT_MIN==-2**30. Arithmetic on int values would continue to work
exactly the same way, but the implementation woud simply choose
not to define the behavior of any int operations that yield values
exceeding the new tighter bounds.
 
M

madamemandm

Michael Press said:
On 04/21/2011 04:54 PM, Michael Press wrote:
[snip]

A contrived example: You could take a conforming implementation
with, say, 32-bit 2's-complement ints with no padding bits or
trap representations, and INT_MAX==2**31-1, and create a second
implementation that's identical *except* that INT_MAX==2**30-1 and
INT_MIN==-2**30.  Arithmetic on int values would continue to work
exactly the same way, but the implementation woud simply choose
not to define the behavior of any int operations that yield values
exceeding the new tighter bounds.

We don't even need to come up with contrived representations for
signed int to have it fail in some implementations. A compiler could
notice that the value of step is either >= 0 or undefined. This would
mean that x + step > x is false only if step = 0, which can't happen
because of the condition in the for loop. Similarly, the test for 2 *
step > step would also always succeed. The compiler then converts it
into something more like this...

int main (void)
{
int x = 1;
int y;
int step = 1;

while (step != 0)
{
x += step;
step *= 2;
}

printf("Maximum int = %d = %#x.\n", x, x);
return 0;
}

At this point, the compiler notices that step > 0 or undefined and the
final result is.

int main(void)
{
while (1)
;

return 0;
}

Martin Shobe
 
B

Ben Bacarisse

Keith Thompson said:
It certainly makes *some* assumption(s) not guaranteed by the standard.

Specifically, it appears to assume that, where step is a positive
int value, evaluating (x + step) will yield a value less than x if
and only if the mathematical result of (x + step) exceeds INT_MAX.

s/less than/less then or equal to/. It would be a nit-pick except that
I think the possibility of integer overflow latching on INT_MAX was
consciously included to reduce the assumptions being made.

<snip>
 
M

Michael Press

[snip]
Assumptions.

* There is a set called int.
* int is finite.
* There are binary operators {+, *} on int.
* int is closed under {+, *}.

signed ints aren't gaurenteed to be closed under {+, *}.

[snip]

So the program can bomb.
Is it time to fix the C standard to make int closed under {+,*}.
 
M

Michael Press

James Kuyper said:
Yes, there were grounds for assuming that you were writing about
overflow, and more than adequate ones: your code inherently produces an
overflow. I assumed that you were aware of that fact and were, in fact
relying upon the behavior that you expected to occur when it overflowed.
If you wrote this code without realizing that overflow would occur, I
apologize for making unwarranted assumptions about your competence.

I am aware of overflow, but simply ignored it.
I never claimed that you wrote anything about x<0. On most
implementations I'm familiar with, x will remain positive throughout the
run of that program. It's y which will end up negative as soon as the
relevant expression overflows on those implementations, though that's
not guaranteed by the standard.

...

The assumptions that you list seem very familiar. About the only things
you've left out are the existence of an inverse element and
associativity for addition.

Yes associativity of additions and multiplication.
An additive inverse is not guaranteed in 2's complement arithmetic.
However, completeness was unnecessary; a more specific answer would have
been a lot more responsive to my question. What I was really looking for
was some idea of what you expections you had about the result of x+step
when the mathematical result of adding those two number would be greater
than INT_MAX, a situation your program is guaranteed to create.

Let me be a lot more specific, let's consider the case where x+step is
about to be calculated with x and step both equal to the largest power
of 2 less than or equal to INT_MAX, a situation your program is
guaranteed to create. Do you think that there are any restrictions, of
any kind, on the value of x+step at that point? If so, what do you think
those restrictions are?

No and I do not know.
Typical 2's complement implementations of 'int' obey all of those
assumptions, but the C standard does not mandate them. In particular,
none of the following assumptions is required to be true for 'int'
arithmetic :
* + is commutative
* * is commutative
* Multiplication is distributive over addition. ...
* Addition is repeated addition of 1.
- That is if 0 < b, then there is a positive integer (not int!), n, such that
b = 1 + 1 + ... + 1 (n times).
- If 0 < a + b, then a + b = a + [1 + 1 + 1 + ... + 1 (n times)].
* Multiplication is repeated addition.

In every case where a+b or a*b overflows, the behavior is undefined
according to the C standard, which means that the results of those
operations are allowed to violate each and every one of the above
assumptions. C does not guarantee that any of the following are true:

INT_MAX + 1 == 1 + INT_MAX
INT_MIN * 2 == 2 * INT_MAX
INT_MAX*(1+2) == INT_MAX*1 + INT_MAX*2
INT_MAX + 1 + 1 == INT_MAX + (1 + 1)
INT_MIN * 2 == INT_MIN + INT_MIN

I make explicit use of INT_MAX and INT_MIN only in order to portably
create values for which the relevant expressions are guaranteed to
overflow; the key issue is the overflow itself, not the macros I relied
upon to ensure it.

A fully conforming implementation could return, as the result for any
int expression that overflows, the current value stored in register 7,
which might happen to contain the machine address of 'x'. I suppose that
your program could produce the expected result on such an
implementation, but it's pretty unlikely to do so.

Basically I assume.
addition and multiplication are closed on int,
and
int is totally ordered.

As pointed out elsewhere, addition and multiplication are not closed.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,743
Messages
2,569,478
Members
44,899
Latest member
RodneyMcAu

Latest Threads

Top