Unsigned integer wrapping on overflow

joe

Wouldn't it be better if unsigned integers saturated upon overflow
instead of wrapping? I'm thinking from an overflow detection point of
view and using "overflow" to mean either overflow or underflow. Detection
could be done in a called function then if desired, if one is willing to
give up the maximum unsigned value for a given width integer in select
cases, which sounds very reasonable to me.
 
Öö Tiib

joe said:
Wouldn't it be better if unsigned integers saturated upon overflow
instead of wrapping? I'm thinking from an overflow detection point of
view and using "overflow" to mean either overflow or underflow. Detection
could be done in a called function then if desired, if one is willing to
give up the maximum unsigned value for a given width integer in select
cases, which sounds very reasonable to me.

You are not forced to use the arithmetic provided by the language. You
can easily make your own numeric classes and their operations. It is
also simple to throw together a static code analysis tool that warns on
usage of any language-provided operations outside of such wrappers. I
do not suggest it ... but from that viewpoint C++ is easy to use as a
sort of language construction kit.

Then you can use one or several built-in values as "invalid value",
"missing value", "overflow value", "underflow value", etc.

You may also add units to your values, so that 2 yards multiplied by
2 yards gives 4 square yards, and 1 meter minus 1 yard gives 0.0856
meters. More likely than not there is already some library that does
this.
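
A bare-bones sketch of the units idea (purely illustrative; a real
library such as Boost.Units does this properly):

// Tag each value with its length dimension; stored in meters^D.
template<int D>
struct quantity { double value; };

template<int A, int B>
quantity<A + B> operator*(quantity<A> x, quantity<B> y) {
    return { x.value * y.value };   // dimensions add when multiplying
}

template<int A>
quantity<A> operator-(quantity<A> x, quantity<A> y) {
    return { x.value - y.value };   // only like dimensions subtract
}

constexpr quantity<1> meters(double v) { return { v }; }
constexpr quantity<1> yards(double v)  { return { v * 0.9144 }; }

// yards(2) * yards(2)  -> quantity<2>: 4 square yards (as m^2)
// meters(1) - yards(1) -> quantity<1>: about 0.0856 m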
 
Sprechen sie von C++

joe said:
Wouldn't it be better if unsigned integers saturated upon overflow instead
of wrapping? I'm thinking from an overflow detection point of view and
using "overflow" to mean either overflow or underflow. Detection could be
done in a called function then if desired, if one is willing to give up
the maximum unsigned value for a given width integer in select cases,
which sounds very reasonable to me.

The basic types map directly to CPU instructions. The CPU supports
wraparound to allow for multi-precision calculations, and that is why
it rolls over. When that happens the CPU sets the overflow bit so that
code can increment the higher-order word.
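
Even without access to the flag, the wraparound is what lets plain C++
recover the carry by comparison; a minimal sketch:

#include <cstdint>

// Add two 128-bit numbers held as (lo, hi) pairs of 64-bit words.
void add128(std::uint64_t a_lo, std::uint64_t a_hi,
            std::uint64_t b_lo, std::uint64_t b_hi,
            std::uint64_t& r_lo, std::uint64_t& r_hi)
{
    r_lo = a_lo + b_lo;                 // wraps modulo 2^64 on overflow
    std::uint64_t carry = r_lo < a_lo;  // it wrapped iff the sum shrank
    r_hi = a_hi + b_hi + carry;         // increment the higher-order word
}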
 
joe

Sprechen said:
The basic types map directly to CPU instructions. The CPU supports
wraparound to allow for multi-precision calculations, and that is why
it rolls over.

But the compiler can interrogate the overflow bit and do what it wants,
yes? What the performance hit would be I don't know.
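
Something along these lines is what I have in mind; note that
__builtin_add_overflow is a GCC/Clang extension, not standard C++, and
it typically compiles to the add plus a test of that very flag:

#include <cstdint>
#include <cstdio>

int main() {
    std::uint32_t a = 4000000000u, b = 1000000000u, sum;
    if (__builtin_add_overflow(a, b, &sum)) {   // did the add wrap?
        std::puts("overflow detected");
    } else {
        std::printf("sum = %u\n", sum);
    }
}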
 
joe

Öö Tiib said:
You are not forced to use arithmetic provided by language. You can
easily make your own numeric classes and their operations.

Like MS's SafeInt, I know. But if it were built in at the low level,
there wouldn't be a need for those kinds of things. If people are doing
it in programmer-land, then surely the compiler could do it more
efficiently? And it doesn't have to be one way or the other, a new set
of safe primitive integers could be added for those who want to use
them.

Öö Tiib said:
It is also simple to throw together a static code analysis tool that
warns on usage of any language-provided operations outside of such
wrappers. I do not suggest it ... but from that viewpoint C++ is easy
to use as a sort of language construction kit.

Then you can use one or several built-in values as "invalid value",
"missing value", "overflow value", "underflow value", etc.

At a pretty big performance hit no doubt (considering that integers are
the most primitive things).
 
Öö Tiib

joe said:
Like MS's SafeInt, I know. But if it were built in at the low level,
there wouldn't be a need for those kinds of things. If people are doing
it in programmer-land, then surely the compiler could do it more
efficiently? And it doesn't have to be one way or the other, a new set
of safe primitive integers could be added for those who want to use
them.

joe said:
At a pretty big performance hit no doubt (considering that integers are
the most primitive things).

Integer arithmetic is cheap. Most int values in software should be
considered overflown far below 2 billion. One int variable may be
logically overflowing as low as 300, another at 2000, depending on what
it means. A red-black tree of depth 300 probably does not fit on the
internet. ;)
 
tni

IMO, well-defined overflow (that you get with unsigned) is at least as
useful as saturation. But an additional mechanism that saturates or
throws on overflow would certainly be extremely useful.

Sprechen said:
The basic types map directly to CPU instructions. The CPU supports
wraparound to allow for multi-precision calculations, and that is why
it rolls over.

But the funny thing is, you can't even implement that efficiently in
that so-called systems programming language C (and C++).

Sprechen said:
When that happens the CPU sets the overflow bit so that code can
increment the higher-order word.

Given that pretty much every CPU out there does have an overflow flag,
you should really have some way to access it.
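
The nearest thing today is a vendor intrinsic rather than the language
itself; a sketch assuming x86 and a compiler that ships <immintrin.h>:

#include <cstdint>
#include <immintrin.h>

// 128-bit add using the hardware carry flag via _addcarry_u64.
void add128_x86(const std::uint64_t a[2], const std::uint64_t b[2],
                std::uint64_t out[2])
{
    unsigned long long lo, hi;
    unsigned char c = _addcarry_u64(0, a[0], b[0], &lo);  // carry out
    _addcarry_u64(c, a[1], b[1], &hi);                    // carry in
    out[0] = lo;
    out[1] = hi;
}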
 
Öö Tiib

Until you box them.

Your overflow handling does not add any less overhead at the
machine-code level. Additionally, in the common case that is
unnecessary overhead, because variables that are in danger of
overflowing their physical limits are of too small a type anyway. The
usual int variable stays within smaller limits than int32_t or even
int16_t allows. Additionally, the platforms where int16_t and int are
the same size are becoming more and more exotic each year. Some people
find that an unsigned variable's underflow limit is too near its
actual lower limit, and so even advocate against using unsigned types.
Handling the real limits gives a lot more bang for the buck.
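
A minimal sketch of what I mean by handling the real limits
(bounded_int is a name I make up here, not an existing type):

#include <stdexcept>

// Integer that enforces its *logical* limits, which are usually far
// below what the physical type could hold.
template<int Lo, int Hi>
class bounded_int {
    int v;
public:
    explicit bounded_int(int x) : v(x) {
        if (x < Lo || x > Hi)
            throw std::out_of_range("outside logical limits");
    }
    int get() const { return v; }
};

// No real red-black tree reaches depth 300, so for example:
using rb_tree_depth = bounded_int<0, 300>;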
 
Juha Nieminen

joe said:
But the compiler can interrogate the overflow bit and do what it wants,
yes? What the performance hit would be I don't know.

The performance hit would be quite drastic. Every single operation you
do with an integer would have to be followed by a conditional jump, which
the CPU might or might not guess correctly (if it doesn't, a really severe
penalty follows).
 
Juha Nieminen

joe said:
Wouldn't it be better if unsigned integers saturated upon overflow
instead of wrapping?

Would make implementing a linear congruential generator harder.
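
An LCG depends on exactly that mod-2^32 wrap; with saturation the
multiply would stick at the maximum instead of cycling (the constants
are the well-known Numerical Recipes ones):

#include <cstdint>

std::uint32_t lcg_next(std::uint32_t state) {
    return state * 1664525u + 1013904223u;  // wraps modulo 2^32
}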

What would be possible is that if the operation overflows, the upper
bits go to a second variable. This could even be made optional in the
language (in other words, when you declare the integral variable, you
use some special keyword or syntax which tells the compiler that this
is the kind of variable which, on overflow, should spill the upper bits
to that other variable).
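
Written out by hand with a widened intermediate, the compiler would
generate something like this for such a variable (the names are mine):

#include <cstdint>

void add_with_spill(std::uint32_t a, std::uint32_t b,
                    std::uint32_t& result, std::uint32_t& spill)
{
    std::uint64_t wide = std::uint64_t(a) + b;  // can't overflow in 64 bits
    result = std::uint32_t(wide);               // low 32 bits as usual
    spill  = std::uint32_t(wide >> 32);         // upper bits land here
}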
 
tni

Juha said:
The performance hit would be quite drastic. Every single operation you
do with an integer would have to be followed by a conditional jump, which
the CPU might or might not guess correctly (if it doesn't, a really severe
penalty follows).

Not really. Generally (if there is no branch prediction data), backward
jumps are predicted as taken, forward jumps as not taken. By structuring
the code right (overflow should be very rare), there should be very
little penalty. Somebody actually implemented that (IIRC on x86) and the
performance penalty wasn't bad (a couple %).
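
Structured so that the overflow check is a forward, normally-not-taken
branch; __builtin_expect (a GCC/Clang extension) makes the intent
explicit:

#include <cstdint>

void handle_overflow();  // assumed out-of-line cold path

std::uint32_t checked_add(std::uint32_t a, std::uint32_t b) {
    std::uint32_t r = a + b;
    if (__builtin_expect(r < a, 0))  // overflow is the rare forward path
        handle_overflow();
    return r;
}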
 
joe

Öö Tiib said:
Your overflow handling does not add any less overhead at the
machine-code level. Additionally, in the common case that is
unnecessary overhead, because variables that are in danger of
overflowing their physical limits are of too small a type anyway. The
usual int variable stays within smaller limits than int32_t or even
int16_t allows. Additionally, the platforms where int16_t and int are
the same size are becoming more and more exotic each year. Some people
find that an unsigned variable's underflow limit is too near its
actual lower limit, and so even advocate against using unsigned types.
Handling the real limits gives a lot more bang for the buck.

That may make more sense for certain domain value types but for the
general concept of integer, I don't think so. The scenario I was thinking
of was something like malloc which will accept any unsigned integer. Well
how can malloc know if the caller really wanted 4+ billion bytes or if a
negative number was passed? If integers saturated instead of wrapped,
malloc could disallow the value 1 less than the maximum unsigned 32-bit
int (on a 32-bit platform) and use the maximum value as a detection of
overflow/underflow.
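
To make that concrete, a sketch of the idea (sat_mul and checked_malloc
are names I am inventing for illustration, not real APIs):

#include <cstdlib>
#include <cstdint>

// Saturating size computation: the maximum value means "overflowed".
std::size_t sat_mul(std::size_t n, std::size_t elem) {
    if (elem != 0 && n > SIZE_MAX / elem)
        return SIZE_MAX;            // saturate instead of wrapping
    return n * elem;
}

// An allocator that treats the saturated maximum as the overflow signal.
void* checked_malloc(std::size_t size) {
    if (size == SIZE_MAX)
        return nullptr;             // the caller's arithmetic overflowed
    return std::malloc(size);
}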
 
joe

tni said:
Not really. Generally (if there is no branch prediction data),
backward jumps are predicted as taken, forward jumps as not taken. By
structuring the code right (overflow should be very rare), there
should be very little penalty. Somebody actually implemented that
(IIRC on x86) and the performance penalty wasn't bad (a couple %).

Intriguing. I want, I want. OK, it's a contest then (are you listening
GCC and VC++?): first one to implement the "alternative set of integers"
(aka, "safe integers") extension, wins! :) Actually, I think I like the
VS IDE too much to jump to GCC just for that.
 
joe

tni said:
IMO, well-defined overflow (that you get with unsigned) is at least as
useful as saturation.

(From another post I just made:)
"The scenario I was thinking
of was something like malloc which will accept any unsigned integer. Well
how can malloc know if the caller really wanted 4+ billion bytes or if a
negative number was passed? If integers saturated instead of wrapped,
malloc could disallow the value 1 less than the maximum unsigned 32-bit
int (on a 32-bit platform) and use the maximum value as a detection of
overflow/underflow."

Now, in that context, please explain how wrapping "is at least as useful
as saturation".

tni said:
But an additional mechanism that saturates or throws on overflow would
certainly be extremely useful.

I was thinking of an alternative set of integers. (I have only been thinking
about unsigned integers in this whole thread).
tni said:
But the funny thing is, you can't even implement that efficiently in
that so-called systems programming language C (and C++).

Given that pretty much every CPU out there does have an overflow flag,
you should really have some way to access it.

It would, then/now, appear that there are many possibilities. Just the
saturating characteristic would make me a happy camper though (not that I
know all of the ramifications of what I am asking for).
 
joe

Juha said:
Would make implementing a linear congruential generator harder.

Again, I suggested that it not be either/or, but rather both: an
additional alternative set of integers available to the programmer to use
if he wanted to. Mix-n-match as you wish.

Juha said:
What would be possible is that if the operation overflows, the upper
bits go to a second variable. This could even be made optional in the
language (in other words, when you declare the integral variable, you
use some special keyword or syntax which tells the compiler that this
is the kind of variable which, on overflow, should spill the upper
bits to that other variable).

Sounds less than transparent.
 
James Kanze

joe said:
Wouldn't it be better if unsigned integers saturated upon
overflow instead of wrapping?

It would be "better" (for some suitable definition of better) if
all types triggered some sort of hardware trap on overflow. (The
"undefined behavior" of signed integral types and floating point
types on overflow is designed to allow that.) Practically
speaking, however, this would result in significantly slower
programs on most platforms (or would require significantly more
intelligent compilers to get close to current performance).

Why undefined behavior is not the case for unsigned integral
types is somewhat of a mystery. I'm guessing that it is due to
some pre-standard code counting on wrapping.
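
GCC, for one, already exploits that latitude if you ask it to; a small
illustration (the flag is a GCC option, not anything the standard
requires):

//   g++ -ftrapv prog.cpp    (aborts at runtime on signed overflow)

#include <climits>

int main() {
    int x = INT_MAX;
    ++x;        // overflows: undefined behavior, aborts under -ftrapv
    return x;   // keep x live so the increment is not optimized away
}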
 
James Kanze

Sprechen said:
The basic types map directly to CPU instructions. The CPU supports
wraparound to allow for multi-precision calculations, and that is why
it rolls over.

Whether they map directly to CPU instructions or not depends on
the implementation. I know of at least one implementation with
compiler options to turn off compatibility when mapping int to
unsigned int, because of the runtime overhead the required
mapping created.

Sprechen said:
When that happens the CPU sets the overflow bit so that code can
increment the higher-order word.

You can't access the carry or overflow flags from C/C++ anyway,
so you're stuck. When doing multi-precision calculations, you
can't use the largest size available; you have to use one less.
(Which means that if you want maximum speed, you have to use
assembler, where you can access the condition code.)
 
James Kanze

Juha said:
The performance hit would be quite drastic. Every single
operation you do with an integer would have to be followed by
a conditional jump, which the CPU might or might not guess
correctly (if it doesn't, a really severe penalty follows).

That is, of course, pure speculation. Compilers have become
quite good at optimizing this sort of stuff, and at figuring out in
advance whether an expression can actually overflow or not. How much
it would actually cost, once the compiler got through optimizing, is
anybody's guess. Studies with array bounds checking (a much more
limited aspect of the same thing) have found only limited performance
loss.

Historically, too, some machines have done this sort of thing in
hardware. In practice, today, we have a chicken and egg
problem: modern machines don't bother, since modern languages
don't require it, and in some cases, even forbid it, and modern
languages don't require it, because they're afraid of the
performance hit. (On the other hand, on some modern machines,
floating point is practically as fast as integral arithmetic.
And floating point units typically do check for overflow,
including when storing back into an int.)
 
James Kanze

tni said:
Not really. Generally (if there is no branch prediction data), backward
jumps are predicted as taken, forward jumps as not taken. By structuring
the code right (overflow should be very rare), there should be very
little penalty. Somebody actually implemented that (IIRC on x86) and the
performance penalty wasn't bad (a couple %).

Intel had, at least at one point in the past, an INTO
instruction, which triggered a hardware trap if the overflow bit
was set in the status register. Presumably, branch prediction
would always assume this one as not being taken. (But I have no
idea of the performance of this on modern chips, or even if it
still exists.)
 
