Arithmetic overflow checking

Keith Thompson

Joe Pfeiffer said:
Well, yes there is. For example on an addition, if both operands have
the same sign and the result is the other sign, you had an overflow.
Analogous conditions (which I don't remember off the top of my
head and am too lazy to look up) exist for subtraction and
multiplication. Integer division can't overflow.
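
The sign rule for addition can be sketched as follows. To avoid evaluating a signed sum that might itself overflow, this version does the addition in unsigned arithmetic, where wraparound is well defined, and only inspects the sign bits. It assumes a two's-complement int with no padding bits; the name add_overflows is made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether a + b would overflow an int.
   Assumes two's complement and no padding bits in int/unsigned. */
static bool add_overflows(int a, int b)
{
    unsigned ua = (unsigned)a;
    unsigned ub = (unsigned)b;
    unsigned sum = ua + ub;                 /* wraps modulo 2^N, well defined */
    unsigned sign_bit = ~(UINT_MAX >> 1);   /* only the top bit set */

    /* Overflow iff both operands differ in sign from the wrapped sum,
       i.e. the operands share a sign and the result has the other one. */
    return ((ua ^ sum) & (ub ^ sum) & sign_bit) != 0;
}
```

With this, add_overflows(INT_MAX, 1) and add_overflows(INT_MIN, -1) report an overflow, while add_overflows(-1, 1) does not.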

On many systems, yes, you can detect signed overflow after the fact by
examining the values of the operands and the result. But in C, the
behavior is undefined -- and even on systems that use 2's-complement, an
optimizing compiler can take advantage of that fact and generate code
based on the assumption that overflow never occurs. For example, this:

    int x = INT_MAX;
    if (x + 1 < x) {
        fprintf(stderr, "Overflow!\n");
    }

can be optimized away. (For example, gcc does this at -O2 and above.)

And yes, integer division can overflow; consider INT_MIN / -1.
My reading of the question was "OK, you've detected an overflow. Now
what do you do about it?" and the (correct) answer was, in essence,
"well, what do you *want* to do about it?"

But detecting the overflow in the first place can be *very* tricky.
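
One way to stay clear of the undefined behavior is to test the operands before doing the operation, so the overflowing expression is never evaluated. A minimal sketch under that assumption; checked_add and checked_div are hypothetical names, not functions from any library:

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores a + b in *out and returns true,
   or returns false (leaving *out untouched) if the sum would overflow. */
static bool checked_add(int a, int b, int *out)
{
    if ((b > 0 && a > INT_MAX - b) || (b < 0 && a < INT_MIN - b))
        return false;
    *out = a + b;
    return true;
}

/* Hypothetical helper: same contract for a / b. Rejects division by
   zero and the single overflowing case, INT_MIN / -1. */
static bool checked_div(int a, int b, int *out)
{
    if (b == 0 || (a == INT_MIN && b == -1))
        return false;
    *out = a / b;
    return true;
}
```

Because the comparisons only involve values that are always in range, the compiler has no undefined behavior to exploit and cannot drop the checks.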
 
BartC

The problem is that 32 bit ints are large enough to count most things,
but not all.

You can't give a different 32-bit integer to everyone in the world,

Just give two each...
for example, nor to all the bytes of RAM you might reasonably have in
your desktop computer.

No, you need up to 40 bits, if there is really a need to address every byte
in each object, and in every task, uniquely.
64 bit ints solve most of these problems, they can count the vast
majority of things we need to count.

When you start doing arithmetic requiring bigger numbers, then 64 bits won't
be enough either. If you're calculating factorials, it means you can go up
to 20! instead of 12! Big deal...
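
The 12! and 20! limits can be checked with a few lines of C. This is only an illustrative sketch; largest_factorial_arg is a made-up name, and the loop divides the limit rather than multiplying so that the check itself never overflows.

```c
#include <stdint.h>

/* Hypothetical helper: largest n such that n! <= limit (limit >= 1). */
static int largest_factorial_arg(int64_t limit)
{
    int64_t f = 1;   /* holds n! */
    int n = 1;

    /* f * (n + 1) <= limit, rewritten as a division to avoid overflow. */
    while (f <= limit / (n + 1)) {
        n++;
        f *= n;
    }
    return n;
}
```

Called with INT32_MAX it returns 12, and with INT64_MAX it returns 20.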

And if you're advocating always using 64 bits, whether it's needed or not,
then you could be doubling or quadrupling memory needs, and bandwidth,
unnecessarily. Especially on a machine with a natural word size of 32 bits.
 
Gene Wirchenko

On Tue, 12 Jul 2011 16:54:36 +0000 (UTC), Martin Gregorie

[snip]
...and yeah, whoever it was who said Java was verbose has clearly never
written COBOL!

I did, and I have. Java repeats stuff.

Sincerely,

Gene Wirchenko
 
Gene Wirchenko

[snip]
Yes, but it must be checked after every operation.
Hardware that triggers an interrupt would save these
extra checks.

No, it would not. The check would simply be implemented in
hardware. It will still take time.

Unchecked signed integer operations are used, because of
performance reasons. Many people don't care about correct

That is often what is stated, but rarely is there a benchmark
done to prove that the system would otherwise be too slow.

results, as long as they have maximum performance. With

Ha! They might not see the errors. If they do, expect to hear
about it.

hardware support there would be no performance loss. So
there would be no excuse to accept wrong results.

Sincerely,

Gene Wirchenko
 
markspace

It would not necessarily directly take time. It could be done in
parallel with the stage that writes results back to register files, or
bypasses them to other instructions.


A JO almost certainly would be executed in parallel with the ADD
instruction. Executing two instructions in parallel has been available
on consumer CPUs (x86) since about 1992. It's two-decade-old technology.
Folks complaining about performance degradation due to overflow
detection are frankly greatly out of date in their understanding of CPU
architecture.
 
Gene Wirchenko

A JO almost certainly would be executed in parallel with the ADD
instruction. Executing two instructions in parallel has been available
on consumer CPUs (x86) since about 1992. It's two-decade-old technology.
Folks complaining about performance degradation due to overflow
detection are frankly greatly out of date in their understanding of CPU
architecture.

I also suggest that they build a time machine and go for a ride
on a certain Ariane 5 launch.

Sincerely,

Gene Wirchenko
 
Martin Gregorie

I also suggest that they build a time machine and go for a ride
on a certain Ariane 5 launch.

An out-of-range signal might have been the initial cause[1] but the real
problem was that this exception caused a diagnostic bit pattern to be
written to the SRI's (Inertial Reference System's) normal output channel,
where the OBC (On Board Computer) interpreted it as flight data by
failing to recognise it as an exception message. Unfortunately, by
treating it as flight data, the OBC interpreted it as requiring full
engine deflection, causing the Ariane 5 to yaw violently. Unsurprisingly,
being side-on at high airspeed caused it to break up.

The real cause of the crash was using a poorly documented A4 SRI
without fully understanding its designed-in operating parameters or
ensuring that they were reset to interpret standard A5 operating
conditions as normal and within limits and then compounding the problem
by not designing the OBC to recognise SRI exception messages.

IOW, this crash was more a case of poor documentation and design rather
than arithmetic overflow.

The full report is here: http://www.di.unito.it/~damiani/ariane5rep.html

[1] The instrument causing the problem was an unmodified Ariane 4 SRI
which raised an out-of-limits exception when the normal Ariane 5
trajectory exceeded a permitted Ariane 4 horizontal velocity limit.
 
lewbloch

Martin said:
Gene said:
     I also suggest that they build a time machine and go for a ride
on a certain Ariane 5 launch.

An out-of-range signal might have been the initial cause[1] but the real
problem was that this exception caused a diagnostic bit pattern to be
written to the SRI's (Inertial Reference System's) normal output channel,
where the OBC (On Board Computer) interpreted it as flight data by
failing to recognise it as an exception message. Unfortunately, by
treating it as flight data, the OBC interpreted it as requiring full
engine deflection, causing the Ariane 5 to yaw violently. Unsurprisingly,
being side-on at high airspeed caused it to break up.

The real cause of the crash was using a poorly documented A4 SRI
without fully understanding its designed-in operating parameters or
ensuring that they were reset to interpret standard A5 operating
conditions as normal and within limits and then compounding the problem
by not designing the OBC to recognise SRI exception messages.

IOW, this crash was more a case of poor documentation and design rather
than arithmetic overflow.

The full report is here: http://www.di.unito.it/~damiani/ariane5rep.html

[1] The instrument causing the problem was an unmodified Ariane 4 SRI
which raised an out-of-limits exception when the normal Ariane 5
trajectory exceeded a permitted Ariane 4 horizontal velocity limit.  

In other words, this was a case where there *was* an out-of-range
exception, thus it makes the exact opposite point to the one Gene
presumably wanted to support.
 
Gene Wirchenko

Martin Gregorie wrote:
[snip]
[1] The instrument causing the problem was an unmodified Ariane 4 SRI
which raised an out-of-limits exception when the normal Ariane 5
trajectory exceeded a permitted Ariane 4 horizontal velocity limit.  

...the Ariane 5 having more powerful engines.
In other words, this was a case where there *was* an out-of-range
exception, thus it makes the exact opposite point to the one Gene
presumably wanted to support.

The data I read was that the exception was not handled. IIRC,
debugging got interpreted as navigational data.

Sincerely,

Gene Wirchenko
 
lewbloch

Gene said:
lewbloch said:
Martin Gregorie wrote:
[snip]
[1] The instrument causing the problem was an unmodified Ariane 4 SRI
which raised an out-of-limits exception when the normal Ariane 5
trajectory exceeded a permitted Ariane 4 horizontal velocity limit.  

     ...the Ariane 5 having more powerful engines.
In other words, this was a case where there *was* an out-of-range
exception, thus it makes the exact opposite point to the one Gene
presumably wanted to support.

     The data I read was that the exception was not handled.  IIRC,
debugging got interpreted as navigational data.

Precisely. There was an exception, and it was not handled. Having
the exception was not enough.
 
Eric Sosman

What I think he's saying is there's no way to physically detect the
overflow in a language like C which has no exceptions.

C has something far more flexible than exceptions: It has
"undefined behavior."

:)

This thread is dragging on in both comp.lang.java.programmer,
where it began, and in comp.lang.c, to which some doubtless well-
meaning but insufficiently wise person cross-posted it. The two
languages are rather different (in Java, there *is* no integer
overflow in C's sense), and the thread has become at least half-
irrelevant to at least one of the two groups.

Permit me to suggest that people who wish to discuss integer
overflow in C should delete "comp.lang.java.programmer" from their
replies, and likewise people who wish to discuss an alternate Java
that handles overflow differently should delete "comp.lang.c". In
this as in many other matters, the two languages have very little
to do with each other.
 
tm

Or, if they're in C rather than C++, they either return a result
or set errno to some value that indicates they've run out of memory.

My library is written in C, but it raises a Seed7 exception
in out of memory situations. :)
Does GMP set errno when it runs out of memory?

I am not sure that Java can handle out of memory situations
with an exception.


Greetings Thomas Mertes

--
Seed7 Homepage: http://seed7.sourceforge.net
Seed7 - The extensible programming language: User defined statements
and operators, abstract data types, templates without special
syntax, OO with interfaces and multiple dispatch, statically typed,
interpreted or compiled, portable, runs under linux/unix/windows.
 
Malcolm McLean

And if you're advocating always using 64 bits, whether it's needed or not,
then you could be doubling or quadrupling memory needs, and bandwidth,
unnecessarily. Especially on a machine with a natural word size of 32 bits.

We need to find out where memory is actually used.

For instance, my current program works with genetic sequence data. When
memory gets tight, that's normally because I need an NxN matrix of
genes in an organism, where N is about 6000. However day to day, most
of the memory is used by sequence data - which fits comfortably into
2GB - a list of genes might be 15 megabytes.

So in my case, moving to 64 bit ints wouldn't impact memory
requirements at all. However that's not necessarily true of everyone.
 
tm

I think you're assuming either the code is running nearly bare on the
hardware (in which case your program has its own ISR), or that your OS
has a very inexpensive way to deliver the ISR to a process.

I was not talking about the overhead that occurs when an overflow
happens. Exceptions should be rare and people know that they are
not as cheap as normal computations.

I thought about overhead, when NO overflow occurs.

In an ideal world code with and without overflow checking would run
at the same speed (as long as no overflow occurs).
As I mentioned in another post just now, a decent compiler can use
library routines for all arithmetic operations and thus allow all
operations to be checked for overflow algorithmically.

Yes, but this would probably have more overhead than checking
an overflow flag, not to mention hardware that can do this
without extra cost (trigger interrupt).
See GCC's "-ftrapv" option, for example.

AFAIK -ftrapv adds code to check an overflow flag after arithmetic
operations.

IMHO -ftrapv is the right way, because it allows gcc to use hardware
support (when a CPU is able to trigger an interrupt on overflow).
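
For comparison, checking per operation rather than trapping can be sketched with the overflow builtins that GCC (since version 5) and Clang provide. They are compiler extensions, not standard C, and on common targets they typically compile to the arithmetic instruction followed by a test of the CPU's overflow flag. The wrapper names add_ok and mul_ok are hypothetical.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical wrappers around the GCC/Clang overflow builtins.
   Each returns true when the exact result fits in an int; on overflow
   the builtin stores the wrapped value in *out and the wrapper
   returns false. */
static bool add_ok(int a, int b, int *out)
{
    return !__builtin_add_overflow(a, b, out);
}

static bool mul_ok(int a, int b, int *out)
{
    return !__builtin_mul_overflow(a, b, out);
}
```

Unlike -ftrapv, this leaves the decision about what to do on overflow to the caller instead of raising a signal.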

BTW, does gcj support this option also?
Has javac a similar option?
You may also be assuming that overflows aren't so common in C. :)

Try re-compiling your system's entire userland with "gcc -ftrapv" and
see how much of it still works without aborting.

It does work.
I just had to change a program which checks left shift by comparing
the result with a multiplication by a power of two. AFAIK left shift
is not checked by -ftrapv, but multiplication is. Now the chkint.sd7
program only checks left shift in situations where no overflow
occurs.
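
Since left shift is reported above as not being covered by -ftrapv, a shift can be range-checked by hand before it is performed. A minimal sketch, assuming an int without padding bits; checked_shl is a hypothetical name, and negative operands are simply rejected rather than handled.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: stores x << n in *out and returns true, or
   returns false if the shift would overflow or is otherwise not
   well defined for a signed int. */
static bool checked_shl(int x, int n, int *out)
{
    /* Number of value bits in int, assuming no padding bits. */
    const int value_bits = (int)(sizeof(int) * CHAR_BIT) - 1;

    /* Conservative: also rejects n == value_bits, even for x == 0. */
    if (x < 0 || n < 0 || n >= value_bits)
        return false;
    if (x > (INT_MAX >> n))
        return false;               /* x * 2^n would exceed INT_MAX */
    *out = x << n;
    return true;
}
```

The comparison against INT_MAX >> n plays the same role as the multiplication by a power of two, without ever computing an out-of-range product.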

Using the signal to raise an exception is something else.
E.g.: gcc and clang disagree in the signal used (gcc: SIGABRT,
clang: SIGILL). So useful support of this feature in Seed7 may
take some time.


Greetings Thomas Mertes

 
Gene Wirchenko

Gene said:
lewbloch said:
Martin Gregorie wrote:
[snip]

[1] The instrument causing the problem was an unmodified Ariane 4 SRI
which raised an out-of-limits exception when the normal Ariane 5
trajectory exceeded a permitted Ariane 4 horizontal velocity limit.  

     ...the Ariane 5 having more powerful engines.
In other words, this was a case where there *was* an out-of-range
exception, thus it makes the exact opposite point to the one Gene
presumably wanted to support.

     The data I read was that the exception was not handled.  IIRC,
debugging got interpreted as navigational data.
Precisely. There was an exception, and it was not handled. Having
the exception was not enough.

No surprise there. Most of us understand that exceptions have to
be handled as well as thrown.

Sincerely,

Gene Wirchenko
 
lewbloch

Gene said:
lewbloch wrote:
Martin Gregorie wrote:
[snip]
[1] The instrument causing the problem was an unmodified Ariane 4 SRI
which raised an out-of-limits exception when the normal Ariane 5
trajectory exceeded a permitted Ariane 4 horizontal velocity limit.  
     ...the Ariane 5 having more powerful engines.
In other words, this was a case where there *was* an out-of-range
exception, thus it makes the exact opposite point to the one Gene
presumably wanted to support.
     The data I read was that the exception was not handled.  IIRC,
debugging got interpreted as navigational data.
Precisely.  There was an exception, and it was not handled.  Having
the exception was not enough.

     No surprise there.  Most of us understand that exceptions have to
be handled as well as thrown.

Apparently it was a surprise to the Ariane 5 folks.

But you brought up the case history in the first place. What was the
point if not that exceptions need to be handled? Isn't the problem
with the rocket exactly that exceptions weren't handled, despite your
attempt to palm that off as obvious? Or are you just shifting ground
now that your apparent original point turns out not to be supported by
the Ariane failure?

It certainly was not that overflow exceptions are good all by
themselves, since that didn't help the rocket in question.

The lesson I derive is that nothing is too simple, trivial or obvious
to overlook. Whether it's metric vs. English system of length that
makes one miss Mars, or failure to handle an exception that makes a
rocket blow up, despite that *obviously* one must use consistent
units, and *obviously* one must handle exceptions when thrown, we
still need to be diligent about such "obvious" matters.

So put aside your pseudo-condescension and let's learn the lessons
these case histories teach us.
 
Gene Wirchenko

[snip]
     No surprise there.  Most of us understand that exceptions have to
be handled as well as thrown.

Apparently it was a surprise to the Ariane 5 folks.

No. From what I read, in the area of the error, there were some
places where handling was judged needed and some not. What they did
about it was not adequate.
But you brought up the case history in the first place. What was the
point if not that exceptions need to be handled? Isn't the problem

First, you need a mechanism for them. Then, you handle them.
with the rocket exactly that exceptions weren't handled, despite your
attempt to palm that off as obvious? Or are you just shifting ground

But it is obvious to a professional. It is fairly basic.
now that your apparent original point turns out not to be supported by
the Ariane failure?

Exceptions were not dealt with properly. Not having them when
one should is another case of this.
It certainly was not that overflow exceptions are good all by
themselves, since that didn't help the rocket in question.

A professional would understand that they have to be used
properly, too.
The lesson I derive is that nothing is too simple, trivial or obvious
to overlook. Whether it's metric vs. English system of length that
makes one miss Mars, or failure to handle an exception that makes a
rocket blow up, despite that *obviously* one must use consistent
units, and *obviously* one must handle exceptions when thrown, we
still need to be diligent about such "obvious" matters.

Of course.
So put aside your pseudo-condescension and let's learn the lessons
these case histories teach us.

My condescension? Look in a mirror.

I assume that professionals will understand some pretty basic
things. For example, if we have an exception mechanism, that we do
something about the exceptions. Were I to explain every such detail,
it would be being condescending.

Sincerely,

Gene Wirchenko
 
markspace

The lesson I derive is that nothing is too simple, trivial or obvious
to overlook.


What I got from reading that is that the root problem was that the range
of values that the sensor was capable of producing was not understood.
Either or both physically producing, or would produce under normal (or
abnormal) system operation.

It was a failure to understand the design, and its parameters. That
failure of understanding was then propagated down to the code level.
"We don't need to protect this because an out of range can't happen."

Somewhere, somehow, somebody has to ultimately understand what the
system does, and when. If you don't have that, then no amount of
general wolf-fencing (i.e., catching exceptions) will help, because you
won't know what the exception even means, let alone what to do about it.
 
Keith Thompson

markspace said:
What I got from reading that is that the root problem was that the range
of values that the sensor was capable of producing was not understood.
Either or both physically producing, or would produce under normal (or
abnormal) system operation.

As I recall, the range of values the sensor was capable of producing was
understood correctly *when the code was written*.

The problem is that the code was written for the Ariane 4. Management
decided to re-use the same code, with no modifications, on the Ariane 5
-- on which the valid range of values from the sensor was quite
different.

It was a failure to understand the design, and its parameters. That
failure of understanding was then propagated down to the code level.
"We don't need to protect this because an out of range can't happen."

Decisions had to be made when the code was written about which
exceptions to handle, and which to assume couldn't happen.
Handling them all wasn't an option because it would have slowed
down the system enough so it wouldn't work at all. The particular
decisions were correct for Ariane 4.
Somewhere, somehow, somebody has to ultimately understand what the
system does, and when. If you don't have that, then no amount of
general wolf-fencing (i.e., catching exceptions) will help, because you
won't know what the exception even means, let alone what to do about it.

Given the decision to re-use the same code with no changes for a
new rocket (when the code wasn't designed for cross-rocket portability
in the first place), an improperly handled exception was just one of many
ways that it could have gone wrong.

(All this is based on my rather vague recollection of my partial reading
of the report.)
 
