Integer Overflow

A

arnuld

WANTED: How to handle integer overflows (and overflows in general)

#include <stdio.h>
#include <limits.h>
#include <errno.h>


int main(void)
{
int i, j, k;

errno = 0;
j = k = INT_MAX;
i = j + k;

printf("i = %d, errno = %d, ERANGE = %d\n", i, errno, ERANGE);
return 0;
}
===================== OUTPUT ============================
[arnuld@dune programs]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@dune programs]$ ./a.out
i = -2, errno = 0, ERANGE = 34
[arnuld@dune programs]$



It overflows of course, (OT) gcc does not give any error even when all
flags are on (/OT). errno does not work, its value is still zero after
that overflow. In this case how to check for overflow ? I have checked
FAQs and 20.6b gives a technique to handle this which I have applied down
here, it works fine but I am not sure it will still work if I change line
j = k = INT_MAX to j = k = INT_MIN



int main(void)
{
int i, j, k;

errno = 0;
j = k = INT_MAX;

if(INT_MAX - j < k)
{
printf("Overflow error\n");
i = INT_MAX;
}
else
{
i = j + k;
}


printf("i = %d, errno = %d, ERANGE = %d\n", i, errno, ERANGE);
return 0;
}




NOTE: Came to think of this exercise because of this article that claims
"Nearly all binary searches and mergesorts are broken". Author talsk
about line 6 but I think line 3 will overflow issue if written in C, if
a.length comes out to be greater than INT_MAX.

http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-
nearly.html
 
I

Ike Naar

I have checked
FAQs and 20.6b gives a technique to handle this which I have applied down
here, it works fine but I am not sure it will still work if I change line
j = k = INT_MAX to j = k = INT_MIN

For reference, the FAQ says:

if(INT_MAX - b < a) {
fputs("int overflow\n", stderr);
return INT_MAX;
}
return a + b;

To check for underflow as well, you'll have to make a
slightly more elaborate test, something like:

if (0 < a && INT_MAX - a < b)
{
return INT_MAX; /* overflow */
}
else if (a < 0 && b < INT_MIN - a)
{
return INT_MIN; /* underflow */
}
else
{
return a + b; /* okay */
}
 
J

jacob navia

Le 28/12/10 11:13, arnuld a écrit :
WANTED: How to handle integer overflows (and overflows in general)

#include<stdio.h>
#include<limits.h>
#include<errno.h>


int main(void)
{
int i, j, k;

errno = 0;
j = k = INT_MAX;
i = j + k;

printf("i = %d, errno = %d, ERANGE = %d\n", i, errno, ERANGE);
return 0;
}
===================== OUTPUT ============================
[arnuld@dune programs]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@dune programs]$ ./a.out
i = -2, errno = 0, ERANGE = 34
[arnuld@dune programs]$



It overflows of course, (OT) gcc does not give any error even when all
flags are on (/OT). errno does not work, its value is still zero after
that overflow. In this case how to check for overflow ? I have checked
FAQs and 20.6b gives a technique to handle this which I have applied down
here, it works fine but I am not sure it will still work if I change line
j = k = INT_MAX to j = k = INT_MIN



int main(void)
{
int i, j, k;

errno = 0;
j = k = INT_MAX;

if(INT_MAX - j< k)
{
printf("Overflow error\n");
i = INT_MAX;
}
else
{
i = j + k;
}


printf("i = %d, errno = %d, ERANGE = %d\n", i, errno, ERANGE);
return 0;
}




NOTE: Came to think of this exercise because of this article that claims
"Nearly all binary searches and mergesorts are broken". Author talsk
about line 6 but I think line 3 will overflow issue if written in C, if
a.length comes out to be greater than INT_MAX.

http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-
nearly.html

The lcc-win compiler allows you to check for overflow in all signed
integer operations. I have proposed that this be incorporated into
the C language as a standard but people in comp.std.c seem to prefer
wrong results than "wasting" some cycles in checking overflow.

Note that many other languages based on C (for instance Eiffel) have the
same problem.

You can download the lcc-win compiler at no cost from:

http://www.q-software-solutions.de
or
http://www.cs.virginia.edu/~lcc-win32

jacob
 
M

Malcolm McLean

WANTED: How to handle integer overflows (and overflows in general)

NOTE: Came to think of this exercise because of this article that claims
"Nearly all binary searches and mergesorts are broken".
Normally it's inappropriate for general-purpose logic code to have to
worry about overflow.

The real answer is to join the campaign for 64 bit ints. Searchable
arrays of real data aren't going to go above 64 bits any time soon, so
the problem is eliminated.
 
E

Eric Sosman

WANTED: How to handle integer overflows (and overflows in general)

Avoid them. If a conversion or (sub-)expression overflows,
the behavior is undefined -- meaning, among other things, that
it's already too late to "handle" something.
 
M

Marcin Grzegorczyk

arnuld said:
#include<stdio.h>
#include<limits.h>
#include<errno.h>


int main(void)
{
int i, j, k;

errno = 0;
j = k = INT_MAX;
i = j + k;

printf("i = %d, errno = %d, ERANGE = %d\n", i, errno, ERANGE);
return 0;
}
===================== OUTPUT ============================
[arnuld@dune programs]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@dune programs]$ ./a.out
i = -2, errno = 0, ERANGE = 34
[arnuld@dune programs]$

$ gcc -ansi -pedantic -Wall -Wextra -m32 -ftrapv test.c
$ ./a.out
Aborted
$

As this example illustrates, the compiler is allowed to make signed
overflows cause an exception (in this case, a signal is raised).
 
B

BGB

Le 28/12/10 11:13, arnuld a écrit :
WANTED: How to handle integer overflows (and overflows in general)

#include<stdio.h>
#include<limits.h>
#include<errno.h>


int main(void)
{
int i, j, k;

errno = 0;
j = k = INT_MAX;
i = j + k;

printf("i = %d, errno = %d, ERANGE = %d\n", i, errno, ERANGE);
return 0;
}
===================== OUTPUT ============================
[arnuld@dune programs]$ gcc -ansi -pedantic -Wall -Wextra test.c
[arnuld@dune programs]$ ./a.out
i = -2, errno = 0, ERANGE = 34
[arnuld@dune programs]$



It overflows of course, (OT) gcc does not give any error even when all
flags are on (/OT). errno does not work, its value is still zero after
that overflow. In this case how to check for overflow ? I have checked
FAQs and 20.6b gives a technique to handle this which I have applied down
here, it works fine but I am not sure it will still work if I change line
j = k = INT_MAX to j = k = INT_MIN



int main(void)
{
int i, j, k;

errno = 0;
j = k = INT_MAX;

if(INT_MAX - j< k)
{
printf("Overflow error\n");
i = INT_MAX;
}
else
{
i = j + k;
}


printf("i = %d, errno = %d, ERANGE = %d\n", i, errno, ERANGE);
return 0;
}




NOTE: Came to think of this exercise because of this article that claims
"Nearly all binary searches and mergesorts are broken". Author talsk
about line 6 but I think line 3 will overflow issue if written in C, if
a.length comes out to be greater than INT_MAX.

http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-
nearly.html

The lcc-win compiler allows you to check for overflow in all signed
integer operations. I have proposed that this be incorporated into
the C language as a standard but people in comp.std.c seem to prefer
wrong results than "wasting" some cycles in checking overflow.

Note that many other languages based on C (for instance Eiffel) have the
same problem.

You can download the lcc-win compiler at no cost from:

http://www.q-software-solutions.de
or
http://www.cs.virginia.edu/~lcc-win32

well, one could always do arithmetic via API calls if they wanted
overflow-detecting calls.

for example, hypothetical API calls:
k=suAddInt32(i, j);
if(suStatusOvfP())
...

then, these could be implemented via ordinary calls (C or ASM), or via
macros and compiler intrinsics...


this would also avoid breaking any code which depends on the typical
integer overflow semantics or causing an overhead for operations where
overflow is not of particular interest, and also allows for a pure C
implementation if needed.

or such...
 
K

Keith Thompson

BGB said:
well, one could always do arithmetic via API calls if they wanted
overflow-detecting calls.

for example, hypothetical API calls:
k=suAddInt32(i, j);
if(suStatusOvfP())
...

then, these could be implemented via ordinary calls (C or ASM), or via
macros and compiler intrinsics...

One problem with that particular interface is that the linkage
between the call and the status is IMHO too loose. The C standard's
use of errno suffers from similar problems. For example, in the
presence of threading the implementation has to play tricks to avoid
getting status results from a different thread. In effect, the error
status is a global variable, with all the problems that can cause.

Unfortunately, C doesn't provide a good clean way of doing this.
You might have something like this:

err = suAddInt32(&sum, i, j);
if (err == 0) {
/* ok, sum contains the result */
}
else {
/* something went wrong, err tells you what */
}

Or it could return the sum and store the error code via a pointer
argument.

I'm almost tempted to suggest that adding exception handling to
the language might not be a completely horrible idea.
 
B

BGB

One problem with that particular interface is that the linkage
between the call and the status is IMHO too loose. The C standard's
use of errno suffers from similar problems. For example, in the
presence of threading the implementation has to play tricks to avoid
getting status results from a different thread. In effect, the error
status is a global variable, with all the problems that can cause.

Unfortunately, C doesn't provide a good clean way of doing this.
You might have something like this:

err = suAddInt32(&sum, i, j);
if (err == 0) {
/* ok, sum contains the result */
}
else {
/* something went wrong, err tells you what */
}

Or it could return the sum and store the error code via a pointer
argument.

I'm almost tempted to suggest that adding exception handling to
the language might not be a completely horrible idea.

one could use thread-local storage for this...

AFAIK pretty much all (common?) OS's with threads also provide for TLS
and similar...

if errno was accessed via a function call, little would stop putting
this in TLS as well (however, as-is, doing so requires using compiler
specific extensions, such as the "__thread" modifier or similar for
MSVC, ...).


this way, one doesn't need to bother with providing a convoluted API to
manage status, ....

for example, AFAIK, OpenGL contexts are often handled via TLS, ...

or such...
 
B

BGB

one could use thread-local storage for this...

AFAIK pretty much all (common?) OS's with threads also provide for TLS
and similar...

if errno was accessed via a function call, little would stop putting
this in TLS as well (however, as-is, doing so requires using compiler
specific extensions, such as the "__thread" modifier or similar for
MSVC, ...).

correction (brain dump...):
__thread is for GCC for, __declspec(thread) is for MSVC, grr...
typically, I wrap stuff like this in macros anyways though...
 
J

jacob navia

Le 28/12/10 22:46, Keith Thompson a écrit :
One problem with that particular interface is that the linkage
between the call and the status is IMHO too loose. The C standard's
use of errno suffers from similar problems. For example, in the
presence of threading the implementation has to play tricks to avoid
getting status results from a different thread. In effect, the error
status is a global variable, with all the problems that can cause.

Unfortunately, C doesn't provide a good clean way of doing this.
You might have something like this:

err = suAddInt32(&sum, i, j);
if (err == 0) {
/* ok, sum contains the result */
}
else {
/* something went wrong, err tells you what */
}

Or it could return the sum and store the error code via a pointer
argument.

I'm almost tempted to suggest that adding exception handling to
the language might not be a completely horrible idea.
lcc-win proposes:

int a,b,c;
// ...
c = a+b;
if (_overflow()) {
}

What's wrong with that?

In i386asm:
addl %eax,%ecx
jo _overflowhandler
 
B

BGB

Le 28/12/10 22:46, Keith Thompson a écrit :
lcc-win proposes:

int a,b,c;
// ...
c = a+b;
if (_overflow()) {
}

What's wrong with that?

In i386asm:
addl %eax,%ecx
jo _overflowhandler

issues:
no pure C implementation is possible (while still preserving semantics);
it may carry costs on architectures which do not provide this by default
(arithmetic isn't safe unless the optimizer notes that one doesn't
capture the state);
since these status codes can only apply to a single operation at a time,
little is gained by using a raw operator;
....

providing a special intrinsic to wrap the addition allows similar to be
done without requiring this.


in my case, early on I had also added additional semantics to C to add a
new feature: geometric vector math.

however, after realizing the costs of using the feature in this form, I
switched to using intrinsics instead, since they could provide an API
which would work on existing compilers with only a modest increase in
typing.


more so, defining something like, say:
#define suAddInt32(a, b) ((a)+(b))
#define suStatusOvfP() _overflow()

would be... trivial...


similarly, I also use a standard-C-friendly interface for dynamic types
and OO facilities as well, since again this avoids compiler-specific
features...

it is really no different from wrapping any compiler specific keywords
or annotations in macros so that they may be stripped out in standard C.

for example, I could have stuff in my code like:

typedef struct FooObj_s *FooObj __declspec(dytname("_foo_obj_t"));
typedef __variant_this FooObj FooThis;
....

or, also valid:
typedef [dytname("_foo_obj_t") this] struct FooObj_s *FooObj;


but, I found it preferable to wrap a lot of this in generic macros, say:

typedef struct FooObj_s *FooObj dytname("_foo_obj_t");
typedef dycThisT(FooObj) FooThis;

which allow the preprocessor to strip a lot out, and without necessarily
exposing standard C compilers to a lot of my extensions and IDL and
reflection-related metadata...


much like a few simple macros allow me to both make use of
__declspec(dllimport) and __declspec(dllexport) without compromising my
ability to build my code on Linux and via both MSVC and GCC in addition
to using my own tools, ...


I hope the reasoning for this is understood...
 
J

jacob navia

Le 29/12/10 09:18, BGB a écrit :
On 12/28/2010 11:31 PM, jacob navia wrote:

issues:
no pure C implementation is possible (while still preserving semantics);

Yes.
But for the implementing a+b there is no pure c implementation either.
It is clear that at *some* level C must meet assembly.
it may carry costs on architectures which do not provide this by default
(arithmetic isn't safe unless the optimizer notes that one doesn't
capture the state);

There is no processor that hasn't an overflow flag. And if
you happen to run in a weird processor without it, you can always
write it in C...
since these status codes can only apply to a single operation at a time,
little is gained by using a raw operator;

Yes, you have to split your statements to figure out which operation
overflowed, if you are interested in that.
...

providing a special intrinsic to wrap the addition allows similar to be
done without requiring this.

That's the same actually:

#define myProtectedAddition(result,op1,op2) result=op1+op2,_overflow()
in my case, early on I had also added additional semantics to C to add a
new feature: geometric vector math.

however, after realizing the costs of using the feature in this form, I
switched to using intrinsics instead, since they could provide an API
which would work on existing compilers with only a modest increase in
typing.

That's what I propose: _overflow() is an intrinsic.
more so, defining something like, say:
#define suAddInt32(a, b) ((a)+(b))
#define suStatusOvfP() _overflow()

would be... trivial...
Yes

[snip]


I hope the reasoning for this is understood...

Yes, but that is not about checking for overflow. Now tell me:

How does your compiler handle it then?
 
B

BartC

NOTE: Came to think of this exercise because of this article that claims
"Nearly all binary searches and mergesorts are broken". Author talsk
about line 6 but I think line 3 will overflow issue if written in C, if
a.length comes out to be greater than INT_MAX.

http://googleresearch.blogspot.com/2006/06/extra-extra-read-all-about-it-
nearly.html

That exercise will fail on the (high+low)/2 expression when high+low is
likely to exceed around two billion.

As the articles says, this was unlikely when the code was created (and even
now, would only apply to an int-array size of 4MB or more).

The answer is simply to use a more appropriate int-size, as trying to mess
about with overflow detection in code like this is a waste of time. (Suppose
you *do* detect that high+low overflows, then what, try and reinvent some
way of doing 33-bit arithmetic?

And what would happen if high and low were unsigned; in that case the
results would go wrong when high+low exceeded about four billion, but this
time that would be well-defined!)

If no wider arithmetic was available, then it would simplest (in this
example), to just impose an upper limit on how big an array can be searched.

I think overflow detection (ie. detecting it after the event) is more useful
when implementing some higher arithmetic routines, rather than for general
use.
 
B

BartC

pete said:
No it wouldn't.


This is simpler: (high - low) / 2 + low
A bit of analysis, is worth a byte of program.

Well, it would be simpler than rewriting code (that has been proved correct,
apparently). Especially if the routine is inside someone else's binary
library.

Although even with that fix, it might still not be possible to search the
maximum possible array size, depending on the type of a.length, so limits
could still be advisable (but at a more reasonable two billion elements
rather than one).

And specifying a range of inputs to a routine, over which it is guaranteed
to work properly, might still be a useful thing to do.
 
J

jacob navia

Le 29/12/10 15:34, christian.bau a écrit :
So what about this:

c = a + b + c;
if (_overflow()) {
}

This is invalid since you do not know where the
overflow happened. The correct way is:
c = c+b;
if (_overflow()) goto error;
c += a;
if (_overflow()) goto error;

This becomes quickly tedious. That is whty lcc win allows you to
specify
#pragma overflowcheck on/off
How is that going to work? Or c = f () + g () ?
c = f();
tmp = g();
if (c+= tmp,_overflow()) goto error;
What _precisely_ is the semantics of the "_overflow" function?

It returns 1 if the last operation overflowed, zero if not.
Concerned operations: signed +,-,*, and /.


What
about this one:

c = a + b;
error = (_overflow() | (c + 3 == x));
if a+b overflowed or c+3 is equal to x error is 1, otherwise is zero
what else you expect?
Keep in mind that I wrote | and not ||.

If the right hand is evaluated before the intrinsic the new operation
will overwrite any overflow flag from the last one.

All this problems can be avoided if you use the #pragma overflowcheck
form.

What bothers me is that you seem so upset that I offer a SOLUTION
instead of just allowing UB. Maybe you can offer a BETTER solution?

I am all ears.

jacob
 
J

jacob navia

Some people here aren't aware of the problems overflow can produce. Just
remember the calloc example

Many implementations of calloc(count size) just multiply count*size
ignoring overflow with catastrophic results. We had a discussion about
that in comp.lang.c

jacob
 
B

BGB

Some people here aren't aware of the problems overflow can produce. Just
remember the calloc example

Many implementations of calloc(count size) just multiply count*size
ignoring overflow with catastrophic results. We had a discussion about
that in comp.lang.c

if one has an overflow here, then one already has much greater problems
than having an overflow here...
 
J

Jorgen Grahn

One problem with that particular interface is that the linkage
between the call and the status is IMHO too loose. The C standard's
use of errno suffers from similar problems. For example, in the
presence of threading the implementation has to play tricks to avoid
getting status results from a different thread. In effect, the error
status is a global variable, with all the problems that can cause.

Unfortunately, C doesn't provide a good clean way of doing this.
You might have something like this:

err = suAddInt32(&sum, i, j);
if (err == 0) {
/* ok, sum contains the result */
}
else {
/* something went wrong, err tells you what */
}

Or it could return the sum and store the error code via a pointer
argument.

I'm almost tempted to suggest that adding exception handling to
the language might not be a completely horrible idea.

I'd suggest simply moving to C++ for the situations where you need to
fail explicitly rather than suffer the silent overflow. The mechanism
is there already -- and so is operator overloading, to make it
readable.

This was after all one of the reasons C++ got created in the first
place.

/Jorgen
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,007
Latest member
obedient dusk

Latest Threads

Top