Zero overhead overflow checking

Dik T. Winter

> John Nagle wrote: ....
>
> Imagine that. 50% of the programs at that time were producing
> incorrect results in some situations!

You read that wrong. Not every program that was overflowing delivered
the wrong result! It may well be that in at least some of those programs
the overflow was intentional.
 
Dag-Erling Smørgrav

Bart said:
Someone mentioned hundreds of embedded processors for each advanced
processor. I guess these must be all a little different.

I think you misunderstand me. I did not say that for every advanced CPU
model there are hundreds of embedded CPU models; I said that for every
advanced CPU that comes off the figurative assembly line, tens or
hundreds of embedded CPUs do so as well.

Your cell phone contains at least one GPP and one DSP (possibly combined
on the same chip). The same probably goes for your desk phone. The
cell towers that your cell phone talks to and the exchanges that your
desk phone talks to contain hundreds of GPPs and DSPs. If you have a
multifunction digital watch, chances are it contains a microcontroller.
Your car contains dozens of microcontrollers. Your alarm clock, your TV
set, your DVD player, their respective remote controls, your food
processor, your microwave oven, your dishwasher, your burglary alarm,
probably some of the sensors connected to it, your printer, your copier,
your web camera, your switch (at least if it's managed), your DSL
router... the list goes on.

Chances are many of those microcontrollers are similar (many are based
on the ARM7 or ARM9 architecture) if not identical, and chances are many
of those run one of a handful of real-time operating systems (VxWorks,
QNX, RTEMS) or even Linux or BSD, all of which are written mostly in C
(thus disproving Jacob's claim that C can't be implemented on Harvard
machines like the ARM9).

DES
 
Ben Bacarisse

Dag-Erling Smørgrav said:
Chances are many of those microcontrollers are similar (many are based
on the ARM7 or ARM9 architecture) if not identical, and chances are many
of those run one of a handful of real-time operating systems (VxWorks,
QNX, RTEMS) or even Linux or BSD, all of which are written mostly in C
(thus disproving Jacob's claim that C can't be implemented on Harvard
machines like the ARM9).

Small but important point: I don't think he said that. He said (over
in comp.std.c) that "Harvard architectures are not supported in gcc's
conceptual model", which is not the same thing. I think you posted a
counter-example to that claim as well, but we should avoid putting
words into other people's mouths.
 
Hallvard B Furuseth

Keith said:
The standard says overflow on a signed integer computation
invokes undefined behavior, but overflow on conversion either
yields an implementation-defined result or (in C99) raises an
implementation-defined signal.

I've never understood why the language treats these two kinds of
overflow differently.

char *s;
int c;
while ((c = getchar()) != EOF) { ... *s++ = c; }

Have you ever written code like that?

This can "overflow on coversion" when char is signed. Yet C strings are
char, not unsigned char. That makes it a major pain in the *ss to
handle strings a formally correct way, too much so for my taste. I just
don't worry about it, trusting the market to isolate me from anyone
producing one's complement char implementations or whatever.
 
Keith Thompson

Hallvard B Furuseth said:
char *s;
int c;
while ((c = getchar()) != EOF) { ... *s++ = c; }

Have you ever written code like that?

This can "overflow on coversion" when char is signed. Yet C strings are
char, not unsigned char. That makes it a major pain in the *ss to
handle strings a formally correct way, too much so for my taste. I just
don't worry about it, trusting the market to isolate me from anyone
producing one's complement char implementations or whatever.

Ok, that's a good point. Common idioms like this depend on the
assumption that signed and unsigned chars are interchangeable,
even though the language doesn't support this assumption, and yes,
I've depended on that myself.

But that still doesn't quite explain the discrepancy. In C99,
the above code could easily blow up if the conversion raises an
implementation-defined signal. Even in C90, it could fail badly if
the implementatation-defined result is something odd (like, say,
if the result saturates to CHAR_MAX) -- though it could only fail
for character codes exceeding CHAR_MAX, and historically those were
somewhat unusual. If the behavior of the conversion were undefined,
the situation wouldn't be much worse than it already is.
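
A minimal sketch of the two behaviours under discussion, assuming plain
char is signed and CHAR_BIT is 8 (the variable names are illustrative
only):

#include <stdio.h>

int main(void)
{
    int c;
    while ((c = getchar()) != EOF) {
        /* The common idiom: if plain char is signed and c > CHAR_MAX
           (e.g. 0xE9 read from an 8-bit byte stream), this conversion
           yields an implementation-defined result or, in C99, may
           raise an implementation-defined signal. */
        char ch = (char)c;

        /* The formally safe detour: conversion to unsigned char is
           fully defined (reduction modulo UCHAR_MAX + 1). */
        unsigned char uc = (unsigned char)c;

        printf("%d %d\n", (int)ch, (int)uc);
    }
    return 0;
}

On typical two's complement implementations both assignments store the
same bit pattern, which is why the idiom survives in practice.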
 
Bart

     Okay, fine: C is inherently non-portable, is implemented on only
an insignificant handful of machines, and it takes years to port C
code from one machine to the next.  Useless, a failed language.

My comments were just personal observations. There does seem to be a
need to know the black art of compiler switches and makefile commands
to get these things working, especially when you closely follow the
instructions yet something still doesn't work.

And the fact that gcc for example lists some eleven thousand lines of
compiler switches, last time I looked, some of which have to be just
right, clearly doesn't impact the portability of the language in any
way!
     So why are you wasting your time here?  Life's too unsigned short.

You're right. As a bit of a language designer myself, I'm never going
to be completely happy with C. Time to go back to my poor little x86
compiler and its (IIRC) zero compiler switches (but which,
nevertheless, does the job!).
 
Dag-Erling Smørgrav

Eric Sosman said:
So why are you wasting your time here? Life's too unsigned short.

That is *so* sig-worthy... almost makes me wish I was into sigs.

DES
 
Miles Bader

Stephen Sprunk said:
The back-end part that translates RTL to assembly is quite small, and
that is often all that needs porting for a new architecture. Most "new"
architectures are variations on existing ones, in part due to the
conscious desire to make it easier to port compilers, in which case you
only need to tweak a few things.

A few years ago I ported gcc to a very small (8-bit) processor at work
(the port was never released though :( ).

For the most part, the backend parts were straightforward enough; the
main problem was that the _non-backend_ parts of the compiler made
various assumptions that my processor didn't satisfy -- in particular,
reload needs a certain amount of "space" to work, and a very small
processor with very constrained register usage may not have it. Getting
around reload issues was really annoying, with lots of hacks to both the
backend and to reload itself (a horrible, practically impenetrable,
piece of code).

Things may be better these days, as gcc internals seem to slowly be
getting cleaned up...

-Miles
 
Bart van Ingen Schenau

Bart said:
Or perhaps C is used for (firmware for) a processor inside a phone
say, but that phone will be produced in the millions. Surely it must
help (in programmer effort, performance, code size, any sort of
measure except actual portability) to have a tailored C version for
that system.

It will probably surprise you how portable the firmware for a mobile
phone has to be.
I have worked in that area, and there are several factors that make it a
definite advantage to write portable code.
1. Software development usually begins way before there is any hardware
available. Initially, the software is tested in a simulation environment
on a PC. This implies that the software must build for at least two,
dissimilar, platforms. Only the (parts of) device drivers that handle
the actual communication with the hardware are not tested in this way,
but they make up less than 1% of the total software.
2. The majority of the software in a mobile phone has to be reused for 5
or 6 generations of phone models, with possibly different hardware and
certainly slightly changed requirements. To cope with that,
maintainability takes a front seat. And with maintainability,
portability often comes along.

Bart v Ingen Schenau
 
John Nagle

Francis said:
and assuming a 16-bit int, the addition will overflow.

However this raises the whole issue of narrowing conversions. They are
allowed in C but what should happen if the converted value looses
information?

That's one of those places where the distinction between a coercion
and a conversion has teeth.
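
The quoted example isn't reproduced above, but something along these
lines illustrates both halves of the point (values and names are mine;
assume short is 16 bits):

#include <stdio.h>

int main(void)
{
    int a = 30000, b = 30000;

    /* With a 16-bit int, a + b overflows in the addition itself:
       undefined behaviour.  With a wider int the addition is fine,
       but the narrowing conversion to short loses information, and
       the result is implementation-defined (or an implementation-
       defined signal in C99). */
    short s = (short)(a + b);

    printf("%d\n", (int)s);
    return 0;
}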
 
jacob navia

John Nagle wrote:
If overflow were to be taken seriously, I'd take the position
that assignments can overflow and that such overflows are errors.

Truncation in assignment usually indicates a problem with the
program. If you really want truncation, one may have to write
something like

unsigned char c;
unsigned long d;

c = d & 0xff;

Compilers can and should recognize such idioms and optimize them out.
For signed values, "%" should be used. (The semantics of "%" when
the divisor is positive are well-defined.)

After a few years of using Python more than C and C++, I have to
say that C/C++ now feel deficient in this area. There are probably
more programmers now using languages where overflow is either checked
or handled automatically than are using ones where it isn't. Java
takes a harder line in this area, and has well-defined integer arithmetic
semantics. C was defined before integer hardware representations settled
down. I have used Burroughs signed magnitude machines, and DEC and UNIVAC
36-bit mainframes. But that era is over. We now have standardized on
binary integer representations. Finally.

As I've said for years, this is a fixable problem, with known good
solutions, but I don't expect it to be fixed in C/C++.

John Nagle

Excuse me but I do not see why it can't be fixed in C. My proposal was
precisely in that direction. I have implemented it, and it works.

I have updated my proposal in comp.std.c. Maybe you could take a look?

Thanks
 
John Nagle

Keith said:
Assignment itself cannot *directly* cause an overflow; it just
copies a value into an object. A conversion that's implicit in
an assignment can cause an "overflow", though the standard doesn't
use that term; see C99 6.3.1.3p3:

Otherwise, the new type is signed and the value cannot be
represented in it; either the result is implementation-defined
or an implementation-defined signal is raised.

For consistency, all forms of conversions should be treated alike.
That includes explicit conversions resulting from a cast, as well as
implicit conversions resulting from assignment, argument passing,
return statements, the "usual arithmetic conversions", and whatever
other cases I've forgotten.

I've argued that conversions should be treated like arithmetic
operators; there's obviously some disagreement on that point.

If overflow were to be taken seriously, I'd take the position
that assignments can overflow and that such overflows are errors.

Truncation in assignment usually indicates a problem with the
program. If you really want truncation, one may have to write
something like

unsigned char c;
unsigned long d;

c = d & 0xff;

Compilers can and should recognize such idioms and optimize them out.
For signed values, "%" should be used. (The semantics of "%" when
the divisor is positive are well-defined.)
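
A minimal sketch of the two idioms described above, with illustrative
values; note that C99's "%" truncates toward zero, so a negative
dividend gives a negative remainder:

#include <stdio.h>

int main(void)
{
    unsigned long d = 0x12345UL;
    unsigned char c;

    /* Explicit unsigned truncation: a compiler can recognize the mask
       and emit a plain narrowing store. */
    c = d & 0xff;                     /* c == 0x45 */

    /* For signed values, reduce with "%"; with a positive divisor the
       semantics are well defined, but the result keeps the sign of
       the dividend, so -1000 % 256 is -232 here, not 24. */
    long sd = -1000;
    long r = sd % 256;

    printf("%u %ld\n", (unsigned)c, r);
    return 0;
}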

After a few years of using Python more than C and C++, I have to
say that C/C++ now feel deficient in this area. There are probably
more programmers now using languages where overflow is either checked
or handled automatically than are using ones where it isn't. Java
takes a harder line in this area, and has well-defined integer arithmetic
semantics. C was defined before integer hardware representations settled
down. I have used Burroughs signed magnitude machines, and DEC and UNIVAC
36-bit mainframes. But that era is over. We now have standardized on
binary integer representations. Finally.

As I've said for years, this is a fixable problem, with known good solutions,
but I don't expect it to be fixed in C/C++.

John Nagle
 
Keith Thompson

jacob navia said:
John Nagle wrote: [...]
If overflow were to be taken seriously, I'd take the position
that assignments can overflow and that such overflows are errors.

Truncation in assignment usually indicates a problem with the
program. If you really want truncation, one may have to write
something like

unsigned char c;
unsigned long d;

c = d & 0xff; [...]
As I've said for years, this is a fixable problem, with known good
solutions, but I don't expect it to be fixed in C/C++.

Excuse me but I do not see why it can't be fixed in C. My proposal was
precisely in that direction. I have implemented it, and it works.
[...]

For the particular example shown, assuming the "& 0xff" is dropped,
the behavior is already well defined. As far as the standard is
concerned, there is no overflow for unsigned types. An
implementation, or a new standard, which treated
c = d;
as an error would break existing code.
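
A minimal illustration of that point (values are mine): conversion to
an unsigned type is defined as reduction modulo one more than its
maximum value, so dropping the mask changes nothing when CHAR_BIT is 8.

#include <stdio.h>

int main(void)
{
    unsigned long d = 0x1234567UL;
    unsigned char c;

    c = d;    /* well defined: value reduced modulo UCHAR_MAX + 1 */

    printf("%u %lu\n", (unsigned)c, d & 0xff);    /* both print 103 */
    return 0;
}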
 
