Incrementing variables past limits

Bas Wassink

Hi there,

Does the ANSI standard say anything about incrementing variables past
their limits?

When I compile code like this:

unsigned char x = 255;
x++;
printf ( "%d\n", x );

with GCC, the output is 0.

From an assembly point of view this seems to be logical behaviour, but is
it defined in the standard, or is it implementation dependent?

I've googled for an answer, read K&R2 and the c.l.c FAQ but couldn't find
any decent answer.


Thanks in advance,
Bas Wassink.
 
Ben Pfaff

Bas Wassink said:
Does the ANSI standard say anything about incrementing variables past
their limits?

Yes. Unsigned integer types wrap around. With other types it's
unpredictable and you should avoid doing out-of-bounds arithmetic
with them.
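
In code, the guarantee looks like this (a minimal sketch of the original
example, using UCHAR_MAX rather than a hard-coded 255):

#include <limits.h>
#include <stdio.h>

int main(void)
{
    unsigned char x = UCHAR_MAX;  /* largest unsigned char value */
    x++;                          /* wraps to 0: guaranteed for unsigned types */
    printf("%d\n", x);            /* prints 0 */
    return 0;
}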
 
Eric Sosman

Bas said:
Hi there,

Does the ANSI standard say anything about incrementing variables past
their limits?

When I compile code like this:

unsigned char x = 255;
x++;
printf ( "%d\n", x );

with GCC, the output is 0.

From an assembly point of view this seems to be logical behaviour, but is
it defined in the standard, or is it implementation dependent?

I've googled for an answer, read K&R2 and the c.l.c FAQ but couldn't find
any decent answer.

The result is well-defined for unsigned integers: they
obey the rules of modular ("clock") arithmetic.

For signed integers, there are no guarantees. The
program may trap, or may deliver some implementation-defined
result. Most implementations "wrap around" from the most
positive to the most negative value, but the C language
itself doesn't promise this behavior.
 
Keith Thompson

Bas Wassink said:
Does the ANSI standard say anything about incrementing variables past
their limits?

When I compile code like this:

unsigned char x = 255;
x++;
printf ( "%d\n", x );

with GCC, the output is 0.

From an assembly point of view this seems to be logical behaviour, but is
it defined in the standard, or is it implementation dependent?

I've googled for an answer, read K&R2 and the c.l.c FAQ but couldn't find
any decent answer.

For unsigned types, overflow has well-defined behavior; the result
wraps around. More precisely, the result is reduced modulo the number
that is one greater than the largest value that can be represented by
the resulting type. For unsigned char (assuming UCHAR_MAX==255), the
value 256 reduces to 0; gcc is behaving correctly.

For signed types, overflow causes undefined behavior. Wraparound is
fairly common, but you shouldn't depend on it; some implementations
may produce a trap. (A conversion to a signed type, if the value
cannot be represented, either yields an implementation-defined result
or raises an implementation-defined signal.)
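
A short demonstration of the reduction rule, assuming UCHAR_MAX == 255
so that values are reduced modulo 256:

#include <stdio.h>

int main(void)
{
    unsigned char a = 300;    /* 300 % 256 == 44 on such an implementation */
    unsigned char b = 256;    /* 256 % 256 == 0 */
    printf("%d %d\n", a, b);  /* prints "44 0" */
    return 0;
}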
 
Bas Wassink

Thank you for your quick reply, this will help me with writing clean
and portable 6502/6510 emulation code.

Bas Wassink
 
Derrick Coetzee

Bas said:
Does the ANSI standard say anything about incrementing variables past
their limits?

Others have explained the rules for signed and unsigned integers. Also
note, however, that casting from signed to unsigned, incrementing, and
then casting back won't work. It's the last step that fails; a cast from
an unsigned value to a signed type that cannot represent that value is
implementation-defined (ref C99 6.3.1.3.3). If performance isn't a big
deal, the easiest way to increment a signed integer is a branch:

#include <limits.h>

int signed_incr(int i) {
    if (i == INT_MAX)
        i = INT_MIN;
    else
        i++;
}
 
Peter Nilsson

[Note that 255 need not be the limit for unsigned char.]
x++;
printf ( "%d\n", x );

with GCC, the output is 0.

From an assembly point of view this seems to be logical behaviour, but is
it defined in the standard, or is it implementation dependent?

For unsigned types, overflow has well-defined behavior; the result
wraps around. More precisely, the result is reduced modulo the number
that is one greater than the largest value that can be represented by
the resulting type. For unsigned char (assuming UCHAR_MAX==255), the
value 256 reduces to 0; gcc is behaving correctly.

Unsigned short has scope for problems...

unsigned short us = -1;
us++; /* UB if USHRT_MAX == INT_MAX */

The paranoid can safely use...

us += 1u;
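
The same idiom as a self-contained function (the name is just for
illustration):

#include <limits.h>

unsigned short incr_ushort(unsigned short us)
{
    /* With plain us++, us is promoted to int; on an implementation
       where USHRT_MAX == INT_MAX (16-bit int and short), INT_MAX + 1
       overflows the int and the behaviour is undefined. Adding 1u
       makes the usual arithmetic conversions carry out the addition
       in unsigned int, which is defined to wrap, and the result is
       reduced modulo USHRT_MAX + 1 when stored back into us. */
    us += 1u;
    return us;
}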
 
Micah Cowan

Derrick said:
Others have explained the rules for signed and unsigned integers. Also
note, however, that casting from signed to unsigned, incrementing, and
then casting back won't work. It's the last step that fails; a cast from
an unsigned value to a signed type that cannot represent that value is
implementation-defined (ref C99 6.3.1.3.3). If performance isn't a big
deal, the easiest way to increment a signed integer is a branch:

#include <limits.h>

int signed_incr(int i) {
    if (i == INT_MAX)
        i = INT_MIN;
    else
        i++;
}

Of course, the above would not actually modify the original
value, and doesn't specify a return value, but I think we all
know what you meant.

Actually, in the vast majority of cases, it would be much better
to indicate an error when (i == INT_MAX) than to silently roll it
over. I generally check explicitly for overflow on every arithmetic
operation that might cause one.
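
A minimal sketch of that style (checked_add is an illustrative name,
not a standard function): test whether the sum would overflow before
computing it, since the overflow itself is undefined behaviour:

#include <limits.h>

/* Returns 1 and stores a + b in *sum on success;
   returns 0 if the addition would overflow. */
int checked_add(int a, int b, int *sum)
{
    if ((b > 0 && a > INT_MAX - b) ||
        (b < 0 && a < INT_MIN - b))
        return 0;   /* would overflow: report an error instead */
    *sum = a + b;
    return 1;
}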

As an absurdly extreme--but true--case, failure to catch such a
rollover caused massive radiation overexposure, and even death,
to patients treated by the infamous Therac-25 linear accelerator
used for radiation therapy.

http://courses.cs.vt.edu/~cs3604/lib/Therac_25/Therac_1.html
 
Bas Wassink

[Note that 255 need not be the limit for unsigned char.]

So, I would be better off using an unsigned int for representing a byte
and performing the wrap-around myself, right?

Can I rely on fputc to write an eight-bit value to a file (passing "wb"
to fopen), or is this implementation dependent too?
 
infobahn

Bas said:
[Note that 255 need not be the limit for unsigned char.]

So, I would be better off using an unsigned int for representing a byte
and performing the wrap-around myself, right?

Can I rely on fputc to write an eight-bit value to a file (passing "wb"
to fopen), or is this implementation dependent too?

The Standard says:

"The fputc function writes the character specified by c (converted
to an unsigned char) to the output stream pointed to by stream, at
the position indicated by the associated file position indicator for
the stream (if defined), and advances the indicator appropriately.
If the file cannot support positioning requests, or if the stream
was opened with append mode, the character is appended to the output
stream."

So the answer is that CHAR_BIT bits will be written. This must
be at least 8, but can be higher.
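
One way to make the eight-bit intent explicit (write_octet is a
hypothetical helper, not a standard function) is to mask the value down
to 0..255 before handing it to fputc. On a CHAR_BIT == 8 implementation
this writes exactly one octet; on a wider-byte machine the value still
fits, but each byte in the file is larger than eight bits:

#include <stdio.h>

int write_octet(FILE *fp, unsigned int value)
{
    /* fputc converts its argument to unsigned char; the mask makes
       the reduction to eight bits explicit */
    return fputc(value & 0xFFu, fp);
}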
 
Chris Croughton

Others have explained the rules for signed and unsigned integers. Also
note, however, that casting from signed to unsigned, incrementing, and
then casting back won't work. It's the last step that fails; a cast from
an unsigned value to a signed type that cannot represent that value is
implementation-defined (ref C99 6.3.1.3.3). If performance isn't a big
deal, the easiest way to increment a signed integer is a branch:

How about straight assignment? If I do:

int test(int i)
{
    unsigned int u;
    u = i;
    return (int)u;
}

is that defined? I have a suspicion that it isn't (because although u
can represent a negative value of i in some implementation-defined way,
the conversion back to an int may not be able to represent that unsigned
value).

More usefully, is there any real computer and C compiler on which it
will fail in practice?

Chris C
 
Eric Sosman

Bas said:
[Note that 255 need not be the limit for unsigned char.]


So, I would be better off using an unsigned int for representing a byte
and performing the wrap-around myself, right ?

"Better" depends on your purposes; there is no "best"
way to define "better."

Machines with 8-bit bytes are the overwhelming majority
these days, the principal exceptions being processors that
are specially tailored to particular tasks like signal
processing. If you can accept the limitation of only
running on "mainstream" systems, just go ahead and write
code that assumes an 8-bit byte. But you should insert a
simple test for the sake of the poor sod who may someday
try to run your code on a machine with 11- or 32-bit bytes:

#include <limits.h>
#if UCHAR_MAX != 255
#error "Sorry; must have an 8-bit byte"
#endif

A good, solid error at compile-time will save him a lot of
futile effort; he may not thank you for assuming the 8-bit
byte, but he'll at least not curse you for the time he
spends trying to fix your "bugs."
 
CBFalconer

Chris said:
.... snip ...

How about straight assignment? If I do:

int test(int i)
{
    unsigned int u;
    u = i;
    return (int)u;
}

is that defined? I have a suspicion that it isn't (because although
u can represent a negative value of i in some implementation-defined
way, the conversion back to an int may not be able to represent that
unsigned value).

You answered your own question.
More usefully, is there any real computer and C compiler on which it
will fail in practice?

Define fail. Remember, these casts are value transformations, not
bodily transference of bit patterns. Your attitude reminds me of
Microsoft, who at least have the excuse that all their software
runs on the same hardware family. That attitude will lead to
similar reliability.
 
Keith Thompson

Chris Croughton said:
How about straight assignment? If I do:

int test(int i)
{
    unsigned int u;
    u = i;
    return (int)u;
}

is that defined? I have a suspicion that it isn't (because although u
can represent a negative value of i in some implementation-defined way,
the conversion back to an int may not be able to represent that unsigned
value).

If i is non-negative, there's no problem at all.

If i is negative, the conversion from signed to unsigned depends on
the implementation-defined value of UINT_MAX, but is otherwise defined
by the language (even on systems that use a representation other than
2's-complement).

But the conversion from unsigned to signed, if the result can't be
represented, either yields an implementation-defined result or raises
an implementation-defined signal (no nasal demons in this case). Note
that "implementation-defined" means the implementation has to document
its behavior.
More usefully, is there any real computer and C compiler on which it
will fail in practice?

If by "fail" you mean doing something other than returning the
original value of i, I don't know.
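
For what it's worth, the round trip can be made fully portable by
inverting the (mathematically defined) signed-to-unsigned conversion by
hand. A sketch, assuming the common layout where
UINT_MAX == 2U * INT_MAX + 1 and INT_MIN == -INT_MAX - 1:

#include <limits.h>

int to_signed(unsigned int u)
{
    if (u <= (unsigned int)INT_MAX)
        return (int)u;               /* representable, so exact */
    /* A negative i was converted by adding UINT_MAX + 1, so invert
       that without overflowing: u == UINT_MAX yields -1, and
       u == INT_MAX + 1u yields INT_MIN. */
    return -(int)(UINT_MAX - u) - 1;
}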
 
Chris Croughton

Define fail.

Not give the same number in the signed variable as it started with.
I.e. is there any real world situation where

i != (int)(unsigned int)i

is true (for i of type int)? Has there ever been such a system? Would
anyone in their right mind make such a system, and would anyone buy it
if they did?
Remember, these casts are value transformations, not
bodily transference of bit patterns. Your attitude reminds me of
Microsoft, who at least have the excuse that all their software
runs on the same hardware family. That attitude will lead to
similar reliability.

There are probably thousands if not millions of programs out there which
already do that sort of thing. It's certainly not limited to one
processor family, every processor I have ever heard of has the property
that an unsigned variable can act as a container for the signed value of
the same size, and can restore it. Quite a lot of printf code would
fail, for a start, as would a lot of code using variadic functions...

Have you ever heard of any processor where that wasn't true?

(For that matter, is anyone still producing machines with 1's complement
or sign+magnitude integers?)

Chris C
 
Eric Sosman

Chris said:
[... int i = -42; assert(i == (int)(unsigned)i); ...]

Have you ever heard of any processor where that wasn't true?

(For that matter, is anyone still producing machines with 1's complement
or sign+magnitude integers?)

I've personally used ones' complement and signed magnitude
(decimal!) computers. Admittedly, it was a while ago. But the
fact that I've seen them dwindle away doesn't suggest to me that
they're gone forever; rather, it suggests that techniques in the
computer industry are subject to change. The assumption that
things will remain forever the way they happen to be today has
not been tenable up to now; are you confident that Change has
come to its end?

Or as a colleague of mine likes to say, "We work in a
fashion-driven industry."

Preparing for every conceivable raising or lowering of
computers' hemlines carries a cost, and failing to be prepared
carries a risk. In the context of a given project you may
well decide that the risk is too small and the cost too large.
That's fine; that's part of what engineering is about. But an
implicit decision that all risks are zero is just as foolhardy
as a decision that all costs are justified.

BTW, what are your thoughts on the C0x Committee's decision
to allow balanced ternary integers? ;-)
 
Chris Croughton

I've personally used ones' complement and signed magnitude
(decimal!) computers.

I've used a 1's-C computer and a signed magnitude one -- almost 30 years
ago! I haven't used or even seen either since the late '70s, though.
Do Burroughs still make computers? I think theirs was the S/M one.
Admittedly, it was a while ago. But the
fact that I've seen them dwindle away doesn't suggest to me that
they're gone forever; rather, it suggests that techniques in the
computer industry are subject to change. The assumption that
things will remain forever the way they happen to be today has
not been tenable up to now; are you confident that Change has
come to its end?

Any change which means that a signed value can't be cast to its unsigned
equivalent and then back would, I think, break a lot of code. Yes,
things might change (like using balanced ternary) but it would break an
enormous amount of code. The change from BCD to binary was bad enough
(and there are still BCD based languages).
Or as a colleague of mine likes to say, "We work in a
fashion-driven industry."

After a fashion <g>. Although some of the 'fashions' have lasted a long
time for such things, the 8 bit byte for instance (I suspect that almost
all of several major operating systems plus their utilities would have
to be rewritten for a different-sized byte). Even Unicode is only
gradually becoming accepted.
Preparing for every conceivable raising or lowering of
computers' hemlines carries a cost, and failing to be prepared
carries a risk. In the context of a given project you may
well decide that the risk is too small and the cost too large.
That's fine; that's part of what engineering is about. But an
implicit decision that all risks are zero is just as foolhardy
as a decision that all costs are justified.

I didn't say anything about such a decision. I'm looking at risk
assessment -- is it really worth writing code which will be inefficient
and hard to maintain in order to cope with a possible hole which the
standard allows but no one is likely to implement that way? Is the
probability of someone producing a system which breaks a lot of code
higher than that of the next C standard breaking code? (Anyone who used
a variable called 'restrict' or 'inline' will have run foul of that in
C99).
BTW, what are your thoughts on the C0x Committee's decision
to allow balanced ternary integers? ;-)

I'd like to see how they propose to square it with all the references to
'binary' in the C specification <g>. Yes, it is possible to emulate bit
operations using b-tits[1] but C as we know it would not be an efficient
language to program such a machine...

[1] Ternary digits ought to be called tits. If they aren't, someone was
slipping when they were named[2]...

[2] Robert A. Heinlein used ternary in some of his futuristic computers.
Knowing his proclivities, how did he miss calling them tits?

Chris C
 
