"Might be undefined" Behaviour

Frederick Gotham · Nov 21, 2006

I have a general idea of the different kinds of behaviour described by the
C Standard, such as:

(1) Well-defined behaviour:

int a = 2, b = 3;

int c = a + b;

(Jist: The code will work perfectly.)

(2) Implementation-defined behaviour

unsigned i = -1;

if (i > 65535) DoSomething();
else DoSomethingElse();

(Jist: Different things can happen on different platforms, but the
program shouldn't crash.)

(3) Unspecified Behaviour

int i = Func1() + Func2();

(Jist: We don't know which function is called first.)

(4) Undefined Behaviour

int i = INT_MAX;

++i;

(The implementation can do whatever it likes, and the program may very
well crash.)

I'm looking for a term though to describe a code snippet which _might_
invoke undefined behaviour. Here's an example:

int i = 32767;

++i;

Given the minimum range of "int", this code may fail on some systems and
succeed on others. I hesitate though to simply label it as "undefined
behaviour". How would I describe a code snippet which may invoke undefined
behaviour depending on implementation-specific details?

Eric Sosman · Nov 21, 2006

Frederick Gotham wrote On 11/21/06 16:43,:

I have a general idea of the different kinds of behaviour described by the
C Standard, such as:

(1) Well-defined behaviour:

int a = 2, b = 3;

int c = a + b;

(Jist: The code will work perfectly.)

"Gist."

(2) Implementation-defined behaviour

unsigned i = -1;

if (i > 65535) DoSomething();
else DoSomethingElse();

(Jist: Different things can happen on different platforms, but the
program shouldn't crash.)

(3) Unspecified Behaviour

int i = Func1() + Func2();

(Jist: We don't know which function is called first.)

(4) Undefined Behaviour

int i = INT_MAX;

++i;

(The implementation can do whatever it likes, and the program may very
well crash.)

I'm looking for a term though to describe a code snippet which _might_
invoke undefined behaviour. Here's an example:

int i = 32767;

++i;

Given the minimum range of "int", this code may fail on some systems and
succeed on others. I hesitate though to simply label it as "undefined
behaviour". How would I describe a code snippet which may invoke undefined
behaviour depending on implementation-specific details?

"It is implementation-defined whether the behavior is
defined or undefined." Note that mere "implementation-defined"
doesn't quite cover the situation, because an implementation
where INT_MAX==32767 is not required to define the behavior of
this code.
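
A minimal sketch of that dependence, assuming nothing beyond <limits.h>. The function name is made up for illustration; it only reports whether incrementing an int that holds 32767 is defined on the implementation compiling it.

```c
#include <limits.h>

/* Hypothetical helper: returns 1 if "int i = 32767; ++i;" is well defined
   on this implementation, 0 if the increment would overflow (undefined). */
int increment_of_32767_is_defined(void)
{
#if INT_MAX > 32767
    return 1;   /* int is wider than the minimum: 32768 is representable */
#else
    return 0;   /* INT_MAX == 32767: ++i would overflow */
#endif
}
```

The answer is fixed at compile time, which is the point: the same source is defined on one implementation and undefined on another.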

Perhaps we could abbreviate this notion with the phrase
"implementation-undefined."

CBFalconer · Nov 21, 2006

Frederick said:
.... snip ...

I'm looking for a term though to describe a code snippet which
_might_ invoke undefined behaviour. Here's an example:

int i = 32767;

++i;

Given the minimum range of "int", this code may fail on some
systems and succeed on others. I hesitate though to simply label
it as "undefined behaviour". How would I describe a code snippet
which may invoke undefined behaviour depending on implementation-
specific details?

#include <limits.h>

....

if (INT_MAX == i) overflowerror();
else ++i;
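
The same guard, wrapped in a function so it can be compiled on its own. This is only a sketch: checked_increment and its return convention are invented here, standing in for the overflowerror() call above.

```c
#include <limits.h>

/* Hypothetical helper: increments *p unless that would overflow.
   Returns 0 on success, -1 if *p is already INT_MAX (value left untouched). */
int checked_increment(int *p)
{
    if (*p == INT_MAX)
        return -1;      /* incrementing would be undefined behaviour */
    ++*p;
    return 0;
}
```

Because the test compares against INT_MAX rather than 32767, it stays correct whatever range the implementation gives int.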

Peter Nilsson · Nov 22, 2006

Frederick said:
I have a general idea of the different kinds of behaviour described by the
C Standard, such as:

(1) Well-defined behaviour:

int a = 2, b = 3;

int c = a + b;

(Jist: The code will work perfectly.)

Your next example is perfectly well defined too IMO.

The standard defines the term strictly conforming _program_. Though it
qualifies it as _output_ behaviour. For example, the following is a
strictly conforming program even though some of the operations use
implementation defined or unspecified values and results...

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned x = UINT_MAX;
    unsigned y = -1u / 2;
    unsigned z = x / y;
    printf("The answer to life, the universe and everything is... ");
    printf("%u\n", z * 21);
    return 0;
}

(2) Implementation-defined behaviour

unsigned i = -1;
if (i > 65535) DoSomething();
else DoSomethingElse();

An implementation is not required to document the behaviour of that
piece of code. Note that the standard already documents the behaviour
of the first declaration and the behaviour of the if conditional and
statement as a whole.

(Jist: Different things can happen on different platforms, but the
program shouldn't crash.)

Implementation defined for me means something like: (-5 >> 1)

(3) Unspecified Behaviour

int i = Func1() + Func2();

(Jist: We don't know which function is called first.)

(4) Undefined Behaviour

int i = INT_MAX;
++i;

Note that there is a stronger class of undefined behaviour, namely
constraint violations. [In a clc thread a while back, committee members
stated that the behaviour was undefined even if the standard does in fact
specify the behaviour in normative text outside of the constraint.]

(The implementation can do whatever it likes, and the program may very
well crash.)

I'm looking for a term though to describe a code snippet which _might_
invoke undefined behaviour. Here's an example:

int i = 32767;
++i;

Given the minimum range of "int", this code may fail on some systems and
succeed on others. I hesitate though to simply label it as "undefined
behaviour".

How would I describe a code snippet which may invoke undefined
behaviour depending on implementation-specific details?

Schrödinger C?

Potentially undefined behaviour?

"Not portable" seems to be quite common.

Christopher Benson-Manica · Nov 22, 2006

Frederick Gotham said:
I have a general idea of the different kinds of behaviour described by the
C Standard, such as:

(snip)

http://c-faq.com/ansi/undef.html

Frederick Gotham · Nov 22, 2006

Peter Nilsson:

The standard defines the term strictly conforming _program_. Though it
qualifies it as _output_ behaviour.

I realise that.

For example, the following is a strictly conforming program even though
some of the operations use implementation defined or unspecified values
and results...

#include <stdio.h>
#include <limits.h>

int main(void)
{
    unsigned x = UINT_MAX;
    unsigned y = -1u / 2;
    unsigned z = x / y;
    printf("The answer to life, the universe and everything is... ");
    printf("%u\n", z * 21);
    return 0;
}

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation. An
example would be:

#include <stdio.h>

int main(void) { printf("%d", 42); }

A counter-example would be:

#include <limits.h>
#include <stdio.h>

int main(void) { printf("%d", INT_MAX); }

(By the way, I don't see why we're talking about "strictly conforming
programs".)

An implementation is not required to document the behaviour of that
piece of code.

Yes, it is. It must document the range of "unsigned int", thus indicating
which leg of the "if" statement will be executed.

Richard Heathfield · Nov 22, 2006

Frederick Gotham said:

Peter Nilsson:

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation.

....such as the one quoted above, for example.

Clark S. Cox III · Nov 22, 2006

Frederick said:
Peter Nilsson:

I realise that.

You sure about that? My own understanding of a "strictly conforming
program" is a program whose output is identical on every implementation.

The program above *does* have output that is identical on every platform.

Frederick Gotham · Nov 22, 2006

Richard Heathfield:

...such as the one quoted above, for example.

I read as far as the following line:

unsigned y = -1u / 2;

, and I didn't know what happened when the negation operator was applied to
an unsigned integer type, so I gave up!

I'll take your word for it that the output is identical on every
implementation.

Guest · Nov 22, 2006

Frederick said:
Richard Heathfield:

I read as far as the following line:

unsigned y = -1u / 2;

, and I didn't know what happened when the negation operator was applied to
an unsigned integer type, so I gave up!

I'll take your word for it that the output is identical on every
implementation.

Unsigned arithmetic is performed modulo TYPE_MAX+1, so -1u is simply
UINT_MAX. If I recall correctly, you've used (unsigned char) -1 as an
alternative to UCHAR_MAX in the past. The principle here is the same.
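
A small sketch of both identities, with function names invented for the purpose. Each function returns 1 when the identity holds, which should be the case on any conforming implementation.

```c
#include <limits.h>

/* -1u is the unsigned constant 1u negated modulo UINT_MAX + 1. */
int minus_one_u_is_uint_max(void)
{
    return -1u == UINT_MAX;
}

/* Converting -1 to unsigned char reduces it modulo UCHAR_MAX + 1. */
int cast_minus_one_is_uchar_max(void)
{
    return (unsigned char)-1 == UCHAR_MAX;
}
```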

Richard Heathfield · Nov 22, 2006

Frederick Gotham said:

Richard Heathfield:

I read as far as the following line:

unsigned y = -1u / 2;

, and I didn't know what happened when the negation operator was applied
to an unsigned integer type, so I gave up!

I'll take your word for it that the output is identical on every
implementation.

Oh, no no no no no, don't do that. Let's demonstrate it. It's much more
educational.

Firstly, we need to know that C does arithmetic on unsigned integer types
like this: let's say the unsigned type can represent values in the range 0
to N-1 (i.e. N different values, so it has log2(N) bits). And let's say
that we try to stick a value into it that isn't in that range. Well, what
happens is that, if it's below the range, you can (conceptually) keep
adding N to the value over and over until it hits the range, and if it's above
the range, you can keep subtracting N until, again, it hits the range. This
is called "reducing the value modulo N" (even though it might mean
increasing the value!).

So -1u actually becomes -1u + N (where, of course, N is (UINT_MAX + 1) in
this case). -1 + UINT_MAX + 1 is, naturally enough, UINT_MAX, and so in our
example program y ends up with UINT_MAX / 2, which is going to be no more
than half the value of UINT_MAX (and it could be a smidgen less, of
course). Clearly, this will divide into UINT_MAX twice (possibly leaving a
small remainder which will be lost because of integer division rules), so
the program calculates the value 2, irrespective of the value of UINT_MAX.

And then of course it prints a value equal to 21 times this result - on any
hosted implementation.
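
The arithmetic above can be checked without the printf calls. The sketch below repeats the three declarations from the quoted program inside a function (the name answer is made up here) and returns the value that would be printed.

```c
#include <limits.h>

/* Same computation as the quoted program, minus the output. */
unsigned answer(void)
{
    unsigned x = UINT_MAX;   /* 2^n - 1 for some n >= 16            */
    unsigned y = -1u / 2;    /* UINT_MAX / 2, i.e. 2^(n-1) - 1      */
    unsigned z = x / y;      /* 2, with a remainder of 1 discarded  */
    return z * 21;
}
```

Since UINT_MAX = 2 * (UINT_MAX / 2) + 1, the quotient is 2 whatever the width of unsigned int, so the function returns 42.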

Frederick Gotham · Nov 22, 2006

Richard Heathfield:

So -1u actually becomes -1u + N (where, of course, N is (UINT_MAX + 1)
in this case).

Let's say that:

sizeof(long) > sizeof(int)

And let's say that we want to store the max value for an "unsigned int"
inside a long.

We could write:

long i = UINT_MAX;

or:

long i = (unsigned)-1;

or:

long i = -1U;

That right? Should the third one be interpreted as:

(1) Take the R-value:

Type: int unsigned
Value: 1

(2) Now let's go to our magical world of omnipotent mathematics, where
nothing overflows and where any number can be represented. In this magical
world, we negate the number 1, yielding -1.

(3) Now let's get back to the real world of C and try to store -1 in an
unsigned int.

Would that sound about right? How about trying to store the max value for
an unsigned short in a long? Would I be right in thinking that the
following would _not_ be OK:

long i = -(short unsigned)1;

Reason being that, before it's negated, it's probably promoted to "int",
yielding an actual -1 which is then stored in the long? Exactly like:

long i = -(int)(short unsigned)1;

Even if it promoted to unsigned int rather than signed int, it still
wouldn't yield the max value for an unsigned short, right? Because it would
be as follows:

long i = -(unsigned)(short unsigned)1;

This would give us the max value for an unsigned int rather than an
unsigned short, right? (Yes I realise they might be equal.)

So what have I learned? Well, it doesn't seem like I'd ever find the need
to negate an unsigned integer.
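
A sketch of the unsigned short case, assuming the common situation where USHRT_MAX <= INT_MAX so that unsigned short promotes to int. The function names are invented for illustration.

```c
#include <limits.h>

/* The operand is promoted before negation. Where unsigned short promotes
   to int (USHRT_MAX <= INT_MAX), this is a plain -1, not USHRT_MAX. */
long negate_promoted_ushort(void)
{
    return -(unsigned short)1;
}

/* Converting -1 to unsigned short does yield USHRT_MAX; the conversion
   happens in the narrow type, so the later promotion cannot change it. */
long ushort_max_via_cast(void)
{
    return (unsigned short)-1;
}
```

On an implementation where unsigned short promotes to unsigned int instead, the first function would produce UINT_MAX before the conversion to long, as the post above suspects.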

Richard Heathfield · Nov 22, 2006

Frederick Gotham said:

So what have I learned? Well, it doesn't seem like I'd ever find the need
to negate an unsigned integer.

<shrug> -1 is a kind of shorthand, an idiom, for "largest possible value of
this (unsigned) type". Nobody's forcing you to use it. You may, however,
come across it when reading other people's code, so it's as well to be
aware of it.

Frederick Gotham · Nov 22, 2006

Richard Heathfield:

<shrug> -1 is a kind of shorthand, an idiom, for "largest possible value of
this (unsigned) type". Nobody's forcing you to use it. You may, however,
come across it when reading other people's code, so it's as well to be
aware of it.

Ah yes, I've no problem with using negative values to achieve a particular
unsigned value, such as:

size_t i = -1;

But this is because the literal, 1, is signed rather than unsigned. It's
equivalent to:

size_t i = -(int)1;

Note that it doesn't negate an unsigned integer type object.

The following, however, _does_ negate an unsigned integer type object:

size_t i = 1;

size_t len = -i;

It just doesn't look right to me...
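
For comparison, a sketch that evaluates both forms. It assumes a C99 <stdint.h> for SIZE_MAX, and the function names are made up.

```c
#include <stddef.h>
#include <stdint.h>   /* SIZE_MAX (C99) */

/* Negates an unsigned object: the result is reduced modulo SIZE_MAX + 1. */
size_t negated_one(void)
{
    size_t i = 1;
    return -i;
}

/* Converts the int value -1 to size_t. */
size_t minus_one_literal(void)
{
    size_t i = -1;
    return i;
}
```

Both functions return SIZE_MAX; the difference is only in whether the wrap-around happens during the negation or during the conversion.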

CBFalconer · Nov 22, 2006

Frederick said:
Richard Heathfield:

I read as far as the following line:

unsigned y = -1u / 2;

, and I didn't know what happened when the negation operator was applied to
an unsigned integer type, so I gave up!

That results in the constant UINT_MAX, which is odd. Division by 2
results in (UINT_MAX - 1) / 2. The result is system dependent.
The final result (UINT_MAX / y) is always 2. The printf emits
Richard's favorite value, 42. Straightforward.

Eric Sosman · Nov 22, 2006

Richard Heathfield wrote On 11/22/06 14:50,:

Frederick Gotham said:

<shrug> -1 is a kind of shorthand, an idiom, for "largest possible value of
this (unsigned) type". Nobody's forcing you to use it. You may, however,
come across it when reading other people's code, so it's as well to be
aware of it.

... and to pay attention to the details. I once spent
time chasing a bug whose root cause was (paraphrased)

unsigned long x = -1u;

.... as a shorthand for "Fill `x' with 1-bits."
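
A sketch of why that shorthand goes wrong, with invented function names. The constant -1u has type unsigned int, so its value is UINT_MAX before it is ever converted to unsigned long; where unsigned long is wider, the upper bits stay zero.

```c
#include <limits.h>

/* The paraphrased bug: yields UINT_MAX, which is all 1-bits only if
   unsigned long is no wider than unsigned int. */
unsigned long buggy_all_ones(void)
{
    unsigned long x = -1u;
    return x;
}

/* One way to get the intended value: the int -1 is converted directly
   to unsigned long, giving ULONG_MAX. */
unsigned long all_ones(void)
{
    unsigned long x = -1;
    return x;
}
```

On an implementation where ULONG_MAX == UINT_MAX the two agree, which is presumably how such a bug can go unnoticed until the code is ported.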

Keith Thompson · Nov 22, 2006

Frederick Gotham said:
Note that it doesn't negate an unsigned integer type object.

The following, however, _does_ negate an unsigned integer type object:

size_t i = 1;

size_t len = -i;

It just doesn't look right to me...

Well, it is. You might consider adjusting your expectations.

Giorgio Silvestri · Nov 22, 2006

Frederick Gotham said:
Richard Heathfield:

Let's say that:

sizeof(long) > sizeof(int)

And let's say that we want to store the max value for an "unsigned int"
inside a long.

We could write:

long i = UINT_MAX;

or:

long i = (unsigned)-1;

or:

long i = -1U;

sizeof(long) > sizeof(int)

is not particularly interesting.

Probably you want:

LONG_MAX >= UINT_MAX

If you consider "padding bits" the following is possible:

sizeof(long) > sizeof(int)

and

LONG_MAX < UINT_MAX
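
A sketch of testing the range rather than the size, using only <limits.h>. The helper name and its out-parameter convention are assumptions made for this example.

```c
#include <limits.h>

/* Hypothetical helper: stores UINT_MAX in *out and returns 1 if long can
   represent it; otherwise leaves *out alone and returns 0. */
int uint_max_as_long(long *out)
{
#if LONG_MAX >= UINT_MAX
    *out = (long)UINT_MAX;
    return 1;
#else
    (void)out;   /* long cannot hold UINT_MAX on this implementation */
    return 0;
#endif
}
```

The preprocessor comparison looks at the value ranges directly, so padding bits or equal widths cannot mislead it the way a sizeof comparison can.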

Frederick Gotham · Nov 22, 2006

Giorgio Silvestri:

sizeof(long) > sizeof(int)

is not particularly interesting.

Probably you want:

LONG_MAX >= UINT_MAX

If you consider "padding bits" the following is possible:

sizeof(long) > sizeof(int)

and

LONG_MAX < UINT_MAX

Yes I realise that. However, I've overstepped my year's quota for mentioning
IMAX_BITS.

Mark McIntyre · Nov 22, 2006

Ah yes, I've no problem with using negative values to achieve a particular
unsigned value, such as:

size_t i = -1;
but...

size_t i = 1;
size_t len = -i;

It just doesn't look right to me...

The two are absolutely identical, and any decent compiler would
optimise the latter into the former.

It's worth reviewing one's prejudices occasionally, in case they're
invalid.
--
Mark McIntyre

"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it."
--Brian Kernighan
