Strange - a simple assignment statement shows error in VC++ but works in gcc !

peter koch

[...]> But i++ is inherently safe - at least on most common architectures. And
certainly there's nothing complicated going on, it is obvious what
happens.

[...]

No, it's not inherently safe.

    int i = INT_MAX;
    i++;

The "i++" in the above invokes undefined behavior.  Perhaps you're
thinking that it will wrap around to INT_MIN, but the standard doesn't
guarantee that.

Yes. That was why I added the disclaimer "on most common
architectures".

/Peter
 
Keith Thompson

peter koch said:
[...]> But i++ is inherently safe - at least on most common architectures. And
certainly there's nothing complicated going on, it is obvious what
happens.

[...]

No, it's not inherently safe.

    int i = INT_MAX;
    i++;

The "i++" in the above invokes undefined behavior.  Perhaps you're
thinking that it will wrap around to INT_MIN, but the standard doesn't
guarantee that.

Yes. That was why I added the disclaimer "on most common
architectures".

But even on an architecture on which the operation itself is "safe",
an optimizing compiler is free to rearrange the code based on the
assumption that no undefined behavior occurs.

Consider the following program:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

int ident(int i)
{
    if (rand() < 0) return 42;
    else return i;
}

int main(void)
{
    int i = ident(INT_MAX);
    int j = i + 1;
    if (i < j) fputs("Yes ", stdout);
    else fputs("No ", stdout);
    if (ident(i) < ident(j)) puts("Yes");
    else puts("No");
    return 0;
}


The ident() function is designed to return its argument value while
being too difficult for the compiler to analyze.

When I compile this with "gcc -O0" or "gcc -O1", it prints "No No".
When I compile it with "gcc -O2", it prints "Yes No". When I compile
it with "gcc -O3", it prints "Yes Yes" (I'm not quite sure how that
happens).

As soon as you invoke undefined behavior, all bets are off.
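
For anyone who wants the increment without the undefined behavior, here is a minimal sketch of the usual approach: test against INT_MAX before adding, so the overflow never happens. The name checked_increment is made up for this illustration; it is not from the standard library.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: increments *value only when the result is
   representable as an int, so signed overflow never occurs. */
bool checked_increment(int *value)
{
    if (*value == INT_MAX) {
        return false;   /* incrementing would overflow: leave *value alone */
    }
    ++*value;
    return true;
}
```

The caller decides what to do when the function returns false; the point is only that the check comes before the arithmetic, not after it.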
 
jameskuyper

Rainer said:
Too many people cannot get their heads around the idea that
'undefined' implies 'nobody knows anything about it'

Good - because that's not what it means. It means "the C standard
provides no guarantees". Another standard, such as POSIX, or a
standard ABI for a given platform, or even the implementation itself,
may provide guarantees that the C standard does not, and there might
be a very large number of people who know about those guarantees.

Even if there are no official guarantees from any source, it's
entirely possible that there's at least one person (and probably a
large number of persons) who knows what any particular implementation
will actually do with a particular kind of code construct with
undefined behavior - at the very least, I would certainly hope that
whoever actually designed the relevant part of that implementation
would have some idea what it actually does - if not, I wouldn't
recommend relying upon that implementation.
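
One concrete case of an implementation documenting something the C standard leaves open: GCC and Clang provide __builtin_add_overflow, and GCC documents the -fwrapv option, which makes signed overflow wrap. The sketch below assumes GCC 5 or later or a recent Clang; it is not ISO C, and add_checked is a name invented for this example.

```c
#include <limits.h>   /* INT_MAX, for callers */
#include <stdbool.h>

/* Relies on __builtin_add_overflow, a GCC/Clang extension (an assumption
   about the compiler in use). Returns true when a + b fits in an int.
   *sum receives the result; on overflow it holds the wrapped value and
   the function returns false. */
bool add_checked(int a, int b, int *sum)
{
    return !__builtin_add_overflow(a, b, sum);
}
```

With such a compiler, add_checked(INT_MAX, 1, &sum) reports failure instead of invoking undefined behavior. Code that uses it is tied to implementations that document the builtin.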
... and not 'program
will crash' and too many other people try to reinforce this notion by
wandering off into long strains of phantastic speculation.

The fantastic speculation serves mainly to draw people's attention to
the important point: the lack of constraints imposed by the C standard
when the behavior is undefined. That there are other constraints that
might apply is also an important point, but it can be discussed after
you get their attention; it's a fact that doesn't lend itself as
readily to serving as an attention-getting device. There might be
other reasons why nasal demons are not a possibility, even when the
code has undefined behavior, but it is perfectly accurate to point out
that constraints imposed by the C standard are NOT among those
reasons.
Neither can I.

Nonetheless, such mechanisms do exist (I've already mentioned them in
vague terms that should have been sufficiently specific for your
imagination to fill in the details). They are in use, and compilers
using them are conforming, and it's important for developers to be
aware of that fact if they care about the portability of their code.
It's not very important if they're sure they'll never use a compiler
using such aggressive optimization techniques.
 
Rainer Weikusat

jameskuyper said:
Good - because that's not what it means. It means "the C standard
provides no guarantees".

Since the C-standard doesn't 'guarantee' anything, that's hardly a
surprise. It's a definition of 'requirements' (for conforming
implementations). But this digression into yet another back alley
doesn't change the fact that something which is not defined has no
properties known to anyone (And I can provide, provided I am not
getting tired of this earlier, semantically equivalent variants of 'we
know nothing about it' until everyone with at least half a brain has
understood that you are playing dumb on purpose).

[...]
Nonetheless, such mechanisms do exist (I've already mentioned them in
vague terms that should have been sufficiently specific for your
imagination to fill in the details). They are in use, and compilers
using them are

.... certainly an interesting topic in their own right, but I wasn't
writing about the topic you are wandering into, after mutilating the
text you pretend to be replying to in order to get rid of its content.
In particular, I was writing about "the C language", which is a topic
of some interest to me. Compiler-folklore, be it historical or
mythical, isn't.
 
Keith Thompson

Golden California Girls said:
Keith said:
peter koch said:
[...]
But i++ is inherently safe - at least on most common
architectures. And certainly there's nothing complicated going
on, it is obvious what happens.
[The quoting was messed up a bit; I think I've corrected it.]
[...]

No, it's not inherently safe.

int i = INT_MAX;
i++;

The "i++" in the above invokes undefined behavior. Perhaps you're
thinking that it will wrap around to INT_MIN, but the standard doesn't
guarantee that.
Yes. That was why I added the disclaimer "on most common
architectures".

But even on an architecture on which the operation itself is "safe",
an optimizing compiler is free to rearrange the code based on the
assumption that no undefined behavior occurs.

Consider the following program:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

int ident(int i)
{
if (rand() < 0) return 42;
else return i;
}

int main(void)
{
int i = ident(INT_MAX);
int j = i + 1;
if (i < j) fputs("Yes ", stdout);
else fputs("No ", stdout);
if (ident(i) < ident(j)) puts("Yes");
else puts("No");
return 0;
}


The ident() function is designed to return its argument value while
being too difficult for the compiler to analyze.

When I compile this with "gcc -O0" or "gcc -O1", it prints "No No".
When I compile it with "gcc -O2", it prints "Yes No". When I compile
it with "gcc -O3", it prints "Yes Yes" (I'm not quite sure how that
happens).

try with gcc -O3:
#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

int ident(int i)
{
if (rand() < 0) return 42;
else return i;
}

int main(void)
{
int i;
int j;
i = ident(INT_MAX);
j = i + 1;
if (i < j) fputs("Yes ", stdout);
else fputs("No ", stdout);
if (ident(i) < ident(j)) puts("Yes");
else puts("No");
return 0;
}

The only change between my program and your version of it is that you
replaced the initializers for i and j with separate assignment
statements. The two versions of the program exhibit exactly the same
behavior with -O0, -O1, -O2, and -O3, at least on my system.
You may be surprised by the difference and maybe you will understand
why you got
yes yes

I am unsurprised by the fact that there was no difference. I would
have been only mildly surprised if there had been a difference.
and also why the most important standards document is your man page
and your best friend is gdb.

Ok, I tried running gdb on your version of the program, compiled with
"-O3 -g". The results were not illuminating. It showed me a garbage
value for i, and couldn't even find an object named "j". That's not
too surprising either; debugging optimized code can be difficult.

I also looked at an assembly listing, and saw some things that could
explain the behavior if I took the time to investigate further.

But my point was merely that a program whose behavior is undefined
really can behave in strange, surprising, and even seemingly
inconsistent ways. The fact that INT_MAX + 1 might evaluate to
INT_MIN on most systems doesn't mean you can depend on that, even on
such systems. The details of why it behaves in a particular way were
not relevant either to that point, or to the C language.
As you did with the missing ...

Huh? The missing *what*?

I'm sure you were trying to make some point, but I'm at a loss to
understand what it was. Can you explain more clearly?
 
Flash Gordon

Keith said:
peter koch said:
[...]> But i++ is inherently safe - at least on most common architectures. And
certainly there's nothing complicated going on, it is obvious what
happens.
[...]

No, it's not inherently safe.

int i = INT_MAX;
i++;

The "i++" in the above invokes undefined behavior. Perhaps you're
thinking that it will wrap around to INT_MIN, but the standard doesn't
guarantee that.
Yes. That was why I added the disclaimer "on most common
architectures".

But even on an architecture on which the operation itself is "safe",
an optimizing compiler is free to rearrange the code based on the
assumption that no undefined behavior occurs.

Consider the following program:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

int ident(int i)
{
if (rand() < 0) return 42;
else return i;
}

int main(void)
{
int i = ident(INT_MAX);
int j = i + 1;
if (i < j) fputs("Yes ", stdout);
else fputs("No ", stdout);
if (ident(i) < ident(j)) puts("Yes");
else puts("No");
return 0;
}


The ident() function is designed to return its argument value while
being too difficult for the compiler to analyze.

That's a good one.
When I compile this with "gcc -O0" or "gcc -O1", it prints "No No".
When I compile it with "gcc -O2", it prints "Yes No". When I compile
it with "gcc -O3", it prints "Yes Yes" (I'm not quite sure how that
happens).

As soon as you invoke undefined behavior, all bets are off.

Hmm. I get "No No" with "gcc -O3" on one of my machines but "Yes No" on
another, which just goes to show it varies between versions of compilers
as well (something we already knew could happen).
 
CBFalconer

Keith said:
.... snip ...

But even on an architecture on which the operation itself is
"safe", an optimizing compiler is free to rearrange the code
based on the assumption that no undefined behavior occurs.

Consider the following program:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

int ident(int i)
{
if (rand() < 0) return 42;
else return i;
}

int main(void)
{
int i = ident(INT_MAX); /* added lnos = 13 */
int j = i + 1; /* 14 */
if (i < j) fputs("Yes ", stdout); /* 15 */
else fputs("No ", stdout); /* 16 */
if (ident(i) < ident(j)) puts("Yes"); /* 17 */
else puts("No"); /* 18 */
return 0;
}

The ident() function is designed to return its argument value
while being too difficult for the compiler to analyze.

When I compile this with "gcc -O0" or "gcc -O1", it prints "No
No". When I compile it with "gcc -O2", it prints "Yes No". When
I compile it with "gcc -O3", it prints "Yes Yes" (I'm not quite
sure how that happens).

7.20.2.1 The rand function
Synopsis

[#1]
#include <stdlib.h>
int rand(void);

Description

[#2] The rand function computes a sequence of pseudo-random
integers in the range 0 to RAND_MAX.

[#3] The implementation shall behave as if no library
function calls the rand function.

Note paragraph 2: rand never returns a negative value. So, in line 13,
i is set to INT_MAX. In line 14, overflow occurs and the result is
undefined behaviour. gcc can assume no overflow occurs in
optimization, so it can assume that i < i + 1. I believe you will
find gcc is making different assumptions (or acting on them)
depending on the optimization mode.
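
A small sketch of that reasoning, with function names invented for this illustration. The first function is the comparison under discussion; since it is undefined only for i == INT_MAX, a compiler may treat it as always true. The second asks the same question without any overflow.

```c
#include <limits.h>

/* The comparison from the thread: undefined when i == INT_MAX, so an
   optimizer is allowed to fold the whole expression to the constant 1. */
int less_than_successor(int i)
{
    return i < i + 1;
}

/* Overflow-free version: true exactly when i + 1 is representable. */
int has_successor(int i)
{
    return i < INT_MAX;
}
```

has_successor(INT_MAX) is 0 on every conforming implementation at every optimization level; no such statement can be made about less_than_successor(INT_MAX).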
 
jameskuyper

Rainer said:
Since the C-standard doesn't 'guarantee' anything,

It guarantees a great many things about how code will be processed by
a conforming implementation. It doesn't guarantee that there are any
such implementations, but that's a separate issue.
... that's hardly a
surprise. It's a definition of 'requirements' (for conforming
implementations).

Any requirement imposed on a conforming implementation constitutes a
guarantee that a user of an implementation which claims to be
conforming is entitled to rely upon.
But this digression into yet another back alley
doesn't change the fact that something which is not defined has no
properties known to anyone

The point you keep missing is that in the C standard, "undefined
behavior" is a piece of technical jargon that has been given a
meaning, in the context of the C standard, different from the ordinary
English usage of "behavior that is not defined". In the context of the
C standard, "undefined behavior" means behavior that the C standard
does not define. This includes behavior that IS defined by other
documents, whose properties ARE known by those familiar with those
other documents.
... (And I can provide, provided I am not
getting tired of this earlier, semantically equivalent variants of 'we
know nothing about it' until everyone with at least half a brain has
understood that you are playing dumb on purpose).

Saying that "we know nothing about it", when in fact a great many
people do know something about it, sounds to me like "playing dumb on
purpose".

....
[clippage restored]
That's an ... interesting ... place that you chose to clip my comment.
It makes my comment seem far less relevant than it actually is. If
that was a way of implicitly disagreeing with my assertion that they
are conforming, I'd prefer to see an argument rebutting that
assertion.
... certainly an interesting topic in their own right, but I wasn't
writing about the topic you are wandering into, after mutilating the
text you pretend to be replying to in order to get rid of its content.
In particular, I was writing about "the C language", which is a topic
of some interest to me. Compiler-folklore, be it historical or
mythical, isn't.

Understanding what the C standard does and does not allow a conforming
implementation to do should be of interest to anyone who plans to
write portable C. If you wish to dismiss such discussions as being of
only historical or mythical significance, then that implies that
portability (at least to the real platforms that you dismiss in that
fashion) is not of importance to you. That's fine - you have every
right to decide for yourself what level of portability is important to
you, but please don't denigrate those of us with wider ambitions for
our code.
 
Keith Thompson

CBFalconer said:
7.20.2.1 The rand function
[snip]

Yes, thank you, I know what the rand function does. I assumed that
gcc wouldn't be clever enough to assume that rand() can never return a
negative value. Perhaps it does -- but an assembly listing shows
several calls to rand(), so it's not assuming that ``rand() < 0'' is
false.

My statement "I'm not quite sure how that happens" wasn't intended to
launch a discussion of just what gcc is doing. The point is that the
behavior cannot be predicted based on a knowledge of the C standard.
The odd behavior under "gcc -O3" is nothing more than an example.
Note p. 2. rand never returns a negative value. So, in line 13, i
is set to INT_MAX. In line 14, overflow occurs and the result is
undefined behaviour. gcc can assume no overflow occurs in
optimization, so it can assume that i < i + 1. I believe you will
find gcc is making different assumptions (or acting on them)
depending on the optimization mode.

Of course, that was my point.
 
Harald van Dijk

CBFalconer said:
Keith said:
[...]
if (rand() < 0) return 42;
[...]

7.20.2.1 The rand function
[snip]

Yes, thank you, I know what the rand function does. I assumed that gcc
wouldn't be clever enough to assume that rand() can never return a
negative value. Perhaps it does -- but an assembly listing shows several
calls to rand(), so it's not assuming that ``rand() < 0'' is false.

rand() doesn't just return a value, it also has a visible effect of
skipping one random number. A compiler might legitimately optimise the
above to

rand();

but it cannot omit the call to rand altogether -- at least not unless it
analyses the whole program to verify that any remaining calls to rand are
irrelevant for the program's behaviour.

In other words, the fact that calls to rand() are emitted does not by
itself show one way or the other whether gcc is aware of the fact that it
will never return a negative number.
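
The side effect is easy to observe. In this sketch (helper names are made up for the example) both functions seed the generator identically; one discards the first value and the other keeps it. Because srand with the same seed must reproduce the same sequence, the second values agree, which shows that the discarded call still advanced the generator.

```c
#include <stdlib.h>

/* Seeds the generator, throws away the first value, returns the second. */
int second_value_discarding_first(unsigned seed)
{
    srand(seed);
    rand();          /* result ignored, but the sequence still advances */
    return rand();
}

/* Same sequence, but the first value is handed back through *first. */
int second_value_keeping_first(unsigned seed, int *first)
{
    srand(seed);
    *first = rand();
    return rand();
}
```

If a compiler dropped the bare rand() call in the first function, the two would disagree, which is why the call has to stay even when its result is unused.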
 
Keith Thompson

Harald van Dijk said:
CBFalconer said:
Keith Thompson wrote:
[...]
if (rand() < 0) return 42;
[...]

7.20.2.1 The rand function
[snip]

Yes, thank you, I know what the rand function does. I assumed that gcc
wouldn't be clever enough to assume that rand() can never return a
negative value. Perhaps it does -- but an assembly listing shows several
calls to rand(), so it's not assuming that ``rand() < 0'' is false.

rand() doesn't just return a value, it also has a visible effect of
skipping one random number. A compiler might legitimately optimise the
above to

rand();

but it cannot omit the call to rand altogether -- at least not unless it
analyses the whole program to verify that any remaining calls to rand are
irrelevant for the program's behaviour.

In other words, the fact that calls to rand() are emitted does not by
itself show one way or the other whether gcc is aware of the fact that it
will never return a negative number.

Quite right. Good catch.
 
Mark Wooding

Keith Thompson said:
Consider the following program:

#include <limits.h>
#include <stdlib.h>
#include <stdio.h>

int ident(int i)
{
if (rand() < 0) return 42;
else return i;
}

int main(void)
{
int i = ident(INT_MAX);
int j = i + 1;
if (i < j) fputs("Yes ", stdout);
else fputs("No ", stdout);
if (ident(i) < ident(j)) puts("Yes");
else puts("No");
return 0;
}


The ident() function is designed to return its argument value while
being too difficult for the compiler to analyze.

When I compile this with "gcc -O0" or "gcc -O1", it prints "No No".
When I compile it with "gcc -O2", it prints "Yes No". When I compile
it with "gcc -O3", it prints "Yes Yes" (I'm not quite sure how that
happens).

What's happened is that GCC has decided to inline the calls to `ident'.
It then effectively rewrites the code like this:

int main(void)
{
    int i;

    if (rand() < 0) i = 42;        /* actually done with bit-twiddling */
    else i = INT_MAX;              /* and interspersed with the fwrite call */
    fwrite("Yes ", 1, 4, stdout);  /* it turned fputs into fwrite */

    if (rand() < 0) i = 42;
    if (rand() < 0) puts("No");    /* because !(42 < 42) */
    else puts("Yes");              /* because 42 < INT_MAX + 1 */
    return (0);
}

Somewhat hairy, but justifiable.

You can get the same effect at -O2 if you declare `ident' to be static:
GCC will then consider it worth inlining because it can elide the
standalone copy.
As soon as you invoke undefined behavior, all bets are off.

Definitely.

-- [mdw]
 
Keith Thompson

Golden California Girls said:
Keith said:
Golden California Girls said:
Keith Thompson wrote: [...]
As soon as you invoke undefined behavior, all bets are off.
As you did with the missing ...

Huh? The missing *what*?

I get a no no if I compile your version with gcc -std=c99 -O3
Ok.
I'm sure you were trying to make some point, but I'm at a loss to
understand what it was. Can you explain more clearly?

Yes. Is int i = ident(INT_MAX); valid C89? (the default for gcc)

Yes. Why wouldn't it be? (An initializer for an object with
automatic storage duration needn't be a constant expression, if that's
what you were wondering.)
I actually thought you knew what you had done and were using it as a
subtle example of UB. Or perhaps it is a bug in the optimizer.

Yes, I knew what I was doing and was using it as an example of UB. I
didn't try to analyze in any detail why I was getting the particular
results I was getting, because the details weren't relevant to the
point I was making.

I still have no idea what you meant above by

As you did with the missing ...

If you'd like to explain, fine. If not, I'll just assume it was
meaningless.
 
Keith Thompson

Golden California Girls said:
Not sure because I'm not a standards maven but isn't there something
in the standard about if an implementation says it does something
e.g. two's complement integer math, that it then must do that. Then
I'm wondering if gcc says it does two's complement if it is breaking
the as if rule for the optimizer. And yes, I expect different
results on different compilers and systems, which was your original
point.

Some things are implementation-defined, which means that the
implementation has to document them. For example, C99 3.4.1 defines
"implementation-defined behavior" as "unspecified behavior where each
implementation documents how the choice is made". I think it's
obvious enough that the documentation needs to be truthful.

But none of that applies to the code we're discussing, since its
behavior is undefined.
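
To make the contrast concrete, here is a short sketch with function names invented for the example. Unsigned arithmetic is fully defined by the standard, right-shifting a negative signed value is implementation-defined (C99 6.5.7), and signed overflow, as in the program under discussion, is undefined.

```c
#include <limits.h>

/* Defined behavior: unsigned arithmetic is reduced modulo UINT_MAX + 1,
   so next_unsigned(UINT_MAX) is 0 on every conforming implementation. */
unsigned next_unsigned(unsigned u)
{
    return u + 1u;
}

/* Implementation-defined when n is negative: the implementation must
   document which result it produces. For non-negative n the result is
   simply n / 2. */
int shift_right_one(int n)
{
    return n >> 1;
}
```

For shift_right_one(-8) you consult the implementation's documentation; for INT_MAX + 1 the standard does not require the implementation to document anything.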
 
Guest

I disagree that that is what it implies.


That's just word play and sophistry. "The C Standard does not define
that behaviour" if you prefer.

It guarantees a great many things about how code will be processed by
a conforming implementation. It doesn't guarantee that there are any
such implementations, but that's a separate issue.


Any requirement imposed on a conforming implementation constitutes a
guarantee that a user of an implementation which claims to be
conforming is entitled to rely upon.

No. "Not Defined" means "Not defined in this particular context".

The point you keep missing is that in the C standard, "undefined
behavior" is a piece of technical jargon that has been given a
meaning, in the context of the C standard, different from the ordinary
English usage of "behavior that is not defined".

I disagree. I think this is a pretty standard usage. There is *always*
some implied context when something is not defined.

In the context of the
C standard, "undefined behavior" means behavior that the C standard
does not define. This includes behavior that IS defined by other
documents, whose properties ARE known by those familiar with those
other documents.

yes

<snip>
 
Rainer Weikusat

I disagree that that is what it implies.

And I 'disagree' with the occasional thunderstorm. But for some reason
or another, it keeps raining nevertheless.
 
James Kuyper

No. "Not Defined" means "Not defined in this particular context".



I disagree. I think this is a pretty standard usage. There is *always*
some implied context when something is not defined.

I would say that in ordinary English, if in a given context, something
is not defined, then in that same context, it's reasonable to say that
its properties are unknown. However, in any context that includes the C
standard, there is always also the implementation's documentation;
because the C standard mandates that such documentation exist, and
mandates that it explain certain things, without prohibiting it from
expressing other things. Therefore, you can't say "undefined [behavior]
.... has no properties known to anyone"; it's entirely possible that the
implementation's documentation (which is certainly in-context) does not
provide its own definition for any given thing that , so the strongest
statement you can make along those lines is to replace "has" with
"might have".

Whether or not that is common English usage, his argument seems to
depend upon misinterpreting "undefined behavior" as prohibiting the
existence of a definition of any kind.
 
James Kuyper

Richard said:
James Kuyper said:

so
the strongest statement you can make along those lines is to
replace "has" wit [EF BD 88 0A E2 80 9D EF BD 8D EF BD 89 EF
BD 87 EF BD 88 EF BD 94 E3 80 80 EF BD 88 EF BD 81 EF BD 96 EF BD 85
E2 80 9D EF]

Interesting claim! :)

Sorry about that. The text you're referring to was supposed to say
> replace "has" with "might have".

My newsreader has recently acquired a bug which sometimes causes it to
switch into a mode in which characters are displayed with a different
font while I'm composing a message. I have not yet figured out what
triggers this mode, and nothing I've yet figured out how to do switches
it back to its original mode except exiting the newsreader and starting
over - closing the message composition window is insufficient.

However, the text is still readable in the message composition window,
and also in the saved copy of the message I posted, though the font is
ugly - I'd hoped that the posted message would also be readable.
However, even my own newsreader doesn't know how to display that text as
it appears on the news server - though what it does display is different
from what you saw.
 
