How come this is undefined behavior?

Peter

Section 1.9 ("Program execution") of the C++11 standard introduces the following example of undefined behaviour:

For a function declared as

void f(int, int);

the call:

f(i = -1, i = -1);

yields undefined behaviour.

How come? Of course I know the order in which the arguments of f are evaluated is unspecified, so the result of f(i, i++) would indeed be undefined, but what's wrong with f(i = -1, i = -1)? Why should the order matter here? f doesn't get called until both arguments are evaluated, and both assignment expressions are independent, so what difference does it make whether the first or the second is evaluated first, since both evaluate to -1 anyway?
 
Victor Bazarov

Section 1.9 ("Program execution") of C++11 standard introduces the
following example of undefined behaviour:
for function declared as

void f(int, int);

the call:

f(i = -1, i = -1);

yields undefined behaviour.
How come? Of course I know the order in which arguments of f are
evaluated is unspecified, so result of f(i, i++) would indeed be
undefined, but what's wrong with f(i = -1, i = -1)? Why should the
order matter here? f doesn't get called until both arguments are
evaluated and both assignment expressions are independent, so what
difference does it make if first and then second gets evaluated or
the other way round since both evaluate to -1 anyway?

The rationale for the undefined behaviour in that example is stated in
the sentence directly preceding the example: the side effect is
unsequenced relative to another side effect on the same scalar. Think
about it a bit. The fact that both assignments lead to the same
side effect (-1 is assigned to 'i') is beside the point: the compiler is
not required to be *that* clever.

V
 
Victor Bazarov

I think his question was why that still was undefined since both
modifications (of the same object between sequence points) were to the
same value. And I think the answer to that is simply that there's no
such rule for the special case. And while the example seems fairly
obvious, a general purpose rule would likely be impossible. Consider
"f(i=g(), i=h());" - would that be valid if you can prove that g() and
h() will return the same value?

The "anyway" at the end of the original "how come" paragraph suggests
that the question was *not* about the modifications' being to the same
value. Re-read it if you please. The comparison is given to f(i,i++)
with the emphasis that in f(i=a,i=b) "both assignment expressions are
independent". They aren't, in fact, independent because they have the
same side effect - changing of the value of 'i'. So, the premise is
incorrect, and the conclusion that follows is also incorrect. That's all.

V
 
James Kanze

Section 1.9 ("Program execution") of C++11 standard introduces
the following example of undefined behaviour:
for function declared as
void f(int, int);
the call:
f(i = -1, i = -1);
yields undefined behaviour.
How come? Of course I know the order in which arguments of
f are evaluated is unspecified, so result of f(i, i++) would
indeed be undefined, but what's wrong with f(i = -1, i = -1)?
Why should the order matter here? f doesn't get called until
both arguments are evaluated and both assignment expressions
are independent, so what difference does it make if first and
then second gets evaluated or the other way round since both
evaluate to -1 anyway?

I was told, once (so take it with a grain of salt) that there
could be scheduling issues: that writes could be programmed to
overlap (so that in "a = 1; b = 2;", the compiler could schedule
the write to b before the write to a was finished), and that
overlapping writes to the same address caused the processor to
lock. I'll admit that I'm a bit dubious. (I have worked in an
environment where writes could be overlapped, and issuing
a second write instruction before the first was finished would
cause problems. But that was in microcode, and it didn't matter
whether the writes were to the same address or not.)

Otherwise the issue is clearly with things like
`f( *p = -1, *q = -1 )`. If the compiler sees code like that,
it can deduce that p and q do not alias the same variable, and
so elsewhere in the function, given:

*p = 42;
int i = *q;

it may reorder the two operations.
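
One way to sidestep the problem described above is to give each store its own full-expression, so that the writes are sequenced even when the two pointers happen to alias. The following is a minimal sketch and not something posted in the thread: `store_then_load` is a hypothetical name, and `f` is an empty stand-in for the function in the question.

```cpp
// Stand-in for the f(int, int) from the question; the body is irrelevant here.
void f(int, int) {}

// Hypothetical helper: writes through p and q in separate statements, so the
// two stores are sequenced and the code is well defined even if p == q.
int store_then_load(int* p, int* q) {
    *p = -1;      // first full-expression
    *q = -1;      // second full-expression, sequenced after the first
    f(*p, *q);    // the arguments only read; nothing is modified here
    *p = 42;
    return *q;    // 42 when p and q alias, -1 when they do not
}
```

Because no rule is broken when `p == q`, the compiler has no licence here to assume the pointers are distinct, and the final load has to observe the store of 42 in the aliasing case.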
 
James Kanze

Most definitely.
A comma operator introduces a sequence point so there is nothing undefined
here as far as I can see.

Sequence points only create a partial ordering. If we label the
sub-expressions:

   a   b       c     d   e       f
f((42, i = -1, 42), (42, i = -1, 42));

The comma operators create the following ordering constraints
(where < means "is ordered before"):

a < b
b < c
d < e
e < f

The transitivity of the relationship adds a < c and d < f, but
there is nothing which forces b < e or e < b; they are unordered
with respect to each other, and since both modify i, the code
has undefined behavior.
What about the following variation?
f((i = -1, 42), (42, i = -1));

Do the same analysis as above. There's no ordering between the
two "i = -1", so the behavior is undefined.
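
The ordering argument for the first, labelled call can also be checked mechanically. The sketch below is only an illustration and does not come from the thread: it encodes the four constraints created by the comma operators, takes their transitive closure, and asks whether two labelled sub-expressions end up ordered. The names `Label`, `Relation`, `closure`, `ordered`, and `comma_constraints` are invented for this example.

```cpp
#include <array>
#include <cstddef>
#include <utility>
#include <vector>

// Labels for the six sub-expressions in
//   f((42, i = -1, 42), (42, i = -1, 42));
enum Label { A, B, C, D, E, F, LabelCount };

using Relation = std::array<std::array<bool, LabelCount>, LabelCount>;

// Build the "is ordered before" relation from the direct constraints and
// close it under transitivity (Warshall's algorithm).
Relation closure(const std::vector<std::pair<Label, Label>>& constraints) {
    Relation r{};  // every entry starts out false
    for (const auto& c : constraints) r[c.first][c.second] = true;
    for (std::size_t k = 0; k < LabelCount; ++k)
        for (std::size_t i = 0; i < LabelCount; ++i)
            for (std::size_t j = 0; j < LabelCount; ++j)
                if (r[i][k] && r[k][j]) r[i][j] = true;
    return r;
}

// True if x and y are ordered one way or the other.
bool ordered(const Relation& r, Label x, Label y) {
    return r[x][y] || r[y][x];
}

// The constraints created by the comma operators: a < b, b < c, d < e, e < f.
Relation comma_constraints() {
    return closure({{A, B}, {B, C}, {D, E}, {E, F}});
}
```

With `r = comma_constraints()`, `ordered(r, A, C)` holds through transitivity, while `ordered(r, B, E)` does not: the two modifications of i stay unordered with respect to each other.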
 
Roberto Waltman

Peter said:
...
the call:
f(i = -1, i = -1);
yields undefined behaviour.
How come? ...

It seems to me that your question and some of the replies assume that
'i = -1' is an atomic operation. Although with modern compilers and
CPU architectures it is most likely to be implemented as an atomic
assignment, that is not necessarily the case.

A compiler could choose, for example, to implement
i = -1
as
(i = 0, --i)

This would make the original example clearly undefined.
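
To see why such a two-step implementation would matter, the interleavings can be simulated without invoking any undefined behaviour. The sketch below is purely a model of the hypothetical `(i = 0, --i)` expansion suggested above, with the steps of the two assignments allowed to interleave; it makes no claim about what any real compiler emits, and `possible_final_values` is a name invented for this example.

```cpp
#include <algorithm>
#include <set>
#include <vector>

// Simulates two assignments "i = -1", each carried out as the two steps
// "i = 0" then "--i", and collects the final value of i over every
// interleaving that keeps each assignment's own steps in order.
std::set<int> possible_final_values() {
    // Each entry names the assignment (0 or 1) that performs its next step.
    // The vector starts sorted so std::next_permutation visits every order.
    std::vector<int> schedule = {0, 0, 1, 1};
    std::set<int> results;
    do {
        int i = 7;                     // arbitrary starting value
        int steps_done[2] = {0, 0};
        for (int who : schedule) {
            if (steps_done[who] == 0) {
                i = 0;                 // first step: store 0
            } else {
                --i;                   // second step: decrement
            }
            ++steps_done[who];
        }
        results.insert(i);
    } while (std::next_permutation(schedule.begin(), schedule.end()));
    return results;
}
```

Under this model the returned set contains both -1 and -2, so even two assignments of the same value can leave i holding something other than -1 once their steps are allowed to overlap.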
 
pilarp

I think his question was why that still was undefined since both
modifications (of the same object between sequence points) were to the
same value. And I think the answer to that is simply that there's no
such rule for the special case. And while the example seems fairly
obvious, a general purpose rule would likely be impossible. Consider
"f(i=g(), i=h());" - would that be valid if you can prove that g() and
h() will return the same value?

It wouldn't be valid and here's a short complete counterexample:

#include <iostream>

int n;

int g()
{
    if(n > 0)
        ++n;
    return 1;
}

int h()
{
    ++n;
    return 1;
}

void f(int a, int b) {}

int main()
{
    int i;
    f(i=g(), i=h());
    std::cout << n << std::endl;
    return 0;
}

The above will print either 1 or 2 depending on whether g() or h() is called first. The fact that either case leads to f(i=1, i=1) is irrelevant because the call made above is really of the form:

f(i=first_function(), i=second_function());

and the side effects of first_function() and second_function() are what make the call above unsafe.

What about the simple case I originally gave (the right side of the assignment operator being a literal)? Even though the order of evaluation is still unspecified, the only side effect is i being modified, so is there any imaginable scenario in which a call such as:

f(i=LITERAL_1, i=LITERAL_2); // LITERAL_1 not necessarily = LITERAL_2

or more generally:

f(i=expr1, i=expr2);

where expr1 and expr2 are expressions with no side effects

can result in different "visible effects" depending on whether the first or the second argument is evaluated first? Are such calls considered unsafe purely by convention, or can they indeed be dangerous (perhaps due to some compiler tricks beyond our control)?
 
James Kanze

It wouldn't be valid and here's a short complete
[... irrelevant example deleted]

It wouldn't be valid because the standard says it is undefined
behavior. Just what is it that you don't understand about
"undefined"?

In this case, where the aliasing is obvious, the compiler could
insert an illegal instruction in the code. (At least one does
in some cases of undefined behavior.) More likely, if instead
of i in both cases, you had *p and *q, and p and q pointed to
the same place, the compiler would deduce that there was no
aliasing between p and q. But all of this is really irrelevant.
The standard says that it is undefined behavior, and that's the
end of it. Speculating as to what an implementation might do is
just that: speculation.
 
James Kanze

It seems to me that your question and some of the replies assume that
'i = -1' is an atomic operation.

It seems to me that his question (and a lot of the replies,
including yours) assume that undefined doesn't mean undefined.
(With regards to the OP: I understand his question to have been
more along the lines of the motivation for the committee to have
made this undefined.)

I have already posted the reason that I was given as to why it
is undefined, which was given to me by a member of the C
standards committee back when the C standard was being
formulated (late 1980's). I have my doubts about the validity
of that reason, even then (but I can conceive of a hardware
implementation where it would have been valid). The fact
remains that the standard says that this is undefined behavior,
and a compiler is allowed to assume that the program doesn't do
it, and make conclusions from this fact.
 
Gerhard Fiedler

It wouldn't be valid and here's a short complete counterexample:

#include <iostream>

int n;

int g()
{
    if(n > 0)
        ++n;
    return 1;
}

int h()
{
    ++n;
    return 1;
}

void f(int a, int b) {}

int main()
{
    int i;
    f(i=g(), i=h());
    std::cout << n << std::endl;
    return 0;
}


The above will print either 1 or 2 depending on whether g() or h()
will be called first. The fact that either case leads to f(i=1, i=1)
is irrelevant because the call made above is really of the form:

As James says, this example is irrelevant for your question. You can
strip out your particular issue, it still suffers from the same problem,
and that's just how C++ is: it doesn't prevent you from shooting
yourself in the foot.

// Same code as above

int main()
{
    f(g(), h());
    std::cout << n << std::endl;
    return 0;
}

Same problem. Don't work with side effects, but that wasn't your
question.

Gerhard
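
If the order of the two calls does matter, it can be pinned down by evaluating them in separate statements before the call. The following is a minimal sketch that reuses g, h, and n from the example in the thread; `call_g_then_h` and `call_h_then_g` are hypothetical names added for illustration, and `n` is reset at the start of each helper only so that the two can be compared.

```cpp
int n = 0;

int g()
{
    if (n > 0)
        ++n;
    return 1;
}

int h()
{
    ++n;
    return 1;
}

void f(int, int) {}

// g() is called first: each initialisation is its own full-expression, so the
// order no longer depends on how the arguments of f are evaluated.
int call_g_then_h()
{
    n = 0;
    const int a = g();  // n is still 0 here, so g() leaves it alone
    const int b = h();  // n becomes 1
    f(a, b);
    return n;
}

// The opposite order, spelled out just as explicitly.
int call_h_then_g()
{
    n = 0;
    const int b = h();  // n becomes 1
    const int a = g();  // n > 0, so n becomes 2
    f(a, b);
    return n;
}
```

Here `call_g_then_h()` returns 1 and `call_h_then_g()` returns 2 on every conforming implementation, because the choice of order is now made in the source rather than left to the compiler.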
 
