Does the statement given below invoke undefined behavior?
i = (i, i++, i) + 1;
I am almost convinced that it does not because of the following
reasons
Ok, I've been poring over the latest draft, which takes a better stab at
all of this. I still don't really know the answer, but here's more
stuff according to that draft. (I stared at C99 today trying to coax
the real answer out of it, but I was just getting unhappy with the model
which didn't seem to want to spit it out. C09 is more complex with some
abstractions that I think help clarify these issues.)
Comments welcomed on everything I've written here. Please be aware that
this is my understanding of a document that I just looked at for the
first time today and is in no way necessarily correct or definitive.
1> the RHS must be evaluated before a value can be stored in i
It's a little bit nuanced, because "evaluation" is two things:
# Evaluation of an expression in general includes both value
# computations and initiation of side effects. [5.1.2.3p2]
Note that it's "initiation", not "resolution", of side effects (which
need not be complete until the next sequence point).
With respect to expressions:
# The value computations of the operands of an operator are sequenced
# before the value computation of the result of the operator. [6.5p1]
So, yes, the value computation must be done before the assignment, but
not necessarily the resolution of side effects.
In terms of sequencing of operations:
# Given any two evaluations A and B, if A is sequenced before B, then
# the execution of A shall precede the execution of B. [...] If A is
# not sequenced before or after B, then A and B are unsequenced.
# [5.1.2.3p3]
Remember, we're talking about "evaluations", which does not necessarily
include resolution of side effects.
And how this relates to expressions (this is *the* paragraph that lays
down the law):
# If a side effect on a scalar object is unsequenced relative to either
# a different side effect on the same scalar object or a value
# computation using the value of the same scalar object, the behavior is
# undefined. [6.5p2]
So back to the example:
i = (i, i++, i) + 1;
We have two side effects: the one in the assignment and the one in the
++. The question is, are they sequenced?
Well, we know that the value computations of the operands to + are
sequenced before the value computation of the result of +. So the value
of 1 and the value of (i,i++,i) are computed before the result of + is.
What of the comma operator?
# The left operand of a comma operator is evaluated as a void
# expression; there is a sequence point between its evaluation and that
# of the right operand. Then the right operand is evaluated; the result
# has its type and value. [6.5.17p2]
What is a sequence point?
# The presence of a sequence point between the evaluation of expressions
# A and B implies that every value computation and side effect
# associated with A is sequenced before every value computation and side
# effect associated with B. [5.1.2.3p3]
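As a small illustration of that sequence point (a minimal sketch; the
function name comma_sequenced is made up for this example and appears
nowhere in the draft), the increment below has to be complete before the
right operand of the comma reads i:

```c
/* Hypothetical helper, not from the draft: shows the comma operator's
 * sequence point forcing the side effect of i++ to finish before i is
 * read by the right operand. */
int comma_sequenced(int i)
{
    /* Left operand:  i++  (value discarded, side effect kept).
     * Sequence point.
     * Right operand: i    (read after the increment is complete). */
    int j = (i++, i);
    return j;
}
```

Called with 5, this should return 6. As far as I can tell that holds
under C99 as well as the draft, since the comma operator has always had
a sequence point.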
So now we get our forced sequencing of side effects, as well. With the
expression (i,i++,i), the side effect of i++ must be complete before the
value of the expression (namely i) can be computed. And the value of
the expression must be computed before it can subsequently be used by +.
And +'s value must be computed before the assignment can occur:
# The side effect of updating the stored value of the left operand is
# sequenced after the value computations of the left and right operands.
# [6.5.16p3]
Working backward:
o For the assignment side effect to occur, the value computations of
  both operands of the assignment must be complete.
o For the value computations on the right side of the assignment to be
  complete, the value computations of the + operator's operands have
  to be complete.
o For the value computation of (i,i++,i) to be complete, i++'s side
  effects must be complete.
And so, I think, the side effect of i++ is sequenced before the side
effect of i=, and so in this case is not undefined behavior.
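To make that ordering concrete, here is the statement unrolled into
separate statements. This is only a sketch of the reading argued above,
not something the draft spells out; the name expanded_comma_assign and
the temporary rhs are invented for the example.

```c
/* Hypothetical expansion of  i = (i, i++, i) + 1;  into separate
 * statements, following the ordering argued above. */
int expanded_comma_assign(int i)
{
    int rhs;

    (void)i;        /* first comma operand: value discarded    */
    i++;            /* second comma operand: side effect on i  */
    rhs = i;        /* third comma operand: read after the i++ */
    rhs = rhs + 1;  /* value computation of +                  */
    i = rhs;        /* assignment side effect comes last       */
    return i;
}
```

If the reading is right, starting from i == 5 the one-line statement
should leave i at 7, which is what this expansion returns.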
Some counter cases:
i = i++;
While the sequence of value computations is defined for i=i++, the
side effects are unsequenced, and so it is undefined behavior.
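One way to see the problem is to write out two orders in which the two
stores could land, as ordinary well-defined statements. The function
names are made up for illustration, and since the real expression is
undefined, an implementation is not limited to these two outcomes.

```c
/* Two hypothetical orderings of the side effects in  i = i++;
 * written as well-defined statements.  The real expression is
 * undefined, so nothing restricts a compiler to either of these. */

/* The increment lands first, then the assignment stores the old value. */
int increment_then_assign(int i)
{
    int old = i;    /* value of i++ is the old value of i */
    i = old + 1;    /* side effect of ++                  */
    i = old;        /* side effect of =                   */
    return i;
}

/* The assignment lands first, then the increment's store. */
int assign_then_increment(int i)
{
    int old = i;    /* value of i++ is the old value of i */
    i = old;        /* side effect of =                   */
    i = old + 1;    /* side effect of ++                  */
    return i;
}
```

Starting from 5, the first ordering leaves 5 and the second leaves 6,
so even the "reasonable" outcomes disagree.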
    |----- A ----|   |----- B ----|
k = (i, i /= 3, i) + (i, i *= 5, i); // "please...kill me..."
In this case, the value computations of both subexpressions A and B
must be complete before +, and therefore, by the argument above, the
side effects of i/=3 and i*=5 must also be complete before the +.
It follows that those side effects are also complete before the
result of + is finally assigned into k.
However, the two subexpressions A and B are unsequenced relative to
one another and both modify the same object, and so the behavior is
undefined.
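A sketch of why the missing order between A and B matters: the two
functions below (names invented for this example) run A and B to
completion in each of the two possible orders, using plain statements.
Unsequenced evaluations could also interleave, and undefined behavior
permits anything at all, so these are just two plausible results.

```c
/* Hypothetical: subexpression A (i /= 3) runs to completion, then B. */
int a_before_b(int i)
{
    int a, b;

    i /= 3; a = i;  /* value of (i, i /= 3, i) */
    i *= 5; b = i;  /* value of (i, i *= 5, i) */
    return a + b;   /* what would be stored in k */
}

/* Hypothetical: subexpression B (i *= 5) runs to completion, then A. */
int b_before_a(int i)
{
    int a, b;

    i *= 5; b = i;  /* value of (i, i *= 5, i) */
    i /= 3; a = i;  /* value of (i, i /= 3, i) */
    return a + b;   /* what would be stored in k */
}
```

With i starting at 9, the first order gives k == 18 and the second
gives k == 60.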
Do I believe it myself? I don't even know anymore.
What do you think, folks?
-Beej
(Remember: this analysis is based on the draft, not the Standard. I'm
just presuming they're going to try to keep it basically compatible.)