> [by itself,] The expression i++ is well defined
Yes: it schedules an increment of "i", which must occur by
the next sequence point, and produces as its value the value
of "i" before said increment.
(And for completeness, "++i" schedules an increment of i, which
also will occur by the next sequence point, but produces as its
value the incremented value -- without necessarily reading "i" back,
i.e., a line like:
    result = ++i;
could compile to machine code of the form:
    tmp = i + 1;
    i = tmp;
    result = tmp;
or:
    result = i + 1;
    i = result;
or:
    i = i + 1;
    result = i;
all of which do the same thing as long as "i" is not a "volatile"
object that actually changes underneath. If "i" is somehow mapped
to a hardware register, for instance, the last version may produce
a value completely different from the other two.
The difference between "i++" and "++i" is not in the scheduling of
the increment -- although that may be one method by which a compiler
accomplishes the required difference -- but rather in the value
produced. The postfix operator produces the old, un-incremented
value; the prefix operator produces the new, incremented value.
The compiler-writer can pick any method of achieving this, as long
as the result is correct and the increment happens by the next
sequence point.)
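To make the value difference concrete, here is a minimal sketch. The helper names postfix_value and prefix_value are mine, purely for illustration. Each helper performs exactly one increment per full expression, so nothing here depends on how the compiler schedules the store.

```c
/* Hypothetical helpers, not part of the original discussion. */

/* Produces the value of "i++": the old, un-incremented value. */
int postfix_value(int *ip)
{
    int result = (*ip)++;   /* the increment is complete by the end of this statement */
    return result;
}

/* Produces the value of "++i": the new, incremented value. */
int prefix_value(int *ip)
{
    int result = ++(*ip);
    return result;
}
```

Starting from i == 5, postfix_value(&i) yields 5 and leaves i at 6; a following prefix_value(&i) yields 7 and leaves i at 7.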
> ... and hence all the arguments in the call
> printf("%d ,%d ,%d\n",i,++i,i++); are well defined.
The individual arguments may be, but the expression overall is
not.
> I feel this is UB because the order of evaluation of function
> arguments is undefined.
This is not quite right. According to the Standard:
    [#10] The order of evaluation of the function designator,
    the arguments, and subexpressions within the arguments is
    unspecified, but there is a sequence point before the actual
    call.
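As a rough illustration of what "unspecified" means in practice, the sketch below logs which argument expression runs first. The names note, add2, order_log, and demo_unspecified_order are all made up for this example. Because each argument is a complete function call, each call to note() runs to completion on its own, so the program is well defined; only the order of the two log entries is left to the compiler.

```c
/* Hypothetical logging helpers, for illustration only. */
static char order_log[8];
static int order_len = 0;

/* Records a one-character tag, then hands back the value unchanged. */
static int note(char tag, int value)
{
    order_log[order_len++] = tag;
    order_log[order_len] = '\0';
    return value;
}

static int add2(int a, int b)
{
    return a + b;
}

/* The result is always 3, but the log may read "AB" or "BA". */
int demo_unspecified_order(void)
{
    order_len = 0;
    order_log[0] = '\0';
    return add2(note('A', 1), note('B', 2));
}
```

A conforming compiler may produce either log, and may even choose differently at different optimization levels; code that needs one particular order has to impose it with separate statements.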
Consider the following:
    (fexpr)(arg1, arg2, (expr1, arg3))
Here "fexpr" is an expression that evaluates to a pointer to a
function (of three arguments). (For convenience, let me call the
actual function "f()".) The arg1, arg2, and arg3 expressions are
the values that will be passed for those three arguments. The
expression expr1 is also evaluated and its result discarded (because
the comma operator produces, as its value, the value of its right-hand
operand).
The third argument contains a comma-expression, but the three
arguments themselves are not separated by comma-expressions. So
where are the sequence points? Well, there is one produced by the
comma-expression, and one produced by paragraph "#10" above.
Sequence points establish a time order, so we know that "expr1" is
evaluated before the sequence point and "arg3" after it; and we
know that there is another sequence point before we enter function
f() (the function that fexpr points to).
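A small sketch of the one ordering that *is* guaranteed here: the comma operator inside the third argument. The names counter, sum3, and demo_comma_argument are invented for the example. The other two arguments deliberately do not touch "counter", so the only ordering that matters is the one the comma operator supplies.

```c
/* Hypothetical names, for illustration only. */
static int counter = 0;

static int sum3(int a, int b, int c)
{
    return a + b + c;
}

int demo_comma_argument(void)
{
    counter = 0;
    /*
     * Third argument is a comma-expression: the assignment to counter
     * is complete (sequence point) before "counter + 1" is evaluated,
     * so the third argument is always 11.
     */
    return sum3(1, 2, (counter = 10, counter + 1));
}
```

Here the call always computes 1 + 2 + 11. If the first or second argument also read or modified "counter", the comma operator would not help, since it orders only its own two operands.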
Here is the really tricky part. It is tempting to assume that, no
matter what order is actually chosen, there is *some* order for
evaluating "fexpr", "arg1", and "arg2". If arg1 is done first,
then arg2, we get Outcome A; if arg2 is done first, then arg1, we
get Outcome B. But the C standard does not require that one be
"done first" at all. In particular, Standard C allows what is
called "interleaved evaluation", in which arg1 is partially
evaluated, then some parts of arg2 are evaluated, then more parts
of arg1 are evaluated.
Suppose we actually get interleaved evaluation, and suppose arg1
and arg2 contain things like "i++". For instance, suppose we have
something like this:
    int i, j, x, y;
    int f(int, int, int), nullop(int, int, int), g(void), h(void);
    int (*fp)(int, int, int) = f;
    ... fill in some variables here as needed ...
    (rand()==0 ? nullop : fp)(j = i++, x = ++y, (g(), h()));
The compiler *could* do these strictly left-to-right, or strictly
right-to-left, except for any embedded sequence points. But it
could also do "interleaved" evaluation. Here is one possible
interleaving:
    "hmm, x = ++y, i'll schedule an increment of y and set x to y + 1,
    like so..."
        add     Y, 1, X         # Y is in a register, so we just need
                                # one machine instruction: x = y+1
    "ooh, I need to call rand, too, and there's a sequence point
    before the call, so I think I'll finish my increment of y; oh look
    I already have y+1 in register x, how handy"
        mov     X, Y
        call    rand
    "ok, now R1 holds the result of rand, better make sure I
    don't wipe that out, what else do we need to do ... hm, we
    have to call g, let's do that now"
        push    R1              # or move it to a callee-save register
        call    g
    "and I need to call h after g after a sequence point, but
    I don't have anything I have to sequence and R1 is still
    saved so let's take care of h and put arg3 into argument
    register 3"
        call    h
        mov     R1, A3
    "let's see, I still have to do j = i++ and pick which
    function to call... i is in memory, not a register, so I have
    to load that, j is not used after the call so I'll just
    put j in argument register 1 now."
        ld      [i], A1         # get old value of i from RAM
    "now pick which function, and finish setting up arguments;
    need X, which is equal to Y, as argument 2"
        pop     R1              # get rand() value back
        mov     nullop, R2      # start with r2 = nullop
        bnz,a   R1, 1f          # if r1!=0,
        ld      [fp], R2        #   set r2 = fp
    1:  mov     Y, A2           # put X (which is also Y) into arg2
    "now for the call ... which is a sequence point, so better
    finish incrementing i. the old value is still in A1"
        add     A1, 1, R1       # compute i+1 into R1
        st      R1, [i]         # store R1 back into RAM
    "whew, all done, call the function"
        call    R2
Note that the actual times of the increments were almost random,
and there was no single "evaluation order" for any of the arguments.
Moreover, this was just one of many possible ways to evaluate the
function call.
If the arguments had been something like:
    (fexpr)(++i, i++, arg3)
we could have scheduled *two* increments for i, without an intervening
sequence point, simply by (partially or completely) evaluating the
first argument ("schedule increment for i, producing incremented
value") and also (partially or completely) evaluating the second
argument ("schedule increment for i, producing non-incremented
value").
The Standard says that, if you attempt to modify an object more
than once without an intervening sequence point, the effect is
undefined. Hence "unspecified behavior" (evaluation of function
arguments) *can* lead to "undefined behavior".
What actually happens depends on the machine and the compiler, and
possibly on the optimization flags and so on. What this means to
you, as a C programmer, is: "don't do that." You have no control
over what actually happens. If you separate it out:
    int a1, a2;
    ...
    a1 = ++i;
    a2 = i++;
    (fexpr)(a1, a2, arg3);
then *you* control the action, by putting in those sequence points.
You know what must happen; you can predict the result.
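For instance, here is the separated version as a small self-contained sketch. The stand-in function f, the seen_a1 and seen_a2 variables, and the call_separated wrapper are my own scaffolding, and the third argument is just a placeholder 0. Each increment sits in its own statement, so the values passed are fully determined by the starting value of i.

```c
/* Hypothetical stand-in for the function that fexpr points to. */
static int seen_a1, seen_a2;

static int f(int a1, int a2, int a3)
{
    seen_a1 = a1;
    seen_a2 = a2;
    return a3;
}

/* Takes a starting value for i and returns the final value of i. */
int call_separated(int i)
{
    int a1, a2;

    a1 = ++i;               /* i is now start+1; a1 holds start+1 */
    a2 = i++;               /* a2 holds start+1; i is now start+2 */
    (void) f(a1, a2, 0);    /* every increment finished before this call */
    return i;
}
```

With i starting at 5, f() receives 6 and 6, and i ends up as 7, on every conforming compiler and at every optimization level.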