Order of evaluation of function arguments

dragoncoder

Consider the following code.

#include <stdio.h>

int main()
{
int i =1;
printf("%d ,%d ,%d\n",i,++i,i++);
return 0;
}

Is the above code well defined? If yes, what is the output? If no, why?

Thanks.
 
Michael Mair

dragoncoder said:
Consider the following code.

#include <stdio.h>

int main()
{
int i =1;
printf("%d ,%d ,%d\n",i,++i,i++);
return 0;
}

Is the above code well defined? If yes, what is the output? If no, why?

This seems to be a homework question. If yes, please state so
and show us your best shot at answering the question.
If no, convince us.

Otherwise: Read the comp.lang.c FAQ. Apart from that being the
obvious and polite thing to do _before_ posting for the first
time, it should help you answer these questions.

You're welcome.

Cheers
Michael
 
dragoncoder

This is not a homework question. I just wanted to make it short. I
understand that modifying the value more than once between 2 sequence
points invokes undefined behaviour as mentioned in
http://www.eskimo.com/~scs/C-faq/q3.2.html, but the scenario here is
different, I believe.

I think the comma (,) operator is a sequence point and therefore the
sequence point requirement of UB is not satisfied. I feel it has
something to do with the order of evaluation of arguments.

Please point me in the right direction.
 
Flash Gordon

dragoncoder said:
Consider the following code.

#include <stdio.h>

int main()
{
int i =1;
printf("%d ,%d ,%d\n",i,++i,i++);
return 0;
}

Is the above code well defined? If yes, what is the output? If no, why?

This is a FAQ, or at least a variation on it.
http://www.eskimo.com/~scs/C-faq/q3.2.html
The comma separating arguments is *not* a sequence point, and modifying
a variable twice between sequence points is undefined behaviour.
Please check the FAQ for a group and search the group before posting
questions.
 
Christopher Benson-Manica

dragoncoder said:
I think the comma (,) operator is a sequence point

It is. The comma that separates arguments passed to a function is not
the comma operator.
 
dragoncoder

There are 2 things here.

1. Modifying the value of a variable twice between two consecutive
sequence points is undefined behaviour. As in

int i = 0;
int j = i++ * i++; // UB

Talking about the example in the FAQ, the statement printf("%d\n", i++
* i++); is UB because the expression i++ * i++ itself is undefined.
So far so good.

2. The below code is well-defined.

int i = 0;
int j = i++;

The expression i++ is well defined, and hence all the arguments in the
call printf("%d ,%d ,%d\n",i,++i,i++); are well defined. I feel this is
UB because the order of evaluation of function arguments is undefined.
Aren't these 2 cases different? Or am I missing something? Please help
me.

Thanks in advance.
 
Keith Thompson

dragoncoder said:
This is not a homework question. I just wanted to make it short. I
understand that modifying the value more than once between 2 sequence
points invokes undefined behaviour as mentioned in
http://www.eskimo.com/~scs/C-faq/q3.2.html, but the scenario here is
different, I believe.

I think the comma (,) operator is a sequence point and therefore the
sequence point requirement of UB is not satisfied. I feel it has
something to do with the order of evaluation of arguments.

The statement in question was:

printf("%d ,%d ,%d\n",i,++i,i++);

There is no comma operator in that statement. The commas here are
syntactic markers that separate the function arguments.

Something like this would be well-defined:

printf("%d", (++i,i++));

And please read <http://cfaj.freeshell.org/google/>.
 
Jordan Abel

It is. The comma that separates arguments passed to a function is not
the comma operator.

no, but it is the function-argument comma - which makes it merely an
unspecified order, not undefined behavior, IIRC
 
Jordan Abel

The expression i++ is well defined, and hence all the arguments in the
call printf("%d ,%d ,%d\n",i,++i,i++); are well defined. I feel this is
UB because the order of evaluation of function arguments is undefined.

The order is unspecified. If it were undefined, it would be UB to call
any function with multiple arguments, as foo(a,b) would invoke undefined
behavior. Your printf line cannot appear in a strictly conforming
program because its output depends on unspecified behavior.
 
Keith Thompson

Jordan Abel said:
no, but it is the function-argument comma - which makes it merely an
unspecified order, not undefined behavior, IIRC

No, it's undefined behavior.

These:
func(f(), g());
f() * g();
both call f() and g() in some unspecified order, but don't invoke
undefined behavior.

These:
func(i++, i++);
i++ * i++;
both invoke undefined behavior because they both modify i twice
between sequence points.
 
Flash Gordon

Jordan said:
no, but it is the function-argument comma - which makes it merely an
unspecified order, not undefined behavior, IIRC

The example was:
printf("%d ,%d ,%d\n",i,++i,i++);

The order of evaluation is unspecified, but that has nothing to do with
it. It is undefined behaviour because i is modified twice between sequence
points. The unspecified behaviour would become important if you did:
printf("%d ,%d\n",a(),b());
because you would not know which of a and b would be executed first.
 
Flash Gordon

Jordan said:
The order is unspecified.

Correct.

If it were undefined, it would be UB to call
any function with multiple arguments, as foo(a,b) would invoke undefined
behavior. Your printf line cannot appear in a strictly conforming
program because its output depends on unspecified behavior.

Incorrect. It is a clear case of undefined behaviour because i is
modified twice without an intervening sequence point.
 
Chris Torek

[by itself,] The expression i++ is well defined

Yes: it schedules an increment of "i", which must occur by
the next sequence point, and produces as its value the value
of "i" before said increment.

(And for completeness, "++i" schedules an increment of i, which
also will occur by the next sequence point, but produces as its
value the incremented value, without necessarily reading "i" back;
i.e., a line like:

result = ++i;

could compile to machine code of the form:

tmp = i + 1;
i = tmp;
result = tmp;

or:

result = i + 1;
i = result;

or:

i = i + 1;
result = i;

all of which do the same thing as long as "i" is not a "volatile"
object that actually changes underneath. If "i" is somehow mapped
to a hardware register, for instance, the last version may produce
a value completely different from the other two.

The difference between "i++" and "++i" is not in the scheduling of
the increment -- although that may be one method by which a compiler
accomplishes the required difference -- but rather in the value
produced. The postfix operator produces the old, un-incremented
value; the prefix operator produces the new, incremented value.
The compiler-writer can pick any method of achieving this, as long
as the result is correct and the increment happens by the next
sequence point.)

... and hence all the arguments in the
call printf("%d ,%d ,%d\n",i,++i,i++); are well defined.

The individual arguments may be, but the expression overall is
not.

I feel this is UB because the order of evaluation of function
arguments is undefined.

This is not quite right. According to the Standard:

[#10] The order of evaluation of the function designator,
the arguments, and subexpressions within the arguments is
unspecified, but there is a sequence point before the actual
call.

Consider the following:

(fexpr)(arg1, arg2, (expr1, arg3))

Here "fexpr" is an expression that evaluates to a pointer to a
function (of three arguments). (For convenience, let me call the
actual function "f()".) The arg1, arg2, and arg3 expressions are
the values that will be passed for those three arguments. The
expression expr1 is also evaluated and its result discarded (because
the comma operator produces, as its value, the value of its right-hand
operand).

The third argument contains a comma-expression, but the three
arguments themselves are not separated by comma-expressions. So
where are the sequence points? Well, there is one produced by the
comma-expression, and one produced by paragraph "#10" above.
Sequence points establish a time order, so we know that "expr1" is
evaluated before the sequence point and "arg3" after it; and we
know that there is another sequence point before we enter function
f() (the function that fexpr points to).

Here is the really tricky part. It is tempting to assume that, no
matter what order is actually chosen, there is *some* order for
evaluating "fexpr", "arg1", and "arg2". If arg1 is done first,
then arg2, we get Outcome A; if arg2 is done first, then arg1, we
get Outcome B. But the C standard does not require that one be
"done first" at all. In particular, Standard C allows what is
called "interleaved evaluation", in which arg1 is partially
evaluated, then some parts of arg2 are evaluated, then more parts
of arg1 are evaluated.

Suppose we actually get interleaved evaluation, and suppose arg1
and arg2 contain things like "i++". For instance, suppose we have
something like this:

int i, j, x, y;
int f(int, int, int), g(void), h(void);
int (*fp)(int, int, int) = f;
... fill in some variables here as needed ...

(rand()==0 ? nullop : fp)(j = i++, x = ++y, (g(), h()));

The compiler *could* do these strictly left-to-right, or strictly
right-to-left, except for any embedded sequence points. But it
could also do "interleaved" evaluation. Here is one possible
interleaving:

"hmm, x = ++y, I'll schedule an increment of y and set x to y + 1,
like so..."

add Y, 1, X # Y is in a register, so we just need
# one machine instruction: x = y+1

"ooh, I need to call rand, too, and there's a sequence point
before the call, so I think I'll finish my increment y; oh look
I already have y+1 in register x, how handy"

mov X, Y
call rand

"ok, now R1 holds the result of rand, better make sure I
don't wipe that out, what else do we need to do ... hm, we
have to call g, let's do that now"

push R1 # or move it to a callee-save register
call g

"and I need to call h after g after a sequence point, but
I don't have anything I have to sequence and R1 is still
saved so let's take care of h and put arg3 into argument
register 3"

call h
mov R1, A3

"let's see, I still have to do j = i++ and pick which
function to call... i is in memory, not a register, so I have
to load that, j is not used after the call so I'll just
put j in argument register 1 now."

ld A1, [i] # get old value of i from RAM

"now pick which function, and finish setting up arguments;
need X, which is equal to Y, as argument 2"

pop R1 # get rand() value back
mov nullop, R2 # start with r2 = nullop
bnz,a R1, 1f # if r1!=0,
ld [fp], R2 # set r2 = fp
1: mov Y, A2 # put X (which is also Y) into arg2

"now for the call ... which is a sequence point, so better
finish incrementing i. the old value is still in A1"

add A1, 1, R1 # compute i+1 into R1
st R1, [i] # store R1 back into RAM

"whew, all done, call the function"

call R2

Note that the actual times of the increments were almost random,
and there was no single "evaluation order" for any of the arguments.
Moreover, this was just one of many possible ways to evaluate the
function call.

If the arguments had been something like:

(fexpr)(++i, i++, arg3)

we could have scheduled *two* increments for i, without an intervening
sequence point, simply by (partially or completely) evaluating the
first argument ("schedule increment for i, producing incremented
value") and also (partially or completely) evaluating the second
argument ("schedule increment for i, producing non-incremented
value").

The Standard says that, if you attempt to modify an object more
than once without an intervening sequence point, the effect is
undefined. Hence "unspecified behavior" (evaluation of function
arguments) *can* lead to "undefined behavior".

What actually happens depends on the machine and the compiler, and
possibly on the optimization flags and so on. What this means to
you, as a C programmer, is: "don't do that." You have no control
over what actually happens. If you separate it out:

int a1, a2;
...
a1 = ++i;
a2 = i++;
(fexpr)(a1, a2, arg3);

then *you* control the action, by putting in those sequence points.
You know what must happen; you can predict the result.
 
Martin Ambuhl

dragoncoder said:
I think the comma (,) operator is a sequence point and therefore the
sequence point requirement of UB is not satisfied. I feel it has
something to do with the order of evaluation of arguments.

There is no comma operator in
printf("%d ,%d ,%d\n",i,++i,i++);

Please point me in the right direction.

Do you have an elementary text on C?
 
Niklas Norrthon

dragoncoder said:
Consider the following code.

#include <stdio.h>

int main()
{
int i =1;
printf("%d ,%d ,%d\n",i,++i,i++);
return 0;
}

Is the above code well defined? If yes, what is the output? If no, why?

Read the FAQ!

If the FAQ is not your cup of tea, at least browse the old posts of
this group; this question, and a similar one involving i = i++ + ++i,
come up several times every week.

Why do so many people want to know this? Does it matter? When would
such a construct be useful even if it were well defined? And why is
it so hard to find out without asking here?

/Niklas Norrthon
 
CoL

dragoncoder said:
Consider the following code.

#include <stdio.h>

int main()
{
int i =1;
printf("%d ,%d ,%d\n",i,++i,i++);
return 0;
}

Is the above code well defined? If yes, what is the output? If no, why?

Thanks.

I can see a lot of confusion on this topic. To make it clear, the sequence
points defined by the language are:
Function call
Statement termination by semicolon (;)
and the logical operators (|| and &&...)

Secondly, the question asked about the order of evaluation. The C language
nowhere specifies the order of evaluation of function arguments; it is
totally compiler and platform specific. So good programming practice
discourages any code that depends on the order of evaluation of
function arguments.

Regards,
Apoorv
 
Flash Gordon

CoL said:
I can see a lot of confusion on this topic. To make it clear, the sequence
points defined by the language are:

You didn't get them all. The one that causes confusion is that the comma
operator is also a sequence point, but the comma separating parameters
is *not* a comma operator and does not provide a sequence point.

There are other sequence points as well, I'm not going to enumerate them
all because I would miss something ;-)

Secondly, the question asked about the order of evaluation. The C language
nowhere specifies the order of evaluation of function arguments; it is
totally compiler and platform specific. So good programming practice
discourages any code that depends on the order of evaluation of
function arguments.

That is all true.
 
Keith Thompson

Niklas Norrthon said:
Read the FAQ!

The FAQ doesn't directly address this. Question 3.2 discusses
i++ * ++i, but it doesn't explicitly mention modifying the same
variable twice in different arguments of the same function call.
Combine this with the common misconception that the commas separating
arguments are comma operators, and the confusion is understandable.

It would be good for question 3.2 to mention function arguments. On
the other hand, I see that <http://www.eskimo.com/~scs/C-faq/q3.2.html>
celebrated its tenth birthday two months ago (that's ten years since
the page has been modified).
 
