Evaluation order of assignment statement

nachch

Does the C specification define the order of evaluation of assignment
statements?

For example, what should be the output from the following:

#include <stdio.h>

int foo1() { printf("foo1\n"); return 0; }
int foo2() { printf("foo2\n"); return 0; }
int foo3() { printf("foo3\n"); return 0; }

int main()
{
    int array[1];
    array[foo1()] = foo2() + foo3();
}

I'm asking this question since I'm getting conflicting results on
different compilers, and want to understand whether this is a compiler
bug or not.

gcc prints: foo1, foo2, foo3.
Microsoft Visual C prints: foo2, foo3, foo1.

Thanks.
 
John Smith

nachch said:
array[foo1()] = foo2() + foo3();

I'm asking this question since I'm getting conflicting results on
different compilers, and want to understand whether this is a compiler
bug or not.

gcc prints: foo1, foo2, foo3.
Microsoft Visual C prints: foo2, foo3, foo1.

It's because the order in which foo1(), foo2() and foo3() are
called is unspecified. There's nothing wrong with the
compilers.

There's no way to tell which of the three functions is going
to be called first.

John
 
Walter Roberson

Does the C specification define the order of evaluation of assignment
statements?
For example, what should be the output from the following:
int foo1() { printf("foo1\n"); return 0; }
int foo2() { printf("foo2\n"); return 0; }
int foo3() { printf("foo3\n"); return 0; }
int main()
{
int array[1];
array[foo1()] = foo2() + foo3();
}

That isn't really a question about the order of evaluation of
assignment statements: it's really a question about the order of
evaluation of the components of an expression.

For the most part, the answer is NO.

The full answer involves complex rules about "sequence points", and
there are continual arguments about what the rules mean in obscure
circumstances. Even seasoned pros don't always agree (at least
for the first dozen flames) about the implications of sequence points
in some unusual expressions.
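
If a particular call order matters, one option is to give each call its own statement, because the end of every full expression is a sequence point. The following is only a sketch: foo1, foo2 and foo3 mirror the functions in the question, while the `call_log` buffer and the `ordered_assignment` wrapper are hypothetical helpers added here so that the order can be inspected.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical log of the call order, added for illustration only. */
static char call_log[8];

static int foo1(void) { strcat(call_log, "1"); return 0; }
static int foo2(void) { strcat(call_log, "2"); return 0; }
static int foo3(void) { strcat(call_log, "3"); return 0; }

/* Each call sits in its own statement, so the calls happen in
   source order: foo1, then foo2, then foo3. */
static const char *ordered_assignment(void)
{
    int array[1];
    int index, lhs, rhs;

    call_log[0] = '\0';
    index = foo1();
    lhs = foo2();
    rhs = foo3();
    array[index] = lhs + rhs;

    printf("order: %s, stored: %d\n", call_log, array[index]);
    return call_log;
}
```

Calling ordered_assignment() from main should report the same order on any conforming compiler, at the cost of three extra temporaries.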
 
nachch

Thank you all, I just wanted to make sure it really was not defined in
any specification (that's what I originally thought).

This came up since I saw a weird bug, where a function on the RHS of an
assignment had side effects that incremented a variable used as a
subscript index on the LHS...
 
Walter Roberson

Thank you all, I just wanted to make sure it really was not defined in
any specification (that's what I originally thought).

You should include enough context so that people can follow the
discussion without having to try to locate previous postings.

Your question was about the order of evaluation, especially for
assignment statements.

This came up since I saw a weird bug, where a function on the RHS of an
assignment had side effects that incremented a variable used as a
subscript index on the LHS...

If I recall correctly, that situation is well defined. Each function call
has a sequence point surrounding it -- it is the order of those sequence
points relative to those of other function calls that is not defined. But
because there is a sequence point for the function call, any side effects
of the function call are considered to be finalized before evaluation
of the left hand side (if my brain hasn't frotzed this up yet again.)

Of course, well-defined code is not necessarily even close to
"readable and maintainable" code! I'd want to have very good reasons
before coding anything like that myself!
 
nachch

Walter said:
You should include enough context so that people can follow the
discussion without having to try to locate previous postings.

Your question was about the order of evaluation, especially for
assignment statements.



If I recall correctly, that situation is well defined. Each function call
has a sequence point surrounding it -- it is the order of those sequence
points relative to those of other function calls that is not defined. But
because there is a sequence point for the function call, any side effects
of the function call are considered to be finalized before evaluation
of the left hand side (if my brain hasn't frotzed this up yet again.)

Of course, well-defined code is not necessarily even close to
"readable and maintainable" code! I'd want to have very good reasons
before coding anything like that myself!

[Sorry about the lost context, I'm using the Google client, so I just
see a posting thread]

Anyway, regarding this specific issue: I'm getting different results
with different compilers.
In my original post I used function calls that printed some output
because that was what I used to help me visualize the timeline of
execution.

My problem is actually more like this:

int g_index = 0; // global

int bar() { g_index++; return 17; }

int foo()
{
int array[10];
array[g_index] = bar();
}

Using gcc, I get 17 written in array[0].
Using Microsoft Visual C compiler, I get 17 written in array[1].

So, gcc must have evaluated the LHS first into a memory position, then
evaluated the RHS, and finally made the assignment.
MSVC on the other hand evaluated RHS first.

Is *this* thing defined in the C specification? (*this* as opposed to
the order of function evaluation, which I understand is not defined).

Thank you!
 
robertwessel2

[Sorry about the lost context, I'm using the Google client, so I just
see a posting thread]


Right, but not everyone else is.

Anyway, regarding this specific issue: I'm getting different results
with different compilers.
In my original post I used function calls that printed some output
because that was what I used to help me visualize the timeline of
execution.

My problem is actually more like this:

int g_index = 0; // global

int bar() { g_index++; return 17; }

int foo()
{
int array[10];
array[g_index] = bar();
}

Using gcc, I get 17 written in array[0].
Using Microsoft Visual C compiler, I get 17 written in array[1].

So, gcc must have evaluated the LHS first into a memory position, then
evaluated the RHS, and finally made the assignment.
MSVC on the other hand evaluated RHS first.

Is *this* thing defined in the C specification? (*this* as opposed to
the order of function evaluation, which I understand is not defined).


Short answer: no. There is no sequence point separating the evaluation
of the array subscript expression and the function call, so there's
no defined order.

In a slightly more complex case, "array[bar2()] = bar();" you could see
either bar or bar2 get called first, but it is guaranteed that the one
that gets called first is fully complete, and all side effects
evaluated, before the second is called (or any of its parameters
evaluated - although there aren't any in this example).
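
That guarantee can be observed with a small experiment. This is only a sketch under assumptions: the thread never shows bar2, so its body here is made up, and the `trace` buffer and `run_trace` wrapper are hypothetical helpers. Each function records its entry and its exit, so the trace shows which one ran first, and the two are never interleaved.

```c
#include <string.h>

static char trace[16];   /* hypothetical trace buffer, illustration only */
static int g_index = 0;

static int bar(void)
{
    strcat(trace, "(bar");
    g_index++;
    strcat(trace, ")");
    return 17;
}

/* Assumed body: the thread does not show what bar2 does. */
static int bar2(void)
{
    strcat(trace, "(bar2");
    strcat(trace, ")");
    return 0;
}

/* Runs the assignment once and returns the recorded trace. */
static const char *run_trace(void)
{
    int array[10] = {0};

    trace[0] = '\0';
    array[bar2()] = bar();
    (void)array;          /* the stored value is not needed here */
    return trace;
}
```

Either call may come first, so a check on the result has to accept both complete orderings rather than one fixed string.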

In your example, you probably want to do something like:

t = bar();
array[g_index] = t;

or:

t = g_index;
array[t] = bar();


depending on what you actually want to happen.
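
Spelled out in full, the two alternatives might look like the sketch below. g_index and bar are taken from the question; the function names `store_at_new_index` and `store_at_old_index`, and passing the array in as a parameter, are assumptions made here so that each variant can be tried separately.

```c
static int g_index = 0;

static int bar(void) { g_index++; return 17; }

/* Call bar() first, then index with the updated g_index. */
static void store_at_new_index(int *array)
{
    int t = bar();
    array[g_index] = t;
}

/* Capture the index first, then call bar(). */
static void store_at_old_index(int *array)
{
    int t = g_index;
    array[t] = bar();
}
```

With g_index starting at 0, the first variant writes 17 to array[1] and the second writes it to array[0], whichever compiler is used.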
 
Walter Roberson

My problem is actually more like this:
int g_index = 0; // global
int bar() { g_index++; return 17; }
int foo()
{
int array[10];
array[g_index] = bar();
}
Is *this* thing defined in the C specification? (*this* as opposed to
the order of function evaluation, which I understand is not defined).

There are subtle nuances to sequence points that I don't think I
am clear on myself. Puzzling through the C89 wording, I -think- the
above is not valid.

There is a sequence point before the actual call to bar(), which
I understand to mean that g_index would not yet have been computed
(according to the abstract semantics.) However, at -this- level
there is no sequence point between the start of the call to bar and
the end of the assignment. An object may be modified only once
between the previous sequence point and the next, and g_index is
modified only once by the call, so that part itself is okay. However,
the previous value of an object may be accessed only to determine
the value to store, and that's being violated because the
value of g_index has to be accessed inside bar() in order to determine
the new value to store -and- the value of g_index needs to be accessed
in order to determine the subscript to use. So if I understand
correctly, that is two kinds of accesses where only one is
permitted.

But if my understanding is correct, then the following would work,
and I -suspect- it won't:

int g_index = 0; /* global */
int bar() { g_index++; return 17; }
int copyarg( int inval ) { return inval; }
void foo(void) {
int array[10];
array[g_index] = copyarg( bar() );
}

The idea here being that with the extra layer of function call,
there is a sequence point before the call to copyarg() so the
side effects of bar() would be finalized before copyarg() was called,
and hence the implication would be that the g_index in the subscript
should get the side-effect'd value of g_index . But it doesn't sound
right that putting in an extra layer of call could make right
a side effect.
 
Chris Torek

[with vertical compression by me]
int g_index = 0; // global
int bar() { g_index++; return 17; }
int foo() {
int array[10];
array[g_index] = bar();
}
Is *this* thing defined in the C specification? (*this* as opposed to
the order of function evaluation, which I understand is not defined).

The precise answer is that it is "unspecified".

There are subtle nuances to sequence points that I don't think I
am clear on myself. Puzzling through the C89 wording, I -think- the
above is not valid.

Depends what you mean by "valid".

The sequence points that surround the call to, and return from,
function bar(), guarantee that g_index is 0 before, and 1 after,
the call to bar(). (Assuming of course that it has not been altered
before this.) In addition, we can (I believe) be sure that in the
left-hand sub-expression "array[g_index]", g_index is evaluated
either entirely before, or entirely after, the call to bar(). It
will therefore be either 0 or 1.

Unfortunately, there is nothing that says *which* will occur. This
is not even "implementation-defined". If it were, the programmer
could read the documentation that comes with the compiler, and find
out whether array[g_index] will be array[0] or array[1], and know
the answer for that particular compiler (perhaps "that compiler with
specific flags", since there might be a compiler switch to choose
one or the other). But it is "unspecified", meaning the compiler
does not even have to tell you how it chooses when to evaluate
g_index (i.e., before or after the call to bar()).

That, in turn, means the compiler can make this choice based on
the phase of the moon, the temperature of the CPU, or any other
hard-to-predict item. You can be sure of "zero or one", but not
which.
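
A portable check can therefore only test for "zero or one". The sketch below assumes the reading given above, namely that the subscript is evaluated either entirely before or entirely after the call; the `which_slot` helper is hypothetical and exists only to report which element received the value.

```c
static int g_index = 0;

static int bar(void) { g_index++; return 17; }

/* Returns the subscript that actually received 17, or -1 if
   neither array[0] nor array[1] did. */
static int which_slot(void)
{
    int array[10] = {0};

    g_index = 0;
    array[g_index] = bar();   /* subscript is 0 or 1, unspecified which */

    if (array[0] == 17)
        return 0;
    if (array[1] == 17)
        return 1;
    return -1;
}
```

Code that needs one specific answer should not depend on the result; it only shows which choice a given build happened to make.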

The way to force the desired order of evaluation -- whatever that
is -- is to capture g_index with a sequence point that is ordered
with respect to the function-call sequence point. For instance:

i = g_index; /* capture value before the call */
array[i] = bar(); /* index with old value */

or:

i = bar(); /* do the call */
array[g_index] = i; /* index with new value */
 
Walter Roberson

Chris Torek said:
The sequence points that surround the call to, and return from,
function bar(), guarantee that g_index is 0 before, and 1 after,
the call to bar(). (Assuming of course that it has not been altered
before this.)

In the C89 standard, I saw the mandatory sequence point just
before the call to functions, but I could not find any mandatory
sequence points after function calls -- at least not sequence
points "at the same level" (so to speak.)

When I refer to sequence points "at the same level", I am supposing
that the sequence points are per expression (or per statement), and
there can, in essence, be "suspended" sequence points -- in
contrast to a model in which the sequence points within an
expression evaluation are all "global" sequence points in which
*all* possible finalization (at all nesting levels) must occur.

I am not certain at the moment how to distinguish the two models,
but I'll throw something out and perhaps someone will understand
the difference and know the answer:

The C89 standard indicates that if a signal or exception occurs,
the values of objects (including auto objects with block scope)
are determined as of the previous sequence point, and that
any modification (or volatile access) that might be "in progress"
has an uncertain state as of the time of the signal or exception.

Suppose I have a block with an auto variable X, and inside that block
there is a sequence point (so we have finalized the auto object),
then an expression that has a call to a routine. Suppose that routine
in turn has a block with an auto Y, and the routine has completed
a sequence point (so Y has been finalized), and suppose the routine
is in the middle of an expression, and that a signal or
exception occurs.

If sequence points can be "suspended", then the state of X may be
indeterminate at the time of the signal or exception, because we
are between the sequence points in the outer routine. But if
sequence points are in some sense "global", then the sequence point
in the inner routine is a full sequence point for the purposes
of determining whether X has a determinate value or not.

??
 
