Evaluation order of assignment statement

Discussion in 'C Programming' started by nachch@gmail.com, Oct 23, 2006.

  1. Guest

    Does the C specification define the order of evaluation of assignment
    statements?

    For example, what should be the output from the following:

    #include <stdio.h>

    int foo1(void) { printf("foo1\n"); return 0; }
    int foo2(void) { printf("foo2\n"); return 0; }
    int foo3(void) { printf("foo3\n"); return 0; }

    int main(void)
    {
        int array[1];
        array[foo1()] = foo2() + foo3();
        return 0;
    }

    I'm asking this question since I'm getting conflicting results on
    different compilers, and want to understand whether this is a compiler
    bug or not.

    gcc prints: foo1, foo2, foo3.
    Microsoft Visual C prints: foo2, foo3, foo1.

    Thanks.
     
    , Oct 23, 2006
    #1

  2. John Smith Guest

    wrote:
    > Does the C specification define the order of evaluation of assignment
    > statements?
    >
    > For example, what should be the output from the following:
    >
    > int foo1() { printf("foo1\n"); return 0; }
    > int foo2() { printf("foo2\n"); return 0; }
    > int foo3() { printf("foo3\n"); return 0; }
    >
    > int main()
    > {
    > int array[1];
    > array[foo1()] = foo2() + foo3();


    It's because the order in which foo1(), foo2() and foo3() are
    called is unspecified. There's nothing wrong with the compilers.

    There's no way to tell which of the three functions is gonna
    get called first.

    > }
    >
    > I'm asking this question since I'm getting conflicting results on
    > different compilers, and want to understand whether this is a compiler
    > bug or not.
    >
    > gcc prints: foo1, foo2, foo3.
    > Microsoft Visual C prints: foo2, foo3, foo1.


    John
    --
    John Smith
     
    John Smith, Oct 23, 2006
    #2

  3. In article <>,
    <> wrote:
    >Does the C specification define the order of evaluation of assignment
    >statements?


    >For example, what should be the output from the following:


    >int foo1() { printf("foo1\n"); return 0; }
    >int foo2() { printf("foo2\n"); return 0; }
    >int foo3() { printf("foo3\n"); return 0; }


    >int main()
    >{
    > int array[1];
    > array[foo1()] = foo2() + foo3();
    >}


    That isn't really a question about the order of evaluation of
    assignment statements: it's really a question about the order of
    evaluation of the components of an expression.

    For the most part, the answer is NO.

    The full answer involves complex rules about "sequence points", and
    there are continual arguments about what the rules mean in obscure
    circumstances. Even seasoned pros don't always agree (at least
    for the first dozen flames) about the implications of sequence points
    in some unusual expressions.
    --
    "It is important to remember that when it comes to law, computers
    never make copies, only human beings make copies. Computers are given
    commands, not permission. Only people can be given permission."
    -- Brad Templeton
     
    Walter Roberson, Oct 23, 2006
    #3
  4. wrote:
    > Does the C specification define the order of evaluation of assignment
    > statements?


    No. The rest follows.

    --
    Best regards,
    Andrey Tarasevich
     
    Andrey Tarasevich, Oct 23, 2006
    #4
  5. Guest

    Thank you all, I just wanted to make sure it really was not defined in
    any specification (that's what I originally thought).

    This came up since I saw a weird bug, where a function on the RHS of an
    assignment had side effects that incremented a variable used as a
    subscript index on the LHS...

    --
    Nachch
     
    , Oct 23, 2006
    #5
  6. In article <>,
    <> wrote:
    >Thank you all, I just wanted to make sure it really was not defined in
    >any specification (that's what I originally thought).


    You should include enough context so that people can follow the
    discussion without having to try to locate previous postings.

    Your question was about the order of evaluation, especially for
    assignment statements.


    >This came up since I saw a weird bug, where a function on the RHS of an
    >assignment had side effects that incremented a variable used as a
    >subscript index on the LHS...


    If I recall correctly, that situation is well defined. Each function call
    has a sequence point surrounding it -- it is the order of those sequence
    points relative to those of other function calls that is not defined. But because
    there is a sequence point for the function call, any side effects
    of the function call are considered to be finalized before evaluation
    of the left hand side (if my brain hasn't frotzed this up yet again.)

    Of course, well-defined code is not necessarily even close to
    "readable and maintainable" code! I'd want to have very good reasons
    before coding anything like that myself!
    --
    I was very young in those days, but I was also rather dim.
    -- Christopher Priest
     
    Walter Roberson, Oct 23, 2006
    #6
  7. Guest

    Walter Roberson wrote:
    > In article <>,
    > <> wrote:
    > >Thank you all, I just wanted to make sure it really was not defined in
    > >any specification (that's what I originally thought).

    >
    > You should include enough context so that people can follow the
    > discussion without having to try to locate previous postings.
    >
    > Your question was about the order of evaluation, especially for
    > assignment statements.
    >
    >
    > >This came up since I saw a weird bug, where a function on the RHS of an
    > >assignment had side effects that incremented a variable used as a
    > >subscript index on the LHS...

    >
    > If I recall correctly, that situation is well defined. Each function call
    > has a sequence point surrounding it -- it is the order of those sequence
    > points relative to those of other function calls that is not defined. But because
    > there is a sequence point for the function call, any side effects
    > of the function call are considered to be finalized before evaluation
    > of the left hand side (if my brain hasn't frotzed this up yet again.)
    >
    > Of course, well-defined code is not necessarily even close to
    > "readable and maintainable" code! I'd want to have very good reasons
    > before coding anything like that myself!
    > --
    > I was very young in those days, but I was also rather dim.
    > -- Christopher Priest


    [Sorry about the lost context, I'm using the Google client, so I just
    see a posting thread]

    Anyway, regarding this specific issue: I'm getting different results
    with different compilers.
    In my original post I used function calls that printed some output
    because that was what I used to help me visualize the timeline of
    execution.

    My problem is actually more like this:

    int g_index = 0; // global

    int bar() { g_index++; return 17; }

    int foo()
    {
        int array[10];
        array[g_index] = bar();
    }

    Using gcc, I get 17 written in array[0].
    Using Microsoft Visual C compiler, I get 17 written in array[1].

    So, gcc must have evaluated the LHS first into a memory position, then
    evaluated the RHS, and finally made the assignment.
    MSVC on the other hand evaluated RHS first.

    Is *this* thing defined in the C specification? (*this* as opposed to
    the order of function evaluation, which I understand is not defined).

    Thank you!
    --
    Nachch
     
    , Oct 23, 2006
    #7
  8. Guest

    wrote:
    > [Sorry about the lost context, I'm using the Google client, so I just
    > see a posting thread]



    Right, but not everyone else is.


    > Anyway, regarding this specific issue: I'm getting different results
    > with different compilers.
    > In my original post I used function calls that printed some output
    > because that was what I used to help me visualize the timeline of
    > execution.
    >
    > My problem is actually more like this:
    >
    > int g_index = 0; // global
    >
    > int bar() { g_index++; return 17; }
    >
    > int foo()
    > {
    > int array[10];
    > array[g_index] = bar();
    > }
    >
    > Using gcc, I get 17 written in array[0].
    > Using Microsoft Visual C compiler, I get 17 written in array[1].
    >
    > So, gcc must have evaluated the LHS first into a memory position, then
    > evaluated the RHS, and finally made the assignment.
    > MSVC on the other hand evaluated RHS first.
    >
    > Is *this* thing defined in the C specification? (*this* as opposed to
    > the order of function evaluation, which I understand is not defined).



    Short answer: no. There is no sequence point separating the evaluation
    of the array subscript expression and the function call, so there's
    no defined order.

    In a slightly more complex case, "array[bar2()] = bar();" you could see
    either bar or bar2 get called first, but it is guaranteed that the one
    that gets called first is fully complete, and all side effects
    evaluated, before the second is called (or any of its parameters
    evaluated - although there aren't any in this example).

    In your example, you probably want to do something like:

    t = bar();
    array[g_index] = t;

    or:

    t = g_index;
    array[t] = bar();


    depending on what you actually want to happen.
     
    , Oct 23, 2006
    #8
  9. In article <>,
    <> wrote:

    >My problem is actually more like this:


    >int g_index = 0; // global


    >int bar() { g_index++; return 17; }


    >int foo()
    >{
    > int array[10];
    > array[g_index] = bar();
    >}


    >Is *this* thing defined in the C specification? (*this* as opposed to
    >the order of function evaluation, which I understand is not defined).


    There are subtle nuances to sequence points that I don't think I
    am clear on myself. Puzzling through the C89 wording, I -think- the
    above is not valid.

    There is a sequence point before the actual call to bar(), which
    I understand to mean that g_index would not yet have been computed
    (according to the abstract semantics.) However, at -this- level
    there is no sequence point between the start of the call to bar and
    the end of the assignment. An object may be modified only once
    between the previous sequence point and the next, and g_index is
    modified only once by the call, so that part itself is okay. However,
    the previous value of an object may be accessed only to determine
    the value to store, and that's being violated because the
    value of g_index has to be accessed inside bar() in order to determine
    the new value to store -and- the value of g_index needs to be accessed
    in order to determine the subscript to use. So if I understand
    correctly, that is two kinds of accesses where only one is
    permitted.

    But if my understanding is correct, then the following would work,
    and I -suspect- it won't:

    int g_index = 0; /* global */
    int bar() { g_index++; return 17; }
    int copyarg( int inval ) { return inval; }
    void foo(void) {
    int array[10];
    array[g_index] = copyarg( bar() );
    }

    The idea here being that with the extra layer of function call,
    there is a sequence point before the call to copyarg() so the
    side effects of bar() would be finalized before copyarg() was called,
    and hence the implication would be that the g_index in the subscript
    should get the side-effect'd value of g_index. But it doesn't sound
    right that putting in an extra layer of call could make right
    a side effect.
    --
    All is vanity. -- Ecclesiastes
     
    Walter Roberson, Oct 23, 2006
    #9
  10. Chris Torek Guest

    >In article <>,
    > <> wrote:

    [with vertical compression by me]
    >>int g_index = 0; // global
    >>int bar() { g_index++; return 17; }
    >>int foo() {
    >> int array[10];
    >> array[g_index] = bar();
    >>}


    >>Is *this* thing defined in the C specification? (*this* as opposed to
    >>the order of function evaluation, which I understand is not defined).


    The precise answer is that it is "unspecified".

    In article <ehjf5u$644$>
    Walter Roberson wrote:
    >There are subtle nuances to sequence points that I don't think I
    >am clear on myself. Puzzling through the C89 wording, I -think- the
    >above is not valid.


    Depends what you mean by "valid".

    The sequence points that surround the call to, and return from,
    function bar(), guarantee that g_index is 0 before, and 1 after,
    the call to bar(). (Assuming of course that it has not been altered
    before this.) In addition, we can (I believe) be sure that in the
    left-hand sub-expression "array[g_index]", g_index is evaluated
    either entirely before, or entirely after, the call to bar(). It
    will therefore be either 0 or 1.

    Unfortunately, there is nothing that says *which* will occur. This
    is not even "implementation-defined". If it were, the programmer
    could read the documentation that comes with the compiler, and find
    out whether array[g_index] will be array[0] or array[1], and know
    the answer for that particular compiler (perhaps "that compiler with
    specific flags", since there might be a compiler switch to choose
    one or the other). But it is "unspecified", meaning the compiler
    does not even have to tell you how it chooses when to evaluate
    g_index (i.e., before or after the call to bar()).

    That, in turn, means the compiler can make this choice based on
    the phase of the moon, the temperature of the CPU, or any other
    hard-to-predict item. You can be sure of "zero or one", but not
    which.

    The way to force the desired order of evaluation -- whatever that
    is -- is to capture g_index with a sequence point that is ordered
    with respect to the function-call sequence point. For instance:

    i = g_index; /* capture value before the call */
    array[i] = bar(); /* index with old value */

    or:

    i = bar(); /* do the call */
    array[g_index] = i; /* index with new value */
    --
    In-Real-Life: Chris Torek, Wind River Systems
    Salt Lake City, UT, USA (40°39.22'N, 111°50.29'W) +1 801 277 2603
    email: forget about it http://web.torek.net/torek/index.html
    Reading email is like searching for food in the garbage, thanks to spammers.
     
    Chris Torek, Oct 24, 2006
    #10
  11. In article <>,
    Chris Torek <> wrote:

    >The sequence points that surround the call to, and return from,
    >function bar(), guarantee that g_index is 0 before, and 1 after,
    >the call to bar(). (Assuming of course that it has not been altered
    >before this.)


    In the C89 standard, I saw the mandatory sequence point just
    before the call to functions, but I could not find any mandatory
    sequence points after function calls -- at least not sequence
    points "at the same level" (so to speak.)

    When I refer to sequence points "at the same level", I am supposing
    that the sequence points are per expression (or per statement), and
    there can, in essence, be "suspended" sequence points -- in
    contrast to a model in which the sequence points within an
    expression evaluation are all "global" sequence points in which
    *all* possible finalization (at all nesting levels) must occur.

    I am not certain at the moment how to distinguish the two models,
    but I'll throw something out and perhaps someone will understand
    the difference and know the answer:

    The C89 standard indicates that if a signal or exception occurs,
    the values of objects (including auto objects with block scope)
    are determined as of the previous sequence point, and that
    any modification (or volatile access) that might be "in progress"
    has an uncertain state as of the time of the signal or exception.

    Suppose I have a block with an auto variable X, and inside that block
    there is a sequence point (so we have finalized the auto object),
    then an expression that has a call to a routine. Suppose that routine
    in turn has a block with an auto Y, and the routine has completed
    a sequence point (so Y has been finalized), and suppose the routine
    is in the middle of an expression, and that a signal or
    exception occurs.

    If sequence points can be "suspended", then the state of X may be
    indeterminate at the time of the signal or exception, because we
    are between the sequence points in the outer routine. But if
    sequence points are in some sense "global", then the sequence point
    in the inner routine is a full sequence point for the purposes
    of determining whether X has a determinate value or not.

    ??
    --
    All is vanity. -- Ecclesiastes
     
    Walter Roberson, Oct 24, 2006
    #11
