Is a[i] = i++ correct?

Discussion in 'C Programming' started by jeniffer, Dec 27, 2007.

  1. jeniffer

    jeniffer Guest

    Hi

    I want to know why is a[i] = i++ ; wrong? People say that it is
    because of different parsing during compilation. Please explain
    technically why it is wrong/behaviour undefined?

    Regards,
    Jeniffer
     
    jeniffer, Dec 27, 2007
    #1

  2. jeniffer <> writes:
    > I want to know why is a[i] = i++ ; wrong? People say that it is
    > because of different parsing during compilation. Please explain
    > technically why it is wrong/behaviour undefined?


    This is question 3.1 in the comp.lang.c FAQ, <http://www.c-faq.com>.
    A number of other questions in section 3 address this point.

    It's not a matter of "different parsing"; there's no syntactic
    ambiguity. The ambiguity is semantic.

    n1256 6.5p2 says:

    Between the previous and next sequence points an object shall have
    its stored value modified at most once by the evaluation of an
    expression. Furthermore, the prior value shall be read only to
    determine the value to be stored.

    The use of the word "shall" outside a constraint means that any
    violation invokes undefined behavior. In ``a[i] = i++;'' the value of
    i is read (that's ok) and modified (ok), and the previous value is
    read to determine the value to be stored in i (ok) -- but the value is
    *also* read to determine which element of the array to modify
    (kaboom!).
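
    For contrast, here is a minimal sketch of statements that do satisfy
    the quoted rule. The function name conforming_examples is made up for
    illustration and does not come from the standard or the FAQ.

```c
/* Illustrative only: the name conforming_examples is hypothetical.
   Every statement below obeys 6.5p2: i is modified at most once between
   sequence points, and where a statement modifies i, the old value is
   read only to compute the new one. */
int conforming_examples(int a[], int i)
{
    i = i + 1;   /* old i is read only to compute the value stored in i */
    a[i] = i;    /* i is read twice but never modified */
    i++;         /* i is modified once; nothing else reads it here */
    return i;
}
```

    Called with i == 1, this stores 2 in a[2] and returns 3. Each
    statement ends in a sequence point, so no read of i competes with a
    modification of i.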

    Even without the above rule, the standard doesn't specify the order of
    evaluation; the determination of which array element to modify could
    occur either before or after i is incremented. If that were the only
    issue, then if i==2 before the statement is executed, it could modify
    either a[2] or a[3]. But 6.5p2 says (more or less) that whenever
    such an ambiguity occurs, the results aren't limited to differing
    orders of evaluation; *anything* can happen. The point of all this is
    to allow for more aggressive optimization; if the compiler doesn't
    need to worry about consistent results for ambiguous expressions, it
    can generate better code for unambiguous expressions.

    --
    Keith Thompson (The_Other_Keith) <>
    [...]
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Dec 27, 2007
    #2

  3. jeniffer said:

    > Hi
    >
    > I want to know why is a[i] = i++ ; wrong?


    What do you think it should mean? Given this code:

    int a[3] = { 5, 7, 9 };
    i = 0;
    a[i] = i++; /* bug */

    which member of a[] do you think will be updated, and to what value?

    --
    Richard Heathfield <http://www.cpax.org.uk>
    Email: -http://www. +rjh@
    Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
    "Usenet is a strange place" - dmr 29 July 1999
     
    Richard Heathfield, Dec 27, 2007
    #3
  4. jeniffer

    Kaz Kylheku Guest

    On Dec 27, 3:45 am, jeniffer <> wrote:
    > Hi
    >
    > I want to know why is a[i] = i++ ; wrong? People say that it is
    > because of different parsing during compilation. Please explain
    > technically why it is wrong/behaviour undefined?


    This has been covered in a thousand and one previous discussion
    threads in this newsgroup. It has coverage in the FAQ also.

    Or you could just search the newsgroup archives for the one thousand
    and one threads that have re-hashed this problem over and over again.

    I just typed this exact query into Google:

    comp.lang.c FAQ "a[i] = i++"

    At the time of writing, the first hit points to a copy of the FAQ, and
    highlights the section for you where "a[i] = i++" is discussed.

    Visit:

    http://c-faq.com/expr/index.html
    http://c-faq.com/expr/evalorder4.html

    If that's not enough, you can easily obtain a draft copy of the ISO
    standard for the programming language and look up the exact rules. The
    quickest way is to google by document number. Use:

    ISO 9899:1999

    and click "I'm feeling lucky" to get to a PDF. Look at section 6.5
    Expressions, second paragraph.
     
    Kaz Kylheku, Dec 27, 2007
    #4
  5. jeniffer

    Chris Dollin Guest

    jeniffer wrote:

    > I want to know why is a[i] = i++ ; wrong? People say that it is
    > because of different parsing during compilation. Please explain
    > technically why it is wrong/behaviour undefined?


    Even supposing it were not specifically undefined (ie, the C
    Standard says that it doesn't care what this code does), what
    would you expect it to /mean/?

    The evaluation order of the address of `a[i]` and `i++` is
    implementation-specific, and the timing of the increment to `i`
    is implementation-specific. The rule the implementation uses
    to order these things can be arbitrarily obscure. So, even
    if it means something specific on your implementation, today,
    when there's left-over Christmas pudding in the fridge and
    you can't face another turkey sandwich, it needn't mean the
    same thing tomorrow; which is not a good recipe for portable
    code.

    Let's not mention that the intention of the writer of that
    code is completely unclear.

    --
    Another POV Hedgehog
    Scoring, bah. If I want scoring I'll go play /Age of Steam/.
     
    Chris Dollin, Dec 28, 2007
    #5
  6. In article <rd1dj.112184$>,
    Chris Dollin <> wrote:
    >jeniffer wrote:
    >
    >> I want to know why is a[i] = i++ ; wrong? People say that it is
    >> because of different parsing during compilation. Please explain
    >> technically why it is wrong/behaviour undefined?

    >
    >Even supposing it were not specifically undefined (ie, the C
    >Standard says that it doesn't care what this code does), what
    >would you expect it to /mean/?


    Get off the "the C standard says..." horse and think logically, like a
    human being for a second, and it becomes clear. When a human sees the
    above, they logically think:
    1) Evaluate the RHS
    2) Assign it to the LHS
    in that order. So, obviously, if i=0 on start, then at end, a[1] will
    have been assigned the value 0 [*]. Though C doesn't always get this right
    (of course, since the C standard allows it, there's no crime here),
    C-like scripting languages (e.g., AWK) do, IME, get it right.

    [*] Whether or not doing this makes any sense is, of course, not for us
    to say.

    >Let's not mention that the intention of the writer of that
    >code is completely unclear.


    Wrong. See above.
     
    Kenny McCormack, Dec 28, 2007
    #6
  7. On Thu, 27 Dec 2007 12:13:10 +0000, Richard Heathfield
    <> wrote:

    >jeniffer said:
    >
    >> Hi
    >>
    >> I want to know why is a[i] = i++ ; wrong?

    >
    >What do you think it should mean? Given this code:
    >
    >int a[3] = { 5, 7, 9 };
    >i = 0;
    >a[i] = i++; /* bug */
    >
    >which member of a[] do you think will be updated, and to what value?


    If C used a left to right order of application similar to that
    for arithmetic (with the as-if rule as a back door) then the
    results would be well defined. After the statement a[0] would be
    0 and i would be 1. Similarly, the statements

    i=0;
    a[i++] = ++i + i++;

    would evaluate as follows:

    The target of the assignment is a[0].
    i is incremented after computing the location to become 1.
    On the RHS i is incremented to become 2. (++i)
    i is added to i to produce 4; once the addition is completed i is
    incremented to become 3. (i++).
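
    The trace above can be written out as ordinary C with one side effect
    per statement. This is only a sketch of the hypothetical left-to-right
    rule, not of what any real C compiler does with the original
    expression; the function name left_to_right_trace is made up for
    illustration.

```c
/* Hypothetical: spells out a strict left-to-right reading of
       a[i++] = ++i + i++;
   as separate, fully sequenced statements. Returns the final value of i. */
int left_to_right_trace(int a[], int i)
{
    int index = i;   /* a[i++]: the target location uses the old i ...  */
    i += 1;          /* ... and i is incremented afterwards             */
    i += 1;          /* ++i: increment first ...                        */
    int left = i;    /* ... then use the new value                      */
    int right = i;   /* i++: use the current value ...                  */
    i += 1;          /* ... then increment                              */
    a[index] = left + right;
    return i;
}
```

    Starting from i == 0, this stores 4 in a[0] and leaves i at 3,
    matching the steps listed above.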

    Of course C does not guarantee the order of evaluation except in
    special cases, and it is important to understand that it does
    not. One can argue that not guaranteeing the preservation of
    code order is a design flaw in the C language, but it doesn't
    matter - C is cast in stone.
     
    Richard Harter, Dec 28, 2007
    #7
  8. On Dec 28, 3:34 pm, (Kenny McCormack)
    wrote:

    > Get off the "the C standard says..." horse and think logically, like a
    > human being for a second, and it becomes clear.  When a human sees the
    > above, they logically think:
    >         1) Evaluate the RHS
    >         2) Assign it to the LHS
    > in that order.  So, obviously, if i=0 on start, then at end, a[1] will
    > have been assigned the value 0 [*].  Though C doesn't always get this right
    > (of course, since the C standard allows it, there's no crime here),
    > C-like scripting languages (e.g., AWK) do, IME, get it right.


    Java programmers all over the world think you are completely wrong. In
    Java, a[i] = i++; has defined behaviour. The expression is evaluated
    from left to right. So first it evaluates the lvalue a[i], then the
    right hand side i++. The array element changed is determined by the
    original value of i.

    On the other hand, nobody cares about expressions like this. What
    programmers and compiler writers care about are expressions that most
    likely don't do this kind of thing, but possibly might, like a[i] =
    (*p)++;. A C compiler can evaluate the address of a[i] and the value
    of (*p)++ in any order it likes, without having to care about the
    perverted case that p == &i. If evaluating the left side first is
    faster, the compiler can evaluate the left hand side first. If
    evaluating the right hand side first is faster, it can evaluate the
    right hand side first.
     
    christian.bau, Dec 28, 2007
    #8
  9. jeniffer

    Rick Guest

    On Thu, 27 Dec 2007 03:45:29 -0800 (PST), jeniffer
    <> wrote:

    >I want to know why is a[i] = i++ ; wrong? People say that it is
    >because of different parsing during compilation. Please explain
    >technically why it is wrong/behaviour undefined?


    Good afternoon, Jeniffer.

    Looks like I'm the only one here who's going to give you a straight
    answer without chastising you first for not reading the FAQ. :)

    The answer is that C does not guarantee order of evaluation.
    Therefore, i++ might be evaluated first, before being applied as an
    index into a[], or it might be evaluated last, and the compiler is
    perfectly free to do it either way.

    So, if i started off as, say, 2, then a[i] might be a[2], or it might
    be a[3].

    Hope this helps...
     
    Rick, Dec 28, 2007
    #9
  10. In article <>,
    christian.bau <> wrote:
    >On Dec 28, 3:34 pm, (Kenny McCormack)
    >wrote:
    >
    >> Get off the "the C standard says..." horse and think logically, like a
    >> human being for a second, and it becomes clear.  When a human sees the
    >> above, they logically think:
    >>         1) Evaluate the RHS
    >>         2) Assign it to the LHS
    >> in that order.  So, obviously, if i=0 on start, then at end, a[1] will
    >> have been assigned the value 0 [*].  Though C doesn't always get this right
    >> (of course, since the C standard allows it, there's no crime here),
    >> C-like scripting languages (e.g., AWK) do, IME, get it right.

    >
    >Java programmers all over the world think you are completely wrong. In
    >Java, a[i] = i++; has defined behaviour. The expression is evaluated
    >from left to right. So first it evaluates the lvalue a[i], then the
    >right hand side i++. The array element changed is determined by the
    >original value of i.


    Well, that doesn't actually prove anything. What it means is that Java
    defined it that way (probably because it was easier to implement) and
    the programmers accepted it. It doesn't mean it is desirable (nor, of
    course, does it mean it is undesirable).

    >On the other hand, nobody cares about expressions like this.


    Agreed.
     
    Kenny McCormack, Dec 28, 2007
    #10
  11. jeniffer

    Flash Gordon Guest

    Rick wrote, On 28/12/07 19:42:
    > On Thu, 27 Dec 2007 03:45:29 -0800 (PST), jeniffer
    > <> wrote:
    >
    >> I want to know why is a[i] = i++ ; wrong? People say that it is
    >> because of different parsing during compilation. Please explain
    >> technically why it is wrong/behaviour undefined?

    >
    > Good afternoon, Jeniffer.
    >
    > Looks like I'm the only one here who's going to give you a straight
    > answer without chastising you first for not reading the FAQ. :)


    Well, since it is only polite to read the FAQ first and the FAQ is more
    accurate than your answer...

    Actually, *you* should have read the FAQ first as well so that you could
    provide correct information.

    > The answer is that C does not guarantee order of evaluation.
    > Therefore, i++ might be evaluated first, before being applied as an
    > index into a[], or it might be evaluated last, and the compiler is
    > perfectly free to do it either way.


    No, that is NOT why it is undefined. As others have stated it is
    undefined because i is modified and read for a reason other than
    determining the new value between sequence points. This means that the
    compiler is NOT restricted to the possibilities you suggested.

    > So, if i started off as, say, 2, then a[i] might be a[2], or it might
    > be a[3].


    Or it could be 97 or cause your program to crash or anything else. Yes,
    there *are* reasons it could crash a program on some possible
    implementations.

    > Hope this helps...


    I hope I have corrected your misconceptions.
    --
    Flash Gordon
     
    Flash Gordon, Dec 28, 2007
    #11
  12. Rick <> writes:
    > On Thu, 27 Dec 2007 03:45:29 -0800 (PST), jeniffer
    > <> wrote:
    >>I want to know why is a[i] = i++ ; wrong? People say that it is
    >>because of different parsing during compilation. Please explain
    >>technically why it is wrong/behaviour undefined?

    >
    > Good afternoon, Jeniffer.
    >
    > Looks like I'm the only one here who's going to give you a straight
    > answer without chastising you first for not reading the FAQ. :)


    Since reading the FAQ would have answered the question, I don't see
    any problem with reminding people to check it first.

    > The answer is that C does not guarantee order of evaluation.
    > Therefore, i++ might be evaluated first, before being applied as an
    > index into a[], or it might be evaluated last, and the compiler is
    > perfectly free to do it either way.
    >
    > So, if i started off as, say, 2, then a[i] might be a[2], or it might
    > be a[3].


    That's only part of the problem. The behavior is completely
    undefined; a[i] might be a[42], or your left earlobe.

    There are cases where C's unspecified order of evaluation doesn't lead
    to undefined behavior (for example, if the two subexpressions don't
    refer to any of the same variables). But in this particular case, the
    standard places absolutely no restrictions on how the program can
    behave.
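
    As a small sketch of that distinction: in the function below the
    order in which the two operands of + are evaluated is unspecified,
    but the result is the same either way because they touch different
    objects. The function name sum_and_bump is made up for illustration.

```c
/* Hypothetical example: each operand reads and modifies a different
   object, so the unspecified evaluation order cannot affect the result.
   The behavior is well-defined only if x and y point to distinct ints;
   calling this with x == y would modify one object twice without an
   intervening sequence point. */
int sum_and_bump(int *x, int *y)
{
    return (*x)++ + (*y)++;
}
```

    With x pointing at 1 and y pointing at 2, the call returns 3 and
    leaves the two objects holding 2 and 3, whichever operand the
    compiler chooses to evaluate first.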

    --
    Keith Thompson (The_Other_Keith) <>
    [...]
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Dec 28, 2007
    #12
  13. jeniffer

    Kaz Kylheku Guest

    On Dec 28, 7:34 am, (Kenny McCormack)
    wrote:
    > Get off the "the C standard says..." horse and think logically, like a
    > human being for a second, and it becomes clear.  When a human sees the
    > above, they logically think:


    A twit who has somehow been pushed through a computer science
    undergraduate program is hardly representative of all humans.

    >         1) Evaluate the RHS
    >         2) Assign it to the LHS


    Except, doh, the left hand side requires evaluation. To know what a[i]
    refers to requires you to evaluate i.

    A human being with no preconception of any of these concepts could
    interpret it in various ways.

    For example, strict left-to-right evaluation would be this:

    1) Evaluate the left side completely to determine what location a[i]
    is.
    2) Evaluate the right side, performing the increment of i,
    yielding the previous value.
    3) Store the value to the location computed in 1.

    Or, rvalue-first evaluation would be:

    1) Fully evaluate the expression which produces the value to be
    assigned.
    2) Then evaluate the left side, if necessary, to determine
    the location where the value will be stored.
    3) Store the value computed in 1 into the location
    computed in 2.
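
    The two readings listed above can be sketched as explicit C, one step
    per statement. Neither function is what standard C promises for
    a[i] = i++; they only show what each hypothetical rule would mean.
    The names assign_left_first and assign_right_first are made up for
    illustration.

```c
/* Hypothetical strict left-to-right rule. */
void assign_left_first(int a[], int *i)
{
    int *location = &a[*i];  /* 1) evaluate the left side completely    */
    int value = *i;          /* 2) right side yields the previous value */
    *i += 1;                 /*    and performs the increment           */
    *location = value;       /* 3) store into the location from step 1  */
}

/* Hypothetical rvalue-first rule. */
void assign_right_first(int a[], int *i)
{
    int value = *i;          /* 1) fully evaluate the value to assign   */
    *i += 1;                 /*    including the increment of i         */
    int *location = &a[*i];  /* 2) only now evaluate the left side      */
    *location = value;       /* 3) store the value from step 1          */
}
```

    Starting from i == 0, the first version writes 0 into a[0] and the
    second writes 0 into a[1]; both leave i at 1.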

    > in that order.  So, obviously, if i=0 on start, then at end, a[1] will


    Obviously, you're a moron.
     
    Kaz Kylheku, Dec 28, 2007
    #13
  14. jeniffer

    Kaz Kylheku Guest

    On Dec 28, 8:37 am, (Richard Harter) wrote:
    > On Thu, 27 Dec 2007 12:13:10 +0000, Richard Heathfield
    >
    > <> wrote:
    > >jeniffer said:

    >
    > >> Hi

    >
    > >> I want to know why is a[i] = i++ ; wrong?

    >
    > >What do you think it should mean? Given this code:

    >
    > >int a[3] = { 5, 7, 9 };
    > >i = 0;
    > >a[i] = i++; /* bug */

    >
    > >which member of a[] do you think will be updated, and to what value?

    >
    > If C used a left to right order of application


    Then it would be a better safer language.

    similar to that
    > for arithmetic (with the as-if rule as a back door) then the
    > results would be well defined.


    But it isn't and so they are not. Your point is?


     After the statement a[0] would be
    > 0 and i would be 1.  Similarly, the statements
    >
    > i=0;
    > a[i++] = ++i + i++;
    >
    > would evaluate as follows:
    >
    > The target of the assignment is a[0].
    > i is incremented after computing the location to become 1.
    > On the RHS i is incremented to become 2. (++i)
    > i is added to i to produce 4; once the addition is completed i is
    > incremented to become 3.  (i++).
    >
    > Of course C does not guarantee the order of evaluation except in
    > special cases, and it is important to understand that it does
    > not.  One can argue that not guaranteeing the preservation of
    > code order is a design flaw in the C language, but it doesn't


    I would agree. Today, it's no longer a good engineering tradeoff.

    What the loose order of evaluation buys you is the ability to optimize
    code whose operands are accessed through indirection that can't be
    analyzed at compile time.

    Even if the order of evaluation is well-defined, you can still optimize
    code like

    a[j] = i++;

    quite nicely. The compiler still can change the order of actual
    evaluation to make it run fast on the given CPU, because the objects
    a[], i and j are distinct, non-overlapping. And, also, they are not
    volatile objects. So the order in which anything takes place is not
    externally visible behavior. Only the correctness of the end result
    matters. It doesn't matter whether a[j] receives the value first, or
    whether i receives the value first.

    However, if you have indirection, like:

    a[*p] = (*q)++

    then the order matters. In C the way it is, this is undefined if p and
    q point to the same memory location. But if they point to different
    integers, then it's well-defined! In the general case, it is only
    known at run-time whether p and q are aliased. Because of the
    undefinedness of the behavior if p and q are aliased, the compiler
    doesn't have to care about that case, and can generate code to do it
    in any arbitrary order.

    If you make the order well-defined, then the compiler has to work with
    the suspicion that p and q may be the same object. That of course
    affects code generation decisions. If p and q are never in fact
    aliased, then that code may be less than optimal.
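
    A rough sketch of what a well-defined left-to-right order would pin
    down for a[*p] = (*q)++; is given below. The function name
    store_indexed is made up for illustration, and the statement order is
    an assumption about how such a rule would read, not something taken
    from any standard.

```c
/* Hypothetical: explicit sequencing of a[*p] = (*q)++; under an assumed
   left-to-right rule. Because every step is its own statement, the
   result is defined even when p and q point to the same int. */
void store_indexed(int a[], int *p, int *q)
{
    int index = *p;    /* the index must be loaded before *q is written, */
                       /* since p might alias q                          */
    int old = *q;      /* value of (*q)++ is the previous contents       */
    *q = old + 1;      /* side effect of the post-increment              */
    a[index] = old;    /* store into the location chosen first           */
}
```

    With distinct objects the loads could be reordered freely without
    changing the outcome. With p == q the fixed order is what makes the
    result predictable, and it is exactly that constraint the compiler
    would have to honor everywhere.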

    In modern C, we now have the "restrict" qualifier which makes code
    undefined when pointers are aliased. I.e. in a C language dialect
    which is like C99, but in which evaluation order is well-defined, we
    could still get the undefined behavior of p and q being overlapped,
    like this:

    int *restrict p, * restrict q;

    /* ... point them to the same thing ... */

    a[*p] = (*q)++;

    The compiler can assume that p and q are not aliased and optimize the
    code accordingly.

    Loose evaluation order is merely an optimization crutch which was
    needed before restrict qualifiers were introduced.

    Speaking of optimization crutches, ultimately, what would be a good
    solution would be the ability to define optimization parameters over
    specific blocks of code. Suppose you had a way to express the idea
    ``over this block of code, please use classic loose evaluation
    order''. You could have the safety benefit of well-defined order
    throughout most of the program, as well as the optimization benefits
    of loose order in hotspots.

    So basically, the argument that loose evaluation order is a necessary
    design decision for good code generation simply doesn't hold water.
    It's true with regard to 1970's compiler technology, if even that.

    > matter - C is cast in stone.


    C is not cast in stone. Past undefined behaviors can easily be defined
    in the future, without breaking any correctly written code.
     
    Kaz Kylheku, Dec 28, 2007
    #14
  15. On Dec 28, 7:50 pm, (Kenny McCormack)
    wrote:
    >
    > christian.bau <> wrote:
    > >Java programmers all over the world think you are completely wrong. In
    > >Java, a[i] = i++; has defined behaviour. The expression is evaluated
    > >from left to right. So first it evaluates the lvalue a[i], then the
    > >right hand side i++. The array element changed is determined by the
    > >original value of i.

    >
    > Well, that doesn't actually prove anything.  What it means is that Java
    > defined it that way (probably because it was easier to implement) and
    > the programmers accepted it.  It doesn't mean it is desirable (nor, of
    > course, does it mean it is undesirable).


    You actually think anything in Java is defined the way it is defined
    because "it was easier to implement"? Seriously?
     
    christian.bau, Dec 28, 2007
    #15
  16. Kaz Kylheku said:

    > On Dec 28, 7:34 am, (Kenny McCormack)
    > wrote:
    >
    >> in that order. So, obviously, if i=0 on start, then at end, a[1] will

    >
    > Obviously, you're a moron.


    Obviously, he's a troll. A relatively recent one, though, so you might not
    have caught on to him yet.

    (By the way - welcome back!)

    --
    Richard Heathfield <http://www.cpax.org.uk>
    Email: -http://www. +rjh@
    Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
    "Usenet is a strange place" - dmr 29 July 1999
     
    Richard Heathfield, Dec 28, 2007
    #16
  17. jeniffer

    Rick Guest

    On Fri, 28 Dec 2007 20:17:30 +0000, Flash Gordon
    <> wrote:

    >> I hope I have corrected your misconceptions.


    Good evening, Flash.

    Respectfully, there were no misconceptions. There were only partial
    answers. I gave Jeniffer the answers that I thought were needed
    without getting into a lot of detail that goes way beyond the scope of
    my perception of the questions asked. Of course, my perception may
    have been wrong.
     
    Rick, Dec 28, 2007
    #17
  18. jeniffer

    Rick Guest

    On Fri, 28 Dec 2007 12:23:01 -0800, Keith Thompson <>
    wrote:

    >> So, if i started off as, say, 2, then a[i] might be a[2], or it might
    >> be a[3].

    >
    >That's only part of the problem. The behavior is completely
    >undefined; a[i] might be a[42], or your left earlobe.


    Good evening, Keith.

    If...

    i = 2;
    a[ i ] = i++;

    .... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
    depending on whether i++ gets evaluated first or last, but it must be
    either one or the other.

    Wrong?
     
    Rick, Dec 28, 2007
    #18
  19. Rick said:

    <snip>

    > If...
    >
    > i = 2;
    > a[ i ] = i++;
    >
    > ... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
    > depending on whether i++ gets evaluated first or last, but it must be
    > either one or the other.
    >
    > Wrong?


    Er, yeah, wrong. C doesn't actually guarantee this at all. But
    realistically, how could it have any other value? Well, I don't plan to
    work an example for you, but I recommend the following page, which gives
    some hard data on the various results you get from different compilers for
    similar expressions:

    http://www.phaedsys.demon.co.uk/chris/sweng/swengtips3a.htm

    --
    Richard Heathfield <http://www.cpax.org.uk>
    Email: -http://www. +rjh@
    Google users: <http://www.cpax.org.uk/prg/writings/googly.php>
    "Usenet is a strange place" - dmr 29 July 1999
     
    Richard Heathfield, Dec 28, 2007
    #19
  20. In article <>,
    Rick <> wrote:
    >On Fri, 28 Dec 2007 12:23:01 -0800, Keith Thompson <>
    >wrote:
    >
    >>> So, if i started off as, say, 2, then a[i] might be a[2], or it might
    >>> be a[3].

    >>
    >>That's only part of the problem. The behavior is completely
    >>undefined; a[i] might be a[42], or your left earlobe.

    >
    >Good evening, Keith.
    >
    >If...
    >
    >i = 2;
    >a[ i ] = i++;
    >
    >... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
    >depending on whether i++ gets evaluated first or last, but it must be
    >either one or the other.
    >
    >Wrong?


    Wrong, by the standards of this newsgroup.

    Here, what actually happens in the real world is irrelevant. In fact,
    the real world itself is pretty much irrelevant. What matters is what
    the standard requires, and the possible existence of a machine which has
    read and understands the standard as well as the language lawyers
    (aka, "the regulars") here have done.

    So, the theory is that once you invoke "undefined behavior", anything
    can happen (and does on the hypothetical machine described above),
    including assigning a value to a[42] or starting global thermonuclear
    war.
     
    Kenny McCormack, Dec 28, 2007
    #20