Is a[i] = i++ correct?

jeniffer

Hi

I want to know why is a[i] = i++ ; wrong? People say that it is
because of different parsing during compilation. Please explain
technically why it is wrong/behaviour undefined?

Regards,
Jeniffer
 
Keith Thompson

jeniffer said:
I want to know why is a[i] = i++ ; wrong? People say that it is
because of different parsing during compilation. Please explain
technically why it is wrong/behaviour undefined?


This is question 3.1 in the comp.lang.c FAQ, <http://www.c-faq.com>.
A number of other questions in section 3 address this point.

It's not a matter of "different parsing"; there's no syntactic
ambiguity. The ambiguity is semantic.

n1256 6.5p2 says:

Between the previous and next sequence points an object shall have
its stored value modified at most once by the evaluation of an
expression. Furthermore, the prior value shall be read only to
determine the value to be stored.

The use of the word "shall" outside a constraint means that any
violation invokes undefined behavior. In ``a[i] = i++;'' the value of
i is read (that's ok) and modified (ok), and the previous value is
read to determine the value to be stored in i (ok) -- but the value is
*also* read to determine which element of the array to modify
(kaboom!).

Even without the above rule, the standard doesn't specify the order of
evaluation; the determination of which array element to modify could
occur either before or after i is incremented. If that were the only
issue, then if i==2 before the statement is executed, it could modify
either a[2] or a[3]. But 6.5p2 says (more or less) that whenever
such an ambiguity occurs, the results aren't limited to differing
orders of evaluation; *anything* can happen. The point of all this is
to allow for more aggressive optimization; if the compiler doesn't
need to worry about consistent results for ambiguous expressions, it
can generate better code for unambiguous expressions.
 
Richard Heathfield

jeniffer said:
Hi

I want to know why is a[i] = i++ ; wrong?


What do you think it should mean? Given this code:

int a[3] = { 5, 7, 9 };
i = 0;
a[i] = i++; /* bug */

which member of a[] do you think will be updated, and to what value?
 
Kaz Kylheku

Hi

I want to know why is a[i] = i++ ; wrong? People say that it is
because of different parsing during compilation. Please explain
technically why it is wrong/behaviour undefined?


This has been covered in a thousand and one previous discussion
threads in this newsgroup. It has coverage in the FAQ also.

Or you could just search the newsgroup archives for the one thousand
and one threads that have re-hashed this problem over and over again.

I just typed this exact query into Google:

comp.lang.c FAQ "a[i] = i++"

At the time of writing, the first hit points to a copy of the FAQ, and
highlights the section for you where "a[i] = i++" is discussed.

Visit:

http://c-faq.com/expr/index.html
http://c-faq.com/expr/evalorder4.html

If that's not enough, you can easily obtain a draft copy of the ISO
standard for the programming language and look up the exact rules. The
quickest way is to google by document number. Use:

ISO 9899:1999

and click "I'm feeling lucky" to get to a PDF. Look at section 6.5
Expressions, second paragraph.
 
Chris Dollin

jeniffer said:
I want to know why is a[i] = i++ ; wrong? People say that it is
because of different parsing during compilation. Please explain
technically why it is wrong/behaviour undefined?


Even supposing it were not specifically undefined (ie, the C
Standard says that it doesn't care what this code does), what
would you expect it to /mean/?

The evaluation order of the address of `a[i]` and `i++` is
implementation-specific, and the timing of the increment to `i`
is implementation-specific. The rule the implementation uses
to order these things can be arbitrarily obscure. So, even
if it means something specific on your implementation, today,
when there's left-over Christmas pudding in the fridge and
you can't face another turkey sandwich, it needn't mean the
same thing tomorrow; which is not a good recipe for portable
code.

Let's not mention that the intention of the writer of that
code is completely unclear.
 
Kenny McCormack

jeniffer said:
I want to know why is a[i] = i++ ; wrong? People say that it is
because of different parsing during compilation. Please explain
technically why it is wrong/behaviour undefined?


Even supposing it were not specifically undefined (ie, the C
Standard says that it doesn't care what this code does), what
would you expect it to /mean/?


Get off the "the C standard says..." horse and think logically, like a
human being for a second, and it becomes clear. When a human sees the
above, they logically think:
1) Evaluate the RHS
2) Assign it to the LHS
in that order. So, obviously, if i=0 on start, then at end, a[1] will
have been assigned the value 0 [*]. Though C doesn't always get this right
(of course, since the C standard allows it, there's no crime here),
C-like scripting languages (e.g., AWK) do, IME, get it right.

[*] Whether or not doing this makes any sense is, of course, not for us
to say.
Let's not mention that the intention of the writer of that
code is completely unclear.

Wrong. See above.
 
Richard Harter

jeniffer said:
Hi

I want to know why is a[i] = i++ ; wrong?


What do you think it should mean? Given this code:

int a[3] = { 5, 7, 9 };
i = 0;
a[i] = i++; /* bug */

which member of a[] do you think will be updated, and to what value?


If C used a left to right order of application similar to that
for arithmetic (with the as-if rule as a back door) then the
results would be well defined. After the statement a[0] would be
0 and i would be 1. Similarly, the statements

i=0;
a[i++] = ++i + i++;

would evaluate as follows:

The target of the assignment is a[0].
i is incremented after computing the location to become 1.
On the RHS i is incremented to become 2. (++i)
i is added to i to produce 4; once the addition is completed i is
incremented to become 3. (i++).

Of course C does not guarantee the order of evaluation except in
special cases, and it is important to understand that it does
not. One can argue that not guaranteeing the preservation of
code order is a design flaw in the C language, but it doesn't
matter - C is cast in stone.
 
christian.bau

Get off the "the C standard says..." horse and think logically, like a
human being for a second, and it becomes clear.  When a human sees the
above, they logically think:
        1) Evaluate the RHS
        2) Assign it to the LHS
in that order.  So, obviously, if i=0 on start, then at end, a[1] will
have been assigned the value 0 [*].  Though C doesn't always get this right
(of course, since the C standard allows it, there's no crime here),
C-like scripting languages (e.g., AWK) do, IME, get it right.

Java programmers all over the world think you are completely wrong. In
Java, a[i] = i++; has defined behaviour. The expression is evaluated
from left to right. So first it evaluates the lvalue a[i], then the
right hand side i++. The array element changed is determined by the
original value of i.

On the other hand, nobody cares about expressions like this. What
programmers and compiler writers care about are expressions that most
likely don't do this kind of thing, but possibly might, such as
a[i] = (*p)++. A C compiler can evaluate the address of a[i] and the
value of (*p)++ in any order it likes, without having to care about the
perverted case that p == &i. If evaluating the left hand side first is
faster, the compiler can evaluate the left hand side first. If
evaluating the right hand side first is faster, it can evaluate the
right hand side first.
 
Rick

I want to know why is a[i] = i++ ; wrong? People say that it is
because of different parsing during compilation. Please explain
technically why it is wrong/behaviour undefined?


Good afternoon, Jeniffer.

Looks like I'm the only one here who's going to give you a straight
answer without chastising you first for not reading the FAQ. :)

The answer is that C does not guarantee order of evaluation.
Therefore, i++ might be evaluated first, before being applied as an
index into a[], or it might be evaluated last, and the compiler is
perfectly free to do it either way.

So, if i started off as, say, 2, then a[i] might be a[2], or it might
be a[3].

Hope this helps...
 
Kenny McCormack

Get off the "the C standard says..." horse and think logically, like a
human being for a second, and it becomes clear.  When a human sees the
above, they logically think:
        1) Evaluate the RHS
        2) Assign it to the LHS
in that order.  So, obviously, if i=0 on start, then at end, a[1] will
have been assigned the value 0 [*].  Though C doesn't always get this right
(of course, since the C standard allows it, there's no crime here),
C-like scripting languages (e.g., AWK) do, IME, get it right.

Java programmers all over the world think you are completely wrong. In
Java, a[i] = i++; has defined behaviour. The expression is evaluated
from left to right. So first it evaluates the lvalue a[i], then the
right hand side i++. The array element changed is determined by the
original value of i.


Well, that doesn't actually prove anything. What it means is that Java
defined it that way (probably because it was easier to implement) and
the programmers accepted it. It doesn't mean it is desirable (nor, of
course, does it mean it is undesirable).
On the other hand, nobody cares about expressions like this.

Agreed.
 
Flash Gordon

Rick wrote, On 28/12/07 19:42:
I want to know why is a[i] = i++ ; wrong? People say that it is
because of different parsing during compilation. Please explain
technically why it is wrong/behaviour undefined?


Good afternoon, Jeniffer.

Looks like I'm the only one here who's going to give you a straight
answer without chastising you first for not reading the FAQ. :)


Well, since it is only polite to read the FAQ first and the FAQ is more
accurate than your answer...

Actually, *you* should have read the FAQ first as well so that you could
provide correct information.
The answer is that C does not guarantee order of evaluation.
Therefore, i++ might be evaluated first, before being applied as an
index into a[], or it might be evaluated last, and the compiler is
perfectly free to do it either way.

No, that is NOT why it is undefined. As others have stated it is
undefined because i is modified and read for a reason other than
determining the new value between sequence points. This means that the
compiler is NOT restricted to the possibilities you suggested.
So, if i started off as, say, 2, then a[i] might be a[2], or it might
be a[3].


Or it could be 97 or cause your program to crash or anything else. Yes,
there *are* reasons it could crash a program on some possible
implementations.
Hope this helps...

I hope I have corrected your misconceptions.
 
Keith Thompson

Rick said:
I want to know why is a[i] = i++ ; wrong? People say that it is
because of different parsing during compilation. Please explain
technically why it is wrong/behaviour undefined?


Good afternoon, Jeniffer.

Looks like I'm the only one here who's going to give you a straight
answer without chastising you first for not reading the FAQ. :)


Since reading the FAQ would have answered the question, I don't see
any problem with reminding people to check it first.
The answer is that C does not guarantee order of evaluation.
Therefore, i++ might be evaluated first, before being applied as an
index into a[], or it might be evaluated last, and the compiler is
perfectly free to do it either way.

So, if i started off as, say, 2, then a[i] might be a[2], or it might
be a[3].


That's only part of the problem. The behavior is completely
undefined; a[i] might be a[42], or your left earlobe.

There are cases where C's unspecified order of evaluation doesn't lead
to undefined behavior (for example, if the two subexpressions don't
refer to any of the same variables). But in this particular case, the
standard places absolutely no restrictions on how the program can
behave.
 
Kaz Kylheku

Get off the "the C standard says..." horse and think logically, like a
human being for a second, and it becomes clear.  When a human sees the
above, they logically think:

A twit who has somehow been pushed through a computer science
undergraduate program is hardly representative of all humans.
        1) Evaluate the RHS
        2) Assign it to the LHS

Except, doh, the left hand side requires evaluation. To know what a[i]
refers to requires you to evaluate i.

A human being with no preconception of any of these concepts could
interpret it in various ways.

For example, strict left-to-right evaluation would be this:

1) Evaluate the left side completely to determine what location a[i]
is.
2) Evaluate the right side, performing the increment of i,
yielding the previous value.
3) Store the value to the location computed in 1.

Or, rvalue-first evaluation would be:

1) Fully evaluate the expression which produces the value to be
assigned.
2) Then evaluate the left side, if necessary, to determine
the location where the value will be stored.
3) Store the value computed in 1 into the location
computed in 2.
in that order.  So, obviously, if i=0 on start, then at end, a[1] will

Obviously, you're a moron.
 
Kaz Kylheku

jeniffer said:
Hi
I want to know why is a[i] = i++ ; wrong?

What do you think it should mean? Given this code:
int a[3] = { 5, 7, 9 };
i = 0;
a[i] = i++; /* bug */

which member of a[] do you think will be updated, and to what value?

If C used a left to right order of application


Then it would be a better safer language.

similar to that
for arithmetic (with the as-if rule as a back door) then the
results would be well defined.

But it isn't and so they are not. Your point is?


 After the statement a[0] would be
0 and i would be 1.  Similarly, the statements

i=0;
a[i++] = ++i + i++;

would evaluate as follows:

The target of the assignment is a[0].
i is incremented after computing the location to become 1.
On the RHS i is incremented to become 2. (++i)
i is added to i to produce 4; once the addition is completed i is
incremented to become 3.  (i++).

Of course C does not guarantee the order of evaluation except in
special cases, and it is important to understand that it does
not.  One can argue that not guaranteeing the preservation of
code order is a design flaw in the C language, but it doesn't

I would agree. Today, it's no longer a good engineering tradeoff.

What the loose order of evaluation buys you is the ability to optimize
code whose operands are accessed through indirection that can't be
analyzed at compile time.

Even if the order of evaluation is well-defined, you can still optimize
code like

a[j] = i++;

quite nicely. The compiler still can change the order of actual
evaluation to make it run fast on the given CPU, because the objects
a[], i and j are distinct, non-overlapping. And, also, they are not
volatile objects. So the order in which anything takes place is not
externally visible behavior. Only the correctness of the end result
matters. It doesn't matter whether a[j] receives the value first, or
whether i receives the value first.

However, if you have indirection, like:

a[*p] = (*q)++

then the order matters. In C the way it is, this is undefined if p and
q point to the same memory location. But if they point to different
integers, then it's well-defined! In the general case, it is only
known at run-time whether p and q are aliased. Because of the
undefinedness of the behavior if p and q are aliased, the compiler
doesn't have to care about that case, and can generate code to do it
in any arbitrary order.

If you make the order well-defined, then the compiler has to work with
the suspicion that p and q may be the same object. That of course
affects code generation decisions. If p and q are never in fact
aliased, then that code may be less than optimal.

In modern C, we now have the "restrict" qualifier which makes code
undefined when pointers are aliased. I.e. in a C language dialect
which is like C99, but in which evaluation order is well-defined, we
could still get the undefined behavior of p and q being overlapped,
like this:

int *restrict p, * restrict q;

/* ... point them to the same thing ... */

a[*p] = (*q)++;

The compiler can assume that p and q are not aliased and optimize the
code accordingly.

Loose evaluation order is merely an optimization crutch which was
needed before restrict qualifiers were introduced.

Speaking of optimization crutches, ultimately, what would be a good
solution would be the ability to define optimization parameters over
specific blocks of code. Suppose you had a way to express the idea
``over this block of code, please use classic loose evaluation
order''. You could have the safety benefit of well-defined order
throughout most of the program, as well as the optimization benefits
of loose order in hotspots.

So basically, the argument that loose evaluation order is a necessary
design decision for good code generation simply doesn't hold water.
It's true with regard to 1970's compiler technology, if even that.
matter - C is cast in stone.

C is not cast in stone. Past undefined behaviors can easily be defined
in the future, without breaking any correctly written code.
 
christian.bau

christian.bau said:
Java programmers all over the world think you are completely wrong. In
Java, a[i] = i++; has defined behaviour. The expression is evaluated
from left to right. So first it evaluates the lvalue a[i], then the
right hand side i++. The array element changed is determined by the
original value of i.


Well, that doesn't actually prove anything.  What it means is that Java
defined it that way (probably because it was easier to implement) and
the programmers accepted it.  It doesn't mean it is desirable (nor, of
course, does it mean it is undesirable).


You actually think anything in Java is defined the way it is defined
because "it was easier to implement"? Seriously?
 
Richard Heathfield

Kaz Kylheku said:
in that order. So, obviously, if i=0 on start, then at end, a[1] will

Obviously, you're a moron.

Obviously, he's a troll. A relatively recent one, though, so you might not
have caught on to him yet.

(By the way - welcome back!)
 
Rick

Good evening, Flash.

Respectfully, there were no misconceptions. There were only partial
answers. I gave Jeniffer the answers that I thought were needed
without getting into a lot of detail that goes way beyond the scope of
my perception of the questions asked. Of course, my perception may
have been wrong.
 
Rick

So, if i started off as, say, 2, then a[i] might be a[2], or it might
be a[3].


That's only part of the problem. The behavior is completely
undefined; a[i] might be a[42], or your left earlobe.


Good evening, Keith.

If...

i = 2;
a[ i ] = i++;

... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
depending on whether i++ gets evaluated first or last, but it must be
either one or the other.

Wrong?
 
Richard Heathfield

Rick said:

If...

i = 2;
a[ i ] = i++;

... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
depending on whether i++ gets evaluated first or last, but it must be
either one or the other.

Wrong?

Er, yeah, wrong. C doesn't actually guarantee this at all. But
realistically, how could it have any other value? Well, I don't plan to
work an example for you, but I recommend the following page, which gives
some hard data on the various results you get from different compilers for
similar expressions:

http://www.phaedsys.demon.co.uk/chris/sweng/swengtips3a.htm
 
Kenny McCormack

So, if i started off as, say, 2, then a[i] might be a[2], or it might
be a[3].


That's only part of the problem. The behavior is completely
undefined; a[i] might be a[42], or your left earlobe.


Good evening, Keith.

If...

i = 2;
a[ i ] = i++;

... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
depending on whether i++ gets evaluated first or last, but it must be
either one or the other.

Wrong?


Wrong, by the standards of this newsgroup.

Here, what actually happens in the real world is irrelevant. In fact,
the real world itself is pretty much irrelevant. What matters is what
the standard requires, and the possible existence of a machine which has
read and understands the standard as well as the language lawyers
(aka, "the regulars") here have done.

So, the theory is that once you invoke "undefined behavior", anything
can happen (and does on the hypothetical machine described above),
including assigning a value to a[42] or starting global thermonuclear
war.
 
