Is a[i] = i++ correct?

F

Flash Gordon

Rick wrote, On 28/12/07 23:25:
Good evening, Flash.

Respectfully, there were no misconceptions.

There were I'm afraid, and the FAQ addresses them.
There were only partial
answers.

They were incorrect answers because they specified a limited number of
possibilities.
I gave Jeniffer the answers that I thought were needed
without getting into a lot of detail that goes way beyond the scope of
my perception of the questions asked. Of course, my perception may
have been wrong.

Your perception is wrong.

The standard *explicitly* states that the behaviour is undefined. The
standard also explains what it means by "undefined behaviour". It says
that undefined behaviour is behaviour upon which it poses no
requirements. I.e. as far as the language standard is concerned, if the
program invokes undefined behaviour literally *anything* that happens is
acceptable.

Now to give a possible example of *why* you could get something as
unexpected as a crash. Imagine a processor which can do multiple
operations in parallel (many processors have been able to do this for
years). Further imagine that it has multiple data paths, so it is
possible for it to read from one (cached) location at the same time as
writing to another. Now, the compiler knows that it does not have to
worry about you reading a variable at the same time as writing to it
because you can only read from it as part of generating the new value.
So it does minimal analysis and writes code that does...
Increment the location that contains i and in parallel read it in to
an index register.
Unfortunately on the processor I'm suggesting this is not possible since
it cannot simultaneously do an increment on a location and read it. So
this generates an "illegal parallel operation" exception which the C
runtime does not catch (because there is no "legal" way to generate it)
and your program crashes.

I've not used a processor precisely like the one I've described, but I
have used processors which can do things in parallel but only if the
combination of requested operations is valid and some of the
combinations were quite surprising.
 
C

Charlton Wilbur

R> Good evening, Keith.

R> If...

R> i = 2; a[ i ] = i++;

R> ... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
R> depending on whether i++ gets evaluated first or last, but it
R> must be either one or the other.

R> Wrong?

Wrong. a = i++; is explicitly undefined behavior; you've been
pointed at the FAQ entry that explains this in detail, so it's not
worth going into here. As soon as you write a = i++; the compiler
is free to do whatever it wants with the rest of your program. It
could decide to fill the entire a array with junk, or halt because it
detected undefined behavior at runtime.

And seriously, even if it were somewhat defined, and you knew that
that statement wound up being at least one of the following, depending
on quirks of the compiler --

a[2] = 3;
a[3] = 3;
a[2] = 2;

-- why not just write the one you want? That way it will produce the
same results on all compilers, which is the point of a portable
programming language. a = i+1; is *much* clearer, and has the
significant advantage of only one plausible interpretation.

Charlton
 
C

Chris Torek

If...

i = 2;
a[ i ] = i++;

... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
depending on whether i++ gets evaluated first or last, but it must be
either one or the other.

What if they are done simultaneously?

If you are familiar with the Itanium, or with "out of order"
execution on various CPUs, consider the kind of CPU called a "VLIW",
or "Very Long Instruction Word", CPU. (The Itanium is actually a
sort of "hybrid VLIW", as it were.)

A VLIW CPU has multiple functional units (just like most modern
pipelining processors), but instead of feeding and controlling them
semi-automatically by reading a sequence of ostensibly linear
instructions -- "do this step, then when it is finished, do that
step, then when that is finished, do a third step" -- and attempting
to assemble them into multiple simultaneous actions, a VLIW CPU
simply reads a very long instruction word that gives a bunch of
actions to take simultaneously.

For instance, to do something like:

total = val1 * 1.07 + val2 * 3 + array[index];

on a conventional CPU, we might get:

shl %r4, 2, %r7 # %r4 = index (in %r7) * 4
load %f1, %r3(%r7) # sets %f1 to array[index]
load %f4, CONSTANT_POOL_3_0(%rPOOL)
mul %f4, %f5, %f4 # set %f4 to val2 * 3.0
load %f3, CONSTANT_POOL_1_07(%rPOOL)
mul %f3, %f2, %f3 # set %f3 to val1 * 1.07
add %f3, %f4, %f3 # %f3 = %f3 + %f4
add %f1, %f1, %f3 # %f1 = %f1 + %f3
stor %f1, addr(total) # total = <result>

for a total of 9 instructions. On the VLIW, however, we might
get something more like:

block {
%r4 = (%r7 << 2) + %r3 # actually uses two integer units
# (output of shifter fed as input
# to adder)
%f4 = CONSTANT_POOL_3.0 # one load/store unit
%f3 = CONSTANT_POOL_1.07 # another load/store unit
# four units run simultaneously
}
# wait for results
block {
%f1 = load(%r4) # one load/store unit
%f4 *= %f5 # one FPU unit
%f3 *= %f2 # another FPU unit
# three units run simultaneously
}
# wait for results
block {
%t1 = %f1 + %f3 + %f4 # actually uses two FPU units (output
# of adder1 fed as input to adder2)
store(%t1, addr(total)) # one load/store unit (using output of adder2)
}

for a total of 3 instructions.

The tricky part occurs when the VLIW compiler is fed code with
undefined behavior like a = i++ -- we end up with an "invalid"
block that reads:

block {
%t1 = (%r7 << 2) + %r3 # a, with a in %r3 and i in %r7
store(%r7, %t1)
%r7 += 1
}

which tries to use one load/store unit to store the old value of
i (in %r7) to a (computed into the entity denoted %t1 here,
although really we just route the output of the adder into the
load/store unit) while at the same time trying to use a third
integer unit to increment %r7. Since registers have three or more
read ports and one write port, this is OK electrically, but the
result is unpredictable because the timing of the output of the
adder-and-shifter ALUs is not defined with respect to the timing
of the incrementing ALU and the timing of the load-store unit.

On early implementations, you get one a different answer from
intermediate implementations. Because debugging this turns out to
be so difficult, the final implementation of the chip includes
circutry to detect the "collision", causing a runtime trap. The
effect of a = i++ then becomes:

illegal instruction - core dumped

rather than setting a[2] or a[3] to either 2 or 3.
 
K

Keith Thompson

Charlton Wilbur said:
R> Good evening, Keith.

R> If...

R> i = 2; a[ i ] = i++;

R> ... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
R> depending on whether i++ gets evaluated first or last, but it
R> must be either one or the other.

R> Wrong?

Wrong. a = i++; is explicitly undefined behavior; you've been
pointed at the FAQ entry that explains this in detail, so it's not
worth going into here. As soon as you write a = i++; the compiler
is free to do whatever it wants with the rest of your program. It
could decide to fill the entire a array with junk, or halt because it
detected undefined behavior at runtime.

And seriously, even if it were somewhat defined, and you knew that
that statement wound up being at least one of the following, depending
on quirks of the compiler --

a[2] = 3;
a[3] = 3;
a[2] = 2;

-- why not just write the one you want? That way it will produce the
same results on all compilers, which is the point of a portable
programming language. a = i+1; is *much* clearer, and has the
significant advantage of only one plausible interpretation.


The expression a = i++ clearly *intends* to do two different
things; it modifies an element of the array a, and it increments i.
None of your proposed replacements modify i.

If it were well-defined, it could make sense to use a = i++ in a
loop to set a[0] to 0, a[1] to 1, etc. If you want to achieve the
intended effect, you need to separate the modification of a and of i
with a sequence point.
 
G

Golden California Girls

Flash said:
Rick wrote, On 28/12/07 23:25:

There were I'm afraid, and the FAQ addresses them.


They were incorrect answers because they specified a limited number of
possibilities.


Your perception is wrong.

The standard *explicitly* states that the behaviour is undefined. The
standard also explains what it means by "undefined behaviour". It says
that undefined behaviour is behaviour upon which it poses no
requirements. I.e. as far as the language standard is concerned, if the
program invokes undefined behaviour literally *anything* that happens is
acceptable.

So when it is run, it empties your bank account and transfers it all to mine.

So I'm sure we will hear about a complier that some undergrad wrote in some
class that does something other than evaulate one side or the other first, or
some obscure chip that has some strange instruction that this one shot compiler
emits that causes all havoc and stops every train in Boston. No one cares
because they aren't asking about that specific implementation defined example!

Post a table about what the top ten most used compliers do when generating code
for the top ten most used chips and someone might consider it useful information.
 
P

Peter Nilsson

Golden California Girls said:
So I'm sure we will hear about a complier that some
undergrad wrote in some class that does something other
than evaulate one side or the other first, ...
... No one cares because they aren't asking about that
specific implementation defined example!

True. If they're in comp.lang.c, then they should only
be interested in what happens on the virtual C machine.
Post a table about what the top ten most used compliers
do when generating code for the top ten most used chips
and...

That is precisely what happens in misguided texts and
other forums. The result is people who think all the world
is intel, or they think they've discovered a compiler bug
when ill-formed expressions don't do what they expect.
someone might consider it useful information.

Sadly, too many people do.
 
C

CBFalconer

Keith said:
.... snip ...

The expression a = i++ clearly *intends* to do two different
things; it modifies an element of the array a, and it increments
i. None of your proposed replacements modify i.

If it were well-defined, it could make sense to use a = i++ in
a loop to set a[0] to 0, a[1] to 1, etc. If you want to achieve
the intended effect, you need to separate the modification of a
and of i with a sequence point.


Of course it is trivially easy to rewrite the statement to do
something accurately defined. For instance:

a = i + 1; i++;
or
a[i + 1] = i; i++;
 
C

Chris Dollin

Kenny said:
jeniffer said:
I want to know why is a = i++ ; wrong? People say that it is
because of different parsing during compilation.Please explain
technically why it is wrong/behaviour undefined?


Even supposing it were not specifically undefined (ie, the C
Standard says that it doesn't care what this code does), what
would you expect it to /mean/?


Get off the "the C standard says..." horse and think logically, like a
human being for a second,


Humans don't generally think logically; this is a Good Thing
except in unusual circumstances.
and it becomes clear. When a human sees the
above, they logically think:
1) Evaluate the RHS
2) Assign it to the LHS
in that order.

No, they don't. Humans who have been /taught/ to do this will
do this; humans who have not, won't. Logic has nothing to do
with it.

Another "logical" order is to evaluate the LHS (for its address)
first, then to evaluate the RHS. "Logically" one does everything
left-to-right, in "logical" reading order.
Wrong. See above.

We don't know who wrote the snippet in question, so we have no
idea what their intent was; your assertion that they "think
logically" and do right-to-left evaluation has no evidence.

Good try.
 
G

Golden California Girls

Peter said:
True. If they're in comp.lang.c, then they should only
be interested in what happens on the virtual C machine.

Yes, an imaginary compiler on an imaginary chip! Must be why there are so many
moron taliban ass kissing trolls just to quote the last couple of days of
messages. People's imaginations must not match, maybe a vulcun mind meld is
where the group should meet rather than on usenet. It would be as useful to the
real world as what happens now.
 
W

William Hughes

Yes, an imaginary compiler on an imaginary chip! Must be why there are so many
moron taliban ass kissing trolls just to quote the last couple of days of
messages. People's imaginations must not match, maybe a vulcun mind meld is
where the group should meet rather than on usenet. It would be as useful to the
real world as what happens now.


If you had bothered to read the thread you would have noticed that
the real problem is with optimizations which may use more than one
execution
path. But the plausibility of problems is beside the point.
(Note, that such a problem might not have seemed as plausible a few
years ago when such techniques were not as widespread.) And this is
the
problem with "don't give me stuff about the imaginary machine, tell me
what real
implementations do". Implementations change, the popularity of
implementations
changes. If you only care about what happens on a few popular
implementations
today, then continue with your "real world" crap. However, those of
us in
the real world know that just because something works today, it might
not
work on a future release of a compiler *if the behaviour is
undefined*. (True.
it might not work even for defined behaviour, but this is much less
likely).
The fact that we cannot see a plausible reason that the behaviour
should
change is not important.
- William Hughes
 
G

Golden California Girls

William said:
If you had bothered to read the thread you would have noticed that
the real problem is with optimizations which may use more than one
execution
path. But the plausibility of problems is beside the point.
(Note, that such a problem might not have seemed as plausible a few
years ago when such techniques were not as widespread.) And this is
the
problem with "don't give me stuff about the imaginary machine, tell me
what real
implementations do". Implementations change, the popularity of
implementations
changes. If you only care about what happens on a few popular
implementations
today, then continue with your "real world" crap. However, those of
us in
the real world know that just because something works today, it might
not
work on a future release of a compiler *if the behaviour is
undefined*. (True.
it might not work even for defined behaviour, but this is much less
likely).
The fact that we cannot see a plausible reason that the behaviour
should
change is not important.
- William Hughes

I was taught long ago '70's that you never trust the output of an optimizer.
That there are 99 ways of 100 it will be wrong for your specific problem. Seems
that lesson has been lost to time.

Any complier that doesn't shout a warning at undefined statements shouldn't be
trusted.
 
W

William Hughes

I was taught long ago '70's that you never trust the output of an optimizer.
That there are 99 ways of 100 it will be wrong for your specific problem. Seems
that lesson has been lost to time.


In other words: when I said that no real compiler could produce the
type
of behaviour specified I was spouting bullshit, but maybe if I try to
change
the subject no one will notice this or the fact that I ignored the
point that the plausiblity of the behaviour was irrelevant.

- William Hughes
 
M

Mark McIntyre

Yes, an imaginary compiler on an imaginary chip! Must be why there are
so many moron taliban ass kissing trolls just to quote the last couple
of days of messages. People's imaginations must not match, maybe a
vulcun mind meld is where the group should meet rather than on usenet.
It would be as useful to the real world as what happens now.

okay - you're back in the killfile. Have a nice day.
 
T

Tim Smith

... then I claim that a[ i ] will be either a[ 2 ] or a[ 3 ],
depending on whether i++ gets evaluated first or last, but it must be
either one or the other.

Wrong?

Er, yeah, wrong. C doesn't actually guarantee this at all. But
realistically, how could it have any other value? Well, I don't plan to
work an example for you, but I recommend the following page, which gives
some hard data on the various results you get from different compilers for
similar expressions:

http://www.phaedsys.demon.co.uk/chris/sweng/swengtips3a.htm

For i == 2 initially, one of those is probably realistically what you'll
get, but how about for other values of i?

Imagine an implementation using 64-bit ints, on a processor where
arithmetic is done using 32-bit operations, and the processor has
multiple hardware threads, which the compiler takes advantage of, and
imagine that i is a value such that i++ has changes to the bits in both
the lower and the upper 32-bits.

Neither increment nor assignment would be atomic for integers on this
setup, and the i in a could be wildly off, using, say, the
pre-increment upper 32 bits and post increment lower 32 bits. It would
then be off by around 4 billion.
 
A

Army1987

Kaz said:
C is not cast in stone. Past undefined behaviors can easily be defined
in the future, without breaking any correctly written code.

Considering that the C99 standard has yet to be widely implemented, C is
cast in stone for most practical purposes.
 
K

Kelsey Bjarnason

[snips]

Here, what actually happens in the real world is irrelevant. In fact,
the real world itself is pretty much irrelevant.

Most of us write code that is expected to work in the real world.
What matters is what
the standard requires

Correct, as this is the only realistic guideline we have as to how the
code we write is supposed to behave. It defines exactly how our code is
expected to work, but *only* if we adhere to the standard.
 
D

dj3vande

Golden California Girls said:
I was taught long ago '70's that you never trust the output of an optimizer.

You were taught wrong. (Not that that's a surprise.)
A non-broken optimizer will preserve correctness (i.e. given correct
inputs will not produce incorrect outputs), and there's a good chance
that the compiler's optimizer is better-tested than your code.
Therefore, if it produces incorrect output, there is a bug somewhere,
and it's a lot safer to bet that it's in your code than that it's in
the optimizer.

Of course, if you're in the habit of writing code that Just Happens To
Work because you compiled it with the right compiler when the moon was
in the right phase, you can expect that an optimizer will reveal some
of the bugs. That doesn't mean that your code was correct to begin
with, and it doesn't mean that the optimizer is broken.

Any complier that doesn't shout a warning at undefined statements shouldn't be
trusted.

I am eagerly awaiting your solution to the halting problem.


dave
 
R

Richard Tobin

Golden California Girls said:
I was taught long ago '70's that you never trust the output of an optimizer.
That there are 99 ways of 100 it will be wrong for your specific problem. Seems
that lesson has been lost to time.

No, the lesson, if it was ever applicable, is not today. Many modern
compilers don't even attempt to produce reasonable code with
optimisation switched off, because they have effective, correct
optimising passes.

Optimisaation isn't some kind of magic, it's just producing better
code for a given input. Why would you not want to do that?

And as others have pointed out, the same analysis is required for
good diagnostics as is required for optimisation.

-- Richard
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,483
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top