i++ + i++ questions

Ook

I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

He asked what j and k will hold. I would expect that the compiler would add
i to i (0+0), store the 0 in j, and then increment i twice. On the next
line, we start with i = 2. i is incremented twice so that it now equals 4.
Then it is added to itself (i + i) and the result, 8, is stored in k. I ran
it, and it does this. My coworker stated that this is actually undefined
and that you can't read a variable in an expression where you write it. Is
this really undefined, and do both of my compilers do what I described above
only because that is how their authors happened to implement it? Or is this
the way C++ should work?
 
Victor Bazarov

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have. I would expect that the compiler would add
i to i (0+0), store the 0 in j, and then increment i twice. On the next
line, we start with i = 2. I is incremented twice so that it now equals 4.
Then it is added to itself (i + i) and the result, 8, is stored in k. I run
it, and it does this. My coworker stated that this was actually undefined
and that you can't read a variable in an expression where you write it. Is
this really undefined and both of my compilers do what I described above
because that is what they wrote the compiler? Or is this functionality the
way c++ should work?

It's undefined.
 
Ferdi Smit

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have. I would expect that the compiler would
add i to i (0+0), store the 0 in j, and then increment i twice. On the
next line, we start with i = 2. I is incremented twice so that it now
equals 4. Then it is added to itself (i + i) and the result, 8, is stored
in k. I run it, and it does this. My coworker stated that this was
actually undefined and that you can't read a variable in an expression
where you write it. Is this really undefined and both of my compilers do
what I described above because that is what they wrote the compiler? Or is
this functionality the way c++ should work?

You can't really tell, because it results in undefined behaviour... You are
only allowed to change a value at most once between sequence points (in
short). Just look up "C++ sequence points" on the net, and there should be
some clearer explanation with examples of what sequence points are etc.
 
Kaz Kylheku

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have.

The behavior is undefined.

I would expect that the compiler would add
i to i (0+0), store the 0 in j, and then increment i twice.

Based, exactly, on what? I mean, your expectation has to come from
somewhere, right? What is the source of your expectation?

Why can't it take the zero value from the left i++, increment i, then
take the 1 value from the right i++, increment i, and produce a result
of 1?

Why can't it make a note that i is to be incremented in this
expression, add together the old value (0), and then just store the new
value 1 into i, not caring that two increments were written?

Nothing in the language standard contradicts any of these possible
interpretations.
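
To make those readings concrete, here is a rough sketch that writes each one out with separate statements, so every step is well-defined. The struct and function names are made up for illustration; nothing says a real compiler picks any of these three.

```cpp
#include <cassert>

// Each function spells out ONE possible reading of `j = i++ + i++;`
// using separate statements, so every step is well-defined.
// All names here are hypothetical, chosen only for this illustration.
struct Outcome {
    int i;  // final value of i
    int j;  // value stored in j
};

// Reading 1: read both operands first, then apply both increments.
Outcome read_both_then_increment() {
    int i = 0;
    int left = i;
    int right = i;
    i += 2;
    return {i, left + right};
}

// Reading 2: finish the left operand (read, then increment) before the right.
Outcome strictly_left_to_right() {
    int i = 0;
    int left = i;
    ++i;
    int right = i;
    ++i;
    return {i, left + right};
}

// Reading 3: both increments start from the old value, so one update is lost.
Outcome lost_update() {
    int i = 0;
    int old_value = i;
    int sum = old_value + old_value;
    i = old_value + 1;
    return {i, sum};
}

// The readings disagree, which is why no single answer can be relied on.
void check_readings_disagree() {
    assert(read_both_then_increment().j != strictly_left_to_right().j);
    assert(read_both_then_increment().i != lost_update().i);
}
```

The three functions give different values for j or for the final i, and the original expression is allowed to behave like any of them, or like none of them.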

The C++ implementation can halt with a diagnostic message, and not
translate the program at all. Or it can silently produce a program that
doesn't work at all: it may crash, or compute any values whatsoever.

Watch out: your coworker may be an expert who is on a mission to
uncover the incompetents in your organization.

My coworker stated that this was actually undefined
and that you can't read a variable in an expression where you write it.

Busted! Probably something like 95% of the C++ programming population
doesn't know this stuff, so you're not like a lone idiot or anything.
 
Ron Natalie

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have. I would expect that the compiler would add
i to i

Your expectations are wrong. Your coworker is right for the reasons he
stated.

The increment of the ++ operator is a side effect. There is NO sequence
implied by it. It is not the case that ++i increments the variable
immediately before evaluation, nor that i++ increments it immediately after.
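
If the order described in the question is what is actually wanted, it has to be written out, one side effect per statement. A minimal sketch follows; the Result struct and the function names are hypothetical, and the chosen order is simply the one the original poster expected.

```cpp
#include <cassert>

// Hypothetical holder for the three values discussed in the thread.
struct Result {
    int i;
    int j;
    int k;
};

// Spells out the order the original poster expected,
// with one side effect per statement so nothing is left to chance.
Result expected_order() {
    int i = 0;

    // Intended meaning of `j = i++ + i++;`: add first, increment afterwards.
    int j = i + i;  // 0 + 0
    i += 2;         // i == 2

    // Intended meaning of `k = ++i + ++i;`: increment first, add afterwards.
    i += 2;         // i == 4
    int k = i + i;  // 4 + 4

    return {i, j, k};
}

// Small self-check of the values named in the question.
void check_expected_order() {
    Result r = expected_order();
    assert(r.j == 0 && r.k == 8 && r.i == 4);
}
```

Written this way, the result no longer depends on which compiler is used.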
 
Kristo

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have. I would expect that the compiler would add
i to i (0+0), store the 0 in j, and then increment i twice. On the next
line, we start with i = 2. I is incremented twice so that it now equals 4.
Then it is added to itself (i + i) and the result, 8, is stored in k. I run
it, and it does this. My coworker stated that this was actually undefined
and that you can't read a variable in an expression where you write it. Is
this really undefined and both of my compilers do what I described above
because that is what they wrote the compiler? Or is this functionality the
way c++ should work?

Your coworker is right, it is undefined behavior (see the FAQ for an
explanation). Undefined behavior means the program could do
*absolutely anything*. One possibility is that it does what you think
it should do. Another possibility is that demons fly out of your nose.

The moral of the story here is this: do not invoke undefined behavior.
Ever.

Kristo
 
Marcus Kwok

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have. I would expect that the compiler would add
i to i (0+0), store the 0 in j, and then increment i twice. On the next
line, we start with i = 2. I is incremented twice so that it now equals 4.
Then it is added to itself (i + i) and the result, 8, is stored in k. I run
it, and it does this. My coworker stated that this was actually undefined
and that you can't read a variable in an expression where you write it. Is
this really undefined and both of my compilers do what I described above
because that is what they wrote the compiler? Or is this functionality the
way c++ should work?

Essentially your coworker is correct, it is undefined. Undefined
behavior means the program can do ANYTHING, including what you think it
ought to do.
 
John Harrison

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have. I would expect that the compiler would add
i to i (0+0), store the 0 in j, and then increment i twice. On the next
line, we start with i = 2. I is incremented twice so that it now equals 4.
Then it is added to itself (i + i) and the result, 8, is stored in k. I run
it, and it does this. My coworker stated that this was actually undefined
and that you can't read a variable in an expression where you write it. Is
this really undefined and both of my compilers do what I described above
because that is what they wrote the compiler? Or is this functionality the
way c++ should work?

What amazes me about this issue is how often it comes up and how much
people seem to care what the 'correct' answer is.

Programming is a practical discipline and therefore the only important
consideration for code like the above is that it is obscure and should
be avoided on stylistic grounds alone. Whether it does/does not,
should/should not have defined behaviour is secondary.

john
 
Markus Moll

Hi

John said:
Programming is a practical discipline and therefore the only important
consideration for code like the above is that it is obscure and should
be avoided on stylistic grounds alone. Whether it does/does not,
should/should not have defined behaviour is secondary.

Do you really mean it?
I, for my part, would rather write code that runs, not code that merely
happened to run once by pure chance...

Markus
 
Ook

Hmm...learn something new every day. What exactly is the limit, what exactly
can be done and what can't? Is it that I'm referencing i twice that makes
this undefined? IOW is

int i,j,k;
i=0;
k=1;

j = i++ + k;

OK?
 
Victor Bazarov

Ook said:
Hmm...learn something new every day. What exactly is the limit, what exactly
can be done and what can't? Is it that I'm referencing i twice that makes
this undefined? IOW is

int i,j,k;
i=0;
k=1;

j = i++ + k;

OK?

Don't top-post, please.

Yes, that one is OK. Changing 'i' twice between sequence points is what
made the previous one undefined.


V
 
Markus Moll

Hi

Ook said:
Hmm...learn something new every day. What exactly is the limit, what
exactly
can be done and what can't?

Between two consecutive sequence points the value of i must be modified at
most once.

int i = 5;
i = i++;

already violates this rule, and so does

int i = 5;
i = i++ + i++;

Is it that I'm referencing i twice that makes
this undefined? IOW is

int i,j,k;
i=0;
k=1;

j = i++ + k;

OK?

Yes, it is okay, but now you're too restrictive.

int i = 5;
int j = i + i; // perfectly fine, j == 10
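
For comparison, the two harmless cases from this exchange can be put into small functions. This is only a sketch; the Values struct and the function names are invented for the example.

```cpp
#include <cassert>

// Hypothetical holder so both values can be returned together.
struct Values {
    int i;
    int j;
};

// i is modified exactly once, and k is a different object: well-defined.
Values single_increment() {
    int i = 0;
    int k = 1;
    int j = i++ + k;  // uses the old value of i (0), then i becomes 1
    return {i, j};
}

// Reading i twice without modifying it is fine as well.
int read_twice() {
    int i = 5;
    return i + i;  // two reads, no modification
}

// Small self-check of both cases.
void check_allowed_cases() {
    assert(single_increment().j == 1);
    assert(read_twice() == 10);
}
```

What matters is how often i is modified between sequence points, not how often it is mentioned.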

Markus
 
Kaz Kylheku

John said:
What amazes me about this issue is how often it comes up and how much
people seem to care what the 'correct' answer is.

Programming is a practical discipline and therefore the only important
consideration for code like the above is that it is obscure and should
be avoided on stylistic grounds alone. Whether it does/does not,
should/should not have defined behaviour is secondary.

That is nonsense. If we followed such silly principles, we'd be writing

i = i + 1;

everywhere instead of i++.

There is no substitute for understanding the language and knowing what
the heck you are doing. That is the number one concern, not stylistic
quibbles.

If the rules for evaluating multiple side effects in an expression were
well defined, there wouldn't be any problems with it at all.
Programmers will always find a stylistically elegant way of using some
device that is constantly held up as poor style.

There are languages in which, like in C and C++, updates of storage
locations are performed by ordinary expressions that can be embedded in
other expressions as part of a nested evaluation. Yet, these languages
don't suffer from the idiotic undefined behavior problems.

For example, in Common Lisp, everything happens left to right. So there
is no need for the sequence point concept at all.

You can write

(+ (incf i) (incf i))

and it has a well-defined meaning. The incf operator increments a
location, and returns its new value, so (incf i) is like ++i. What
happens is that the arguments to + are evaluated strictly left to
right. The first (incf i) completely does its job: it produces a value
and updates the value of i. The second (incf i) sees the new value of i
reliably; it increments it again, and returns the new value which
becomes the second argument. The + function is then called. End of
story.
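
For readers who want the same effect in C++, the left-to-right order has to be written out by hand. A rough sketch, with made-up function names, assuming the Lisp semantics described above:

```cpp
#include <cassert>

// Mirrors (+ (incf i) (incf i)) with the left-to-right order written out.
// The function names are hypothetical.
int sum_of_two_increments(int& i) {
    int first = ++i;   // left operand: increment, then take the new value
    int second = ++i;  // right operand: sees the already incremented i
    return first + second;
}

// Convenience wrapper: run the sum on a local copy of the start value.
int sum_from(int start) {
    int i = start;
    return sum_of_two_increments(i);
}

// Starting from 0 the operands are 1 and 2.
void check_sum_from_zero() {
    assert(sum_from(0) == 3);
}
```

Each increment sits in its own full statement, so the second one reliably sees the result of the first.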

There is no need to avoid this on any stylistic grounds, because it's
perfectly obvious what it does. There are plenty of occasions when it's
exploited. E.g. suppose you have a function that generates objects in
some well defined sequence:

(foo (gen) (gen) (gen))

You know that you will get successive elements of the generated sequence
passed to foo in the left-to-right order. Moreover, you can replace gen
by a macro that inlines the code, for instance:

(defmacro gen () '(incf i))

still, no problems. In C and C++, we'd be shifting from unspecified
behavior into undefined territory at this point.
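
A C++ version that keeps the order fixed could look roughly like the sketch below. Generator, foo and call_foo_in_order are hypothetical stand-ins for the Lisp names, not anything from a real library.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for (gen): hands out 1, 2, 3, ...
struct Generator {
    int i = 0;
    int operator()() { return ++i; }  // next element of the sequence
};

// Hypothetical stand-in for foo: just records what it was given.
std::vector<int> foo(int a, int b, int c) {
    return {a, b, c};
}

std::vector<int> call_foo_in_order() {
    Generator gen;
    // foo(gen(), gen(), gen()) would leave the argument order unspecified,
    // so each call gets its own statement.
    int first = gen();
    int second = gen();
    int third = gen();
    return foo(first, second, third);
}

// Small self-check: the first argument is the first generated element.
void check_call_order() {
    assert(call_foo_in_order().front() == 1);
}
```

The named temporaries cost a few lines, but the order no longer depends on the implementation.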

Java is another example of a language which also defines a strict left
to right order, and provides a glimpse of what it's like to have this
in a C-like expression syntax.

Regardless of what you may think of the style of programs which do this
sort of thing, by far the orders of magnitude greater idiocy is
designing languages in which writing undefined expressions is easier
than sneezing.
 
John Harrison

Kaz said:
That is nonsense. If we followed such silly principles, we'd be writing

i = i + 1;

everywhere instead of i++.

My criterion was obscurity: if an average programmer cannot look at a
simple piece of code and know what is going on, then that is a stylistic
defect in the code. I fail to see how i++ is obscure.

There is no substitute for understanding the language and knowing what
the heck you are doing. That is the number one concern, not stylistic
quibbles.

One of C++'s problems is that it is such a huge language that very few
people understand the whole language. Avoid obscure corners of the
language was Bjarne Stroustrup's advice, and presumably he knows C++
fairly well.

If the rules for evaluating multiple side effects in an expression were
well defined, there wouldn't be any problems with it at all.
Programmers will always find a stylistically elegant way of using some
device that is constantly held up as poor style.

There are languages in which, like in C and C++, updates of storage
locations are performed by ordinary expressions that can be embedded in
other expressions as part of a nested evaluation. Yet, these languages
don't suffer from the idiotic undefined behavior problems.

For example, in Common Lisp, everything happens left to right. So there
is no need for the sequence point concept at all.

You can write

(+ (incf i) (incf i))

and it has a well-defined meaning. The incf operator increments a
location, and returns its new value, so (incf i) is like ++i. What
happens is that the arguments to + are evaluated strictly left to
right.


But Scheme does not define any ordering of subexpressions. So your code
has potential for confusion because two dialects of Lisp treat the same
issue very differently. In fact this issue varies greatly from one
language to the next which is exactly why I think it is normally best
avoided.

The first (incf i) completely does its job: it produces a value
and updates the value of i. The second (incf i) sees the new value of i
reliably; it increments it again, and returns the new value which
becomes the second argument. The + function is then called. End of
story.

There is no need to avoid this on any stylistic grounds, because it's
perfectly obvious what it does. There are plenty of occasions when it's
exploited. E.g. suppose you have a function that generates objects in
some well defined sequence:

(foo (gen) (gen) (gen))

Of course there are always exceptions to any stylistic rule.

You know that you will get successive elements of the generated sequence
passed to foo in the left-to-right order. Moreover, you can replace gen
by a macro that inlines the code, for instance:

(defmacro gen () '(incf i))

still, no problems. In C and C++, we'd be shifting from unspecified
behavior into undefined territory at this point.

Java is another example of a language which also defines a strict left
to right order, and provides a glimpse of what it's like to have this
in a C-like expression syntax.

Regardless of what you may think of the style of programs which do this
sort of thing, by far the orders of magnitude greater idiocy is
designing languages in which writing undefined expressions is easier
than sneezing.

Well that's a different issue.

john
 
John Harrison

Markus said:
Hi

Do you really mean it?
I, for my part, would rather write code that runs, not code that merely
happened to run once by pure chance...

So would I, I don't see how your comment relates to what I said.

john
 
Kaz Kylheku

Markus said:
Hi



Do you really mean it?
I, for my part, would rather write code that runs, not code that merely
happened to run once by pure chance...

I'd rather have a much smaller theoretical model for analyzing programs
in the language that I'm using.

Come on, that sequence point stuff isn't even well defined! A lot of
debate has gone into this, proposing various models for how it actually
works, and what the formally precise rules are for determining that the
behavior is undefined.

Check out the Informative Annex of the C99 standard which proposes a
``Formal Model of Sequence Points''. It's very interesting. But it's
also completely unnecessary if all you do is define the damn evaluation
order in the language.

Where I can understand undefined orders is in multiprocessor hardware
(specifically, NUMA). There are real performance hits if you try to
make distributed memory look like one big block of RAM that everyone
hangs off of. But even in these systems, you can run machine code that
was developed for a single thread, and tested on non-SMP hardware.

I don't think that the undefined evaluation orders contribute anything
significant to the ability to develop high quality optimizing compilers
which are targeting single instruction flows for one processor.

There are only rare cases when you have objects that are referenced
through pointers, when you know that the objects are not aliased. If
you are given some expression like

*(*d)++ = *(*s)++;

its behavior is well-defined if s and d point to different objects.
Therefore the compiler can just assume that the programmer wrote a
well-defined program, and generate the code based on the assumption
that s and d point to different objects.

We are now in the realm of expressions that are undefined based on
run-time values. If d and s are the same pointer, we have two
modifications to the same object between sequence points. If they are
different pointers, all is cool. This is where a static assumption
about well-definedness gives you an optimization edge, at the cost of
safety.

If the language rules require strict evaluation, then the assumption
does not hold; (*s)++ may change the value of *d, and that is
well-defined and has to work (even though it may well be a bug: maybe
the programmer considered the aliased case, maybe not). So, whopee dee,
you have to generate less optimal code. (Or generate two variants of
the code, switched at run time on a comparison of the pointers. Still a
hit to do the comparison).
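
As a sketch of what a fully ordered version of that expression could look like when written by hand: the function names are invented, and doing the source side first is just one possible choice of order, an assumption rather than anything the expression above promises.

```cpp
#include <cassert>

// Copies one char from **s to **d and advances both pointers,
// one step per statement. Because the order is fixed, the outcome is
// defined even if d and s name the same pointer object (provided the
// pointers stay inside their arrays). Source-first order is an assumption.
void copy_step(char** d, char** s) {
    char value = **s;  // read the source character
    ++*s;              // advance the source pointer
    **d = value;       // store through the (possibly just advanced) destination
    ++*d;              // advance the destination pointer
}

// Usage sketch: copy "hi" into a separate buffer, two steps.
bool copies_two_chars() {
    char source[] = "hi";
    char target[] = "??";
    char* s = source;
    char* d = target;
    copy_step(&d, &s);
    copy_step(&d, &s);
    return target[0] == 'h' && target[1] == 'i' &&
           d == target + 2 && s == source + 2;
}

// Small self-check of the usage sketch.
void check_copy() {
    assert(copies_two_chars());
}
```

The price is the one described above: the compiler can no longer assume the two pointers are unrelated, so it has to reload after each store.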

That kind of thing can be overcome in other ways. Rather than screwing
up the entire language, you can add ways of declaring that the pointers
are to different objects. Oh, but wait, wait, we have that, now!
restrict pointers in C99 for asserting that things are not aliased.
 
Jay Nabonne

Hi

int i = 5;
int j = i + i; // perfectly fine, j == 10

Not to beat the dead horse too much, but is:

int i = 5;
int j = i++ + i;

ok or undefined or simply unspecified (i.e. implementation-defined)?

Is the order of evaluation for arguments to '+' guaranteed to be
left-to-right?

- Jay
 
Kaz Kylheku

Ook said:
Hmm...learn something new every day.

.... such as fundamental knowledge for the job you are already doing? :)

What exactly is the limit, what exactly
can be done and what can't? Is it that I'm referencing i twice that makes
this undefined? IOW is

The rules are 1) that you can't modify the same object two or more
times between the same two sequence points, and that 2) you can't
modify an object that was accessed, unless the access was done for the
purpose of computing the new value to be stored.

That little ``unless'' clause is needed in rule 2, so that expressions
like:

i = i + 1;

j = (j + 2) * (j + 3);

are defined. The variable is both read and written in the same
expression, but that must be allowed in such cases. This is okay,
because the new value cannot be computed until the object is accessed,
and that new value cannot be stored into the object until it is
computed. This data flow dependency keeps everything ordered.

But consider

i + (i = i * i);
R1   W   R2  R3

Here the read accesses to i labelled as R2 and R3 happen before the
write W, so considered by themselves, they are okay. But R1 is
floating; it has no place in that dependency. R1 could happen before,
during or after W. Access R1 is not needed for computing the new value
being stored by the assignment, so rule 2 is violated.
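
The same three cases can be sketched in code. The function names are made up, and pinning R1 before the write in the last function is an assumption about what the author of such an expression meant.

```cpp
#include <cassert>

// i is read only to compute the value that is then stored back: allowed.
int increment(int i) {
    i = i + 1;
    return i;
}

// Both reads of j feed the new value of j: allowed.
int product_update(int j) {
    j = (j + 2) * (j + 3);
    return j;
}

// A defined replacement for `i + (i = i * i)`: the floating read R1 is
// pinned before the write W. Choosing "before" is an assumption.
int old_plus_square(int i) {
    int r1 = i;  // R1, now ordered before the write
    i = i * i;   // R2 and R3 feed W
    return r1 + i;
}

// Small self-check: 3 + 3 * 3.
void check_old_plus_square() {
    assert(old_plus_square(3) == 12);
}
```

Once R1 has its own statement, the data flow orders everything and both rules are satisfied.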
 
Mike Wahler

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have. I would expect that the compiler would
add i to i (0+0), store the 0 in j, and then increment i twice. On the
next line, we start with i = 2. I is incremented twice so that it now
equals 4. Then it is added to itself (i + i) and the result, 8, is stored
in k. I run it, and it does this. My coworker stated that this was
actually undefined and that you can't read a variable in an expression
where you write it. Is this really undefined and both of my compilers do
what I described above because that is what they wrote the compiler? Or is
this functionality the way c++ should work?

This issue has been beaten to death here. So much so that it's
in the FAQ:

http://www.parashift.com/c++-faq-lite/
See item 39.15

(It's a very good idea to read the *entire* FAQ while you're there.)

One should always consult the FAQ *first* before posting
questions here.

-Mike
 
