i++ + i++ questions

J

Jack Klein

Not to beat the dead horse too much, but is:

int i = 5;
int j = i++ + i;

ok or undefined or simply unspecified (i.e. implementation-defined)?

Actually it is undefined.
Is the order of evaluation for arguments to '+' guaranteed to be
left-to-right?

No, it is specifically not guaranteed to be in any particular order at
all.

Several people have paraphrased a part of the relevant wording from
the C++ standard. Here is an actual quote of the entire relevant
paragraph, 5/4:

"Except where noted, the order of evaluation of operands of individual
operators and subexpressions of individual expressions, and the order
in which side effects take place, is unspecified. Between the
previous and next sequence point a scalar object shall have its stored
value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be accessed only to determine the
value to be stored. The requirements of this paragraph shall be met
for each allowable ordering of the subexpressions of a full
expression; otherwise the behavior is undefined."

So your example:
int i = 5;
int j = i++ + i;

The initialization of 'j' produces undefined behavior because 'i' is
modified and you also access its prior value as the second operand of
the addition operator, and such access is not related to determining
its new value.
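
If the intent was "old value of i plus new value of i", the safe rewrite is
to separate the modification from the read with a sequence point (a minimal
sketch; the variable names are illustrative):

int i = 5;
int old = i;     // read the prior value
++i;             // modify i; the end of the statement is a sequence point
int j = old + i; // well defined: 5 + 6 == 11

Each statement boundary is a sequence point, so no scalar is both modified
and independently read in between.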
 
J

Jon Slaughter

Victor Bazarov said:
It's undefined.

I don't understand why it should be undefined. IMO this is something that is
very stupid to do and just causes problems, when one can find a better way to
make these well defined, such as following standard arithmetic precedence
rules and left-to-right ordering...


i.e.

k = ++i + ++j

is

1:

++i
k = i + ++j

2:

++i
++j
k = i + j


which makes complete sense


Of course

k = ++i + (i ? 0 : 3)

would be different from

k = (i ? 0 : 3) + ++i

but who cares? They are both well defined if one just remembers that + is no
longer commutative in cases such as these. By simply following left-to-right
ordering and basic arithmetic laws, one ends up with a result that may or may
not be intuitive, but is at least well defined rather than undefined, which
is ridiculous.
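
Spelled out, the two forms above would desugar roughly like this under the
proposed rule (illustrative temporaries; this is hypothetical semantics, not
what standard C++ does):

// k = ++i + (i ? 0 : 3);  -- left operand fully evaluated first:
++i;
int lhs = i;           // the incremented value
int rhs = (i ? 0 : 3); // the test sees the incremented i
k = lhs + rhs;

// k = (i ? 0 : 3) + ++i;  -- now the test sees the old i:
int lhs2 = (i ? 0 : 3);
++i;
k = lhs2 + i;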

I'm not saying the method above is the best, of course, just that there is
at least a method that would resolve the ambiguity (one may have to introduce
more precedence rules, and potentially special cases, which depending on how
many might not be worth it). It would be nice to know why the committee chose
to keep this undefined, because surely they thought about it more than I
have.

Jon
 
J

Jon Slaughter

John Harrison said:
So would I. I don't see how your comment relates to what I said.

john

Because you are implying that the code should be undefined for some reason
when it shouldn't. There is no reason for it to be undefined, since one can
easily make those types of statements well defined. Um, and whether things
should be well defined is of primary importance, not secondary. So unless
you were being sarcastic you are totally off base. This is like saying that
it's OK in math to have statements that are undefined because they are not
artistic enough, when in reality they can be perfectly defined (and make
complete sense too).
 
D

deane_gavin

Jon said:
Because you are implying that the code should be undefined for some reason
when it shouldn't. There is no reason for it to be undefined, since one can
easily make those types of statements well defined. Um, and whether things
should be well defined is of primary importance, not secondary. So unless
you were being sarcastic you are totally off base. This is like saying that
it's OK in math to have statements that are undefined because they are not
artistic enough, when in reality they can be perfectly defined (and make
complete sense too).

I think you've missed the point. If

int i = 0;
int j = i++ + i++; // Undefined

were well defined, you would have four things going on in one short
statement: an addition, two increments and an assignment.

Even if the order of those four were well defined, you as the
programmer, and anyone else reading your code, would have to remember
the order. If you or someone else remembered incorrectly - not unlikely
I would suggest - they would misinterpret the meaning of the code.
Writing code that is easy to misinterpret is bad.

The point has nothing to do with whether the code should be well
defined. The point is: why do so many people seem to care? Even if you
were allowed to write code like that, you shouldn't ever do it, so why
does the question come up?

Given that you shouldn't be writing actual code that looks like that in
the first place, the question of whether it happens to be well defined
or not is a secondary, academic one.

Gavin Deane
 
J

Jim Langston

Ook said:
I had a coworker present this problem:

int i = 0;
int j,k;
j = i++ + i++;
k = ++i + ++i;

And asked what j and k will have. I would expect that the compiler would
add i to i (0+0), store the 0 in j, and then increment i twice. On the
next line, we start with i = 2. i is incremented twice so that it now
equals 4. Then it is added to itself (i + i) and the result, 8, is stored
in k. I ran it, and it does this. My coworker stated that this was
actually undefined and that you can't read a variable in an expression
where you write it. Is this really undefined, and do both of my compilers
do what I described above simply because that is how they wrote the
compiler? Or is this functionality the way C++ should work?

Undefined. Quoting another post in this NG that quoted the standard:

5/4 of the Standard:

<QUOTE>
Except where noted, the order of evaluation of operands of individual
operators and subexpressions of individual expressions, and the order in
which side effects take place, is unspecified.53) Between the previous and
next sequence point a scalar object shall have its stored value modified at
most once by the evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be stored. The requirements
of this paragraph shall be met for each allowable ordering of the
subexpressions of a full expression; otherwise the behavior is undefined.
[Example:

i = v[i++]; // the behavior is unspecified
i = 7, i++, i++; // i becomes 9
i = ++i + 1; // the behavior is unspecified
i = i + 1; // the value of i is incremented

-end example]
</QUOTE>
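
In other words, any result at all is conforming. If you want to see this for
yourself, something like the following will do (a sketch; the printed values
genuinely differ between compilers and optimization levels, which is the
point):

#include <iostream>

int main()
{
    int i = 0;
    int j, k;
    j = i++ + i++; // undefined: i modified twice between sequence points
    k = ++i + ++i; // likewise undefined
    std::cout << j << ' ' << k << '\n'; // any output is conforming
}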
 
G

Greg Comeau

I don't understand why it should be undefined. IMO this is something that is
very stupid to do and just causes problems, when one can find a better way to
make these well defined, such as following standard arithmetic precedence
rules and left-to-right ordering...

i.e.

k = ++i + ++j

is

1:

++i
k = i + ++j

2:

++i
++j
k = i + j

which makes complete sense

Of course

k = ++i + (i ? 0 : 3)

would be different from

k = (i ? 0 : 3) + ++i

but who cares? They are both well defined if one just remembers that + is no
longer commutative in cases such as these. By simply following left-to-right
ordering and basic arithmetic laws, one ends up with a result that may or may
not be intuitive, but is at least well defined rather than undefined, which
is ridiculous.

I'm not saying the method above is the best, of course, just that there is
at least a method that would resolve the ambiguity (one may have to introduce
more precedence rules, and potentially special cases, which depending on how
many might not be worth it). It would be nice to know why the committee chose
to keep this undefined, because surely they thought about it more than I
have.

The idea is to allow C to run on a diversity of systems
and furthermore to be optimized when possible. So the
"loose" rules allow that, while the undefined behavior
allows your possibility as one of the undefined choices.
Is this annoying? It seems so to me. However, the other
side of the coin is that C would probably not be as
popular without it. Some may say this is a good thing.
 
G

Greg Comeau

Because you are implying that the code should be undefined for some reason
when it shouldn't. There is no reason for it to be undefined, since one can
easily make those types of statements well defined. Um, and whether things
should be well defined is of primary importance, not secondary. So unless
you were being sarcastic you are totally off base. This is like saying that
it's OK in math to have statements that are undefined because they are not
artistic enough, when in reality they can be perfectly defined (and make
complete sense too).

Ignoring the style question for a second, your counter-argument is
invalid, because the reality is that some of the statements shown are
undefined according to Standard C++. Whether they can be defined in,
say, a subsequent standard is another issue.
 
J

Jay Nabonne

So your example:


The initialization of 'j' produces undefined behavior because 'i' is
modified and you also access its prior value as the second operand of
the addition operator, and such access is not related to determining
its new value.

That makes sense now. Thanks!
 
J

Jon Slaughter

Greg Comeau said:
The idea is to allow C to run on a diversity of systems
and furthermore to be optimized when possible. So the
"loose" rules allow that, while the undefined behavior
allows your possibility as one of the undefined choices.
Is this annoying? It seems so to me. However, the other
side of the coin is that C would probably not be as
popular without it. Some may say this is a good thing.

I didn't know that optimization was part of the C++ standard? I think that
undefined behavior is completely unacceptable, and it should be allowed only
in specific cases (things that exist external to the language itself, such
as using hardware).
 
K

Kaz Kylheku

Greg said:
The idea is to allow C to run on a diversity of systems
and furthermore to be optimized when possible. So the
"loose" rules allow that, while the undefined behavior
allows your possibility as one of the undefined choices.
Is this annoying? It seems so to me. However, the other
side of the coin is that C would probably not be as
popular without it. Some may say this is a good thing.

A compiler can reorder the evaluation of an expression arbitrarily, as
long as any necessary side effects are produced, and the outcome is
correct.

Defining the order of evaluation does not hinder this freedom, except
in cases when there is insufficient compile-time info.

That is to say, whenever you write an expression that /clearly/ does
not violate the rules (modify something twice, or read something that
is modified other than to determine the new value), it doesn't make any
difference whether the evaluation order is unspecified or whether it's
strict left to right. By clearly, I mean that it's statically
analyzable.

There are cases whose defined-ness is conditional on run-time values.

(*x) = (*y)++;

The above is undefined if x and y point to the same object; otherwise
it is well-defined. It may be impractical or impossible to determine
this statically.
to optimize without analyzing this at all. Since the behavior is
undefined if x == y, the compiler doesn't have to care about that;
undefined behavior is the programmer's problem. It can assume that *x
and *y are distinct objects, and generate the target code in any of the
ways which are correct under that assumption.

Under strict left to right evaluation, the compiler cannot do that. It
must pessimistically assume that x and y could be aliased. The
semantics goes something like this: *x must be evaluated first to
produce a stable lvalue which is remembered in some temporary location.
Then, *y is incremented, and its old value goes to the location
remembered in the temporary.
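
In code, that pessimistic lowering would look roughly like this (a sketch;
the temporaries are illustrative):

int *target = x; // evaluate *x first: remember which lvalue to store to
int old = *y;    // evaluate (*y)++: fetch the current value...
*y = old + 1;    // ...and write the incremented value back through y
*target = old;   // finally store the old value through the remembered lvalue

Because x and y might alias, none of these memory operations can be
reordered or fused.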

So basically the design rationale for expression evaluation in C and
C++ is this: (1) the risky possibility that ambiguous evaluation may
lead to less reliable software is /worth/ whatever benefit is provided
by the freedom to make optimistic assumptions with regard to aliasing,
and moreover, (2) this is a better design than restricting the
evaluation order and instead providing a feature whereby the programmer
can declare that specific objects are distinct.

But of course, with regard to (2), we do now have that declaration
mechanism in C99 (albeit not in C++). Since there is a way to declare
that x and y point to distinct objects, there is no more need for
ambiguous evaluation orders. You can just write:

decl spec ... * restrict x, * restrict y;
// ...
(*x) = (*y)++;

and this will now be undefined if x and y are aliased, /even if/ we pin
the evaluation order to be strict left to right!!! With near pinpoint
precision, the programmer can get the undefined behavior in there,
undefined behavior which occurs exactly in the same case as when
evaluation order is left unspecified, without the whole language having
to have unspecified evaluation order everywhere!
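
For the record, a compilable spelling of the declaration sketched above
(C99 writes the qualifier as restrict; most C++ compilers accept the
__restrict extension, which is what this sketch assumes, and the function
name is made up for illustration):

void assign_and_bump(int * __restrict x, int * __restrict y)
{
    // The restrict qualifiers promise that x and y do not alias;
    // if they do, the behavior is undefined, exactly as described
    // above, even under a fixed left-to-right evaluation order.
    (*x) = (*y)++;
}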

What are the remaining optimization opportunities that are not
addressed by restrict pointers, but critically depend on loose
evaluation order?
 
R

Richard Herring

Jon Slaughter said:
[...]
The idea is to allow C to run on a diversity of systems
and furthermore to be optimized when possible. So the
"loose" rules allow that, while the undefined behavior
allows your possibility as one of the undefined choices.
Is this annoying? It seems so to me. However, the other
side of the coin is that C would probably not be as
popular without it. Some may say this is a good thing.

I didn't know that optimization was part of the C++ standard?

There's a difference between prescribing optimisation and not thwarting
it. What the Standard attempts to do is to specify what a well-formed
program is, and how such a program must behave, without restricting the
freedom of implementors to make it work efficiently on a variety of
architectures. Unnecessarily specifying things like order of evaluation
makes that task much harder.

I think that undefined behavior is completely unacceptable, and it should
be allowed only in specific cases (things that exist external to the
language itself, such as using hardware).

You are entitled to your opinion.
 
K

Kai-Uwe Bux

Richard said:
Jon Slaughter said:
[...]
The idea is to allow C to run on a diversity of systems
and furthermore to be optimized when possible. So the
"loose" rules allow that, while the undefined behavior
allows your possibility as one of the undefined choices.
Is this annoying? It seems so to me. However, the other
side of the coin is that C would probably not be as
popular without it. Some may say this is a good thing.

I didn't know that optimization was part of the C++ standard?

There's a difference between prescribing optimisation and not thwarting
it. What the Standard attempts to do is to specify what a well-formed
program is, and how such a program must behave, without restricting the
freedom of implementors to make it work efficiently on a variety of
architectures. Unnecessarily specifying things like order of evaluation
makes that task much harder.

Well, let's have a look at the wording of the standard:

[5/4]
Except where noted, the order of evaluation of operands of individual
operators and subexpressions of individual expressions, and the order in
which side effects take place, is unspecified.53) Between the previous and
next sequence point a scalar object shall have its stored value modified at
most once by the evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be stored. The requirements
of this paragraph shall be met for each allowable ordering of the
subexpressions of a full expression; otherwise the behavior is undefined.


Now, if the standard just dropped the last five words of that clause, it
would still specify what a well-defined program looks like. The compiler
would just be required to issue an error if the shall-clause is violated.
Since the set of allowable orderings of sub-expressions is finite, the
compiler could detect this. [Also, I venture the conjecture that, as part
of an optimization effort, the compiler would look at a sizable number of
them anyway.]
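
For instance, in

int i = 0;
int j = i++ + i++; // modifies the scalar 'i' twice between sequence points

the violation involves only a named scalar and is present in every allowable
ordering, so it is visible at compile time; the compiler could simply be
required to reject the program.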

Such a wording would leave all options for optimizing well formed programs.
Now, why do we need undefined behavior here?


Best

Kai-Uwe Bux
 
J

Julián Albo

Kai-Uwe Bux said:
Now, if the standard just dropped the last five words of that clause, it
would still specify what a well-defined program looks like. The compiler
would just be required to issue an error if the shall-clause is violated.
Since the set of allowable orderings of sub-expressions is finite, the
compiler could detect this. [Also, I venture the conjecture that, as part
of an optimization effort, the compiler would look at a sizable number of
them anyway.]

I think this is a quality of implementation issue. The compiler can emit a
diagnostic or fail even if the standard does not force it to.

Probably, if the standard were modified, many compilers would allow it
anyway as a non-standard extension, to support legacy code that relies on
the known behaviour of that particular compiler.
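
Indeed, some compilers already do this as a matter of QoI. For example, I
believe GCC diagnoses the simple cases when warnings are enabled (the exact
message wording varies by version):

int i = 0;
int j = i++ + i++; // g++ -Wall (which enables -Wsequence-point) warns here:
                   // "operation on 'i' may be undefined"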
 
R

Richard Herring

Kai-Uwe Bux said:
Richard said:
Jon Slaughter said:
[...]

The idea is to allow C to run on a diversity of systems
and furthermore to be optimized when possible. So the
"loose" rules allow that, while the undefined behavior
allows your possibility as one of the undefined choices.
Is this annoying? It seems so to me. However, the other
side of the coin is that C would probably not be as
popular without it. Some may say this is a good thing.

I didn't know that optimization was part of the C++ standard?

There's a difference between prescribing optimisation and not thwarting
it. What the Standard attempts to do is to specify what a well-formed
program is, and how such a program must behave, without restricting the
freedom of implementors to make it work efficiently on a variety of
architectures. Unnecessarily specifying things like order of evaluation
makes that task much harder.

Well, let's have a look at the wording of the standard:

[5/4]
Except where noted, the order of evaluation of operands of individual
operators and subexpressions of individual expressions, and the order in
which side effects take place, is unspecified.53) Between the previous and
next sequence point a scalar object shall have its stored value modified at
most once by the evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be stored. The requirements
of this paragraph shall be met for each allowable ordering of the
subexpressions of a full expression; otherwise the behavior is undefined.


Now, if the standard just dropped the last five words of that clause, it
would still specify what a well-defined program looks like. The compiler
would just be required to issue an error if the shall-clause is violated.
Since the set of allowable orderings of sub-expressions is finite, the
compiler could detect this. [Also, I venture the conjecture that, as part
of an optimization effort, the compiler would look at a sizable number of
them anyway.]

Such a wording would leave all options for optimizing well formed programs.
Now, why do we need undefined behavior here?

Maybe we don't, in this case, if the compiler has enough information.
But does your analysis work for _all_ such problems, and if not can you
still find a form of words that works?

My point was really addressing the implicit question "why does the
standard say this code is not well-formed?", rather than "what should
the compiler do about it?".
 
K

Kai-Uwe Bux

Julián Albo said:
Kai-Uwe Bux said:
Now, if the standard just dropped the last five words of that clause, it
would still specify what a well-defined program looks like. The compiler
would just be required to issue an error if the shall-clause is violated.
Since the set of allowable orderings of sub-expressions is finite, the
compiler could detect this. [Also, I venture the conjecture that, as part
of an optimization effort, the compiler would look at a sizable number of
them anyway.]

I think this is a quality of implementation issue. The compiler can emit a
diagnostic or fail even if the standard does not force it to.

Well, I concede that there is a need for undefined behavior. It follows from
the undecidability of <whatever famous problem> that a compiler cannot
always prove that a path of execution is not taken. Thus statements like

delete p;

must allow for undefined behavior in some cases, since they cannot be
statically analyzed. However, I fail to see why evaluation of an integral
expression should fall into that category. Why should this be deemed QoI?
Sometimes I get the impression that the standard has way more undefined
behavior than necessary. This makes it unnecessarily hard to ensure the
correctness of a program.
Probably, if the standard were modified, many compilers would allow it
anyway as a non-standard extension, to support legacy code that relies on
the known behaviour of that particular compiler.

Ok, let's have a command line switch for that.

BTW: that legacy code would probably benefit from a little rewrite. What
happens when those folks upgrade their compiler and the UB changes
silently?


Best

Kai-Uwe Bux
 
K

Kai-Uwe Bux

Richard said:
In message <[email protected]>, Kai-Uwe Bux
Well, let's have a look at the wording of the standard:

[5/4]
Except where noted, the order of evaluation of operands of individual
operators and subexpressions of individual expressions, and the order in
which side effects take place, is unspecified.53) Between the previous and
next sequence point a scalar object shall have its stored value modified at
most once by the evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be stored. The requirements
of this paragraph shall be met for each allowable ordering of the
subexpressions of a full expression; otherwise the behavior is undefined.


Now, if the standard just dropped the last five words of that clause, it
would still specify what a well-defined program looks like. The compiler
would just be required to issue an error if the shall-clause is violated.
Since the set of allowable orderings of sub-expressions is finite, the
compiler could detect this. [Also, I venture the conjecture that, as part
of an optimization effort, the compiler would look at a sizable number of
them anyway.]

Such a wording would leave all options for optimizing well formed
programs. Now, why do we need undefined behavior here?

Maybe we don't, in this case, if the compiler has enough information.
But does your analysis work for _all_ such problems, and if not can you
still find a form of words that works?

What do you mean by "all such problems"? I did not suggest getting rid of UB
in general, just in clause 5/4. And I think the analysis does apply to all
evaluations of scalar expressions between sequence points: since function
calls induce sequence points, these expressions never involve recursion and
thus have finite depth, known at compile time.
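
For example (a sketch; 'f' is a made-up function with a side effect):

int i = 0;
int f() { return ++i; } // entering and leaving f are sequence points

int j = f() + f(); // the order of the two calls is unspecified, but the
                   // behavior is well defined: j == 3 either way

The modifications of 'i' are separated by the sequence points around each
call, so only the finite, recursion-free expression between sequence points
remains to be analyzed.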

My point was really addressing the implicit question "why does the
standard say this code is not well-formed?", rather than "what should
the compiler do about it?".

I agree that the code should not be well-formed.


Best

Kai-Uwe Bux
 
R

Richard Herring

Kai-Uwe Bux said:
Richard said:
In message <[email protected]>, Kai-Uwe Bux
Well, let's have a look at the wording of the standard:

[5/4]
Except where noted, the order of evaluation of operands of individual
operators and subexpressions of individual expressions, and the order in
which side effects take place, is unspecified.53) Between the previous and
next sequence point a scalar object shall have its stored value modified at
most once by the evaluation of an expression. Furthermore, the prior value
shall be accessed only to determine the value to be stored. The requirements
of this paragraph shall be met for each allowable ordering of the
subexpressions of a full expression; otherwise the behavior is undefined.


Now, if the standard just dropped the last five words of that clause, it
would still specify what a well-defined program looks like. The compiler
would just be required to issue an error if the shall-clause is violated.
Since the set of allowable orderings of sub-expressions is finite, the
compiler could detect this. [Also, I venture the conjecture that, as part
of an optimization effort, the compiler would look at a sizable number of
them anyway.]

Such a wording would leave all options for optimizing well formed
programs. Now, why do we need undefined behavior here?

Maybe we don't, in this case, if the compiler has enough information.
But does your analysis work for _all_ such problems, and if not can you
still find a form of words that works?

What do you mean by "all such problems"?

Just thinking out loud.
I did not suggest getting rid of UB in general, just in clause 5/4.

In that case, the answer to my question is "yes" ;-)
 
