Benefit of not defining the order of execution


Keith Thompson

Stephen Sprunk said:
somenath said:
In the case of the relational and logical operators, the order of execution is
not defined, except for && and ||, which are left to right.
What could be the reason for not defining the order of execution?
What benefit does it provide? Is it the speed?
[...]

Also, remember that ANSI's job was primarily to document existing
practice, not define a perfect language. Where existing
implementations didn't all agree on a particular behavior, they
usually left the meaning undefined or unspecified rather than break
some of them.

On the other hand, requiring a particular order of evaluation in a new
C standard would break existing implementations, but it wouldn't break
existing code (except for code that can already be broken by
compiling it with a different compiler). I don't advocate such a
change to the language, but it would have a lower hurdle to jump than
a change that would actually break code.
 

Kenny McCormack

Stephen Sprunk said:
Requiring a particular order would provide no gain at best and a loss
of performance in all other cases -- all cost, no benefit.

This is clearly untrue, in the abstract. In the sense of "If somebody
were developing a C-like language today, and left this undefined, they'd
be shot."
If the order actually matters, you can modify the code slightly to
insert sequence points as desired to force a particular ordering. And,
of course, there is the "as if" rule which mucks everything up anyways.

I think what we're really saying is that there should be a compiler
option - leave it undefined - the compiler can do that which is most
efficient - or do it deterministically, albeit at the cost of being
possibly not quite as fast.

I believe that in today's day and age, where computers are cheap, but
even bad programmers are expensive (*), most shops will opt for the latter.

(*) And this being the age of .NET, where this philosophy is God.
Also, remember that ANSI's job was primarily to document existing
practice, not define a perfect language. Where existing
implementations didn't all agree on a particular behavior, they usually
left the meaning undefined or unspecified rather than break some of
them.

On this, we agree.
 

Kaz Kylheku

As others have said, it does indeed often allow compilers to produce
faster code.

Even if that's the case, you pay for that with a risk. That risk is acceptable
to some, not to others. Why should the viewpoints of those others be ignored?

It's possible to support both behaviors: the unsafe behavior can be confined to
well-delineated pockets of a program, which are being optimized for maximum
performance.
One reason for this is that many computers (e.g. x86)
have only a small number of registers, and different orders of
evaluation may need different numbers of registers.

** Abstract order is not actual order! **

The abstract order only must be adhered to /if/ there are side effects which
reveal the order!

The rest of your article seems to depend on the idea that the abstract order of
evaluation must be followed literally.
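(A compilable illustration of this point -- not from the original post; the
function names are made up. With no side effects, no test can reveal which
operand was evaluated first, so the compiler is free under the "as if" rule;
only side effects pin the order down.)

#include <stdio.h>

int quiet(int a, int b, int c, int d)
{
    /* No side effects: whichever sum the generated code computes first
       is unobservable, so any internal order is permitted. */
    return (a + b) * (c + d);
}

int noisy(void)
{
    /* The two reads consume input, so their relative order is visible.
       This is the only kind of case a defined order would constrain;
       as C stands, the order here is unspecified. */
    return getchar() - getchar();
}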
 

Kaz Kylheku

Not specifying the order allows implementations to do whatever is most
efficient in their circumstances;

Why is that important?
ABC might be the fastest on one
system, while CBA is the fastest on another, and doing all three in
parallel might be fastest on a third.
Requiring a particular order
would provide no gain at best
and a loss of performance in all other
cases -- all cost, no benefit.

All cost, and no /performance/ benefit. You are confusing "benefit" and
"performance benefit", as if the only possible benefits are performance
benefits.

Do we justify the very existence of higher-level languages by citing performance
benefits?
If the order actually matters, you can
modify the code slightly

Here is 250,000 lines of code. Go find places where the order matters and fix
the code ``slightly'' so that it actually expresses what you think its
author wanted it to do!

All done? Great, let's recompile and ship the new version to the customers!

See, not every piece of C code is a small utility program that you yourself
wrote for your own use.

But it shows where you're coming from, doesn't it!

Don't you think that language design should be concerned with its impact on
large-scale software development?

Fact is that in 35 years of C, compilers have failed to develop good support
for diagnosing situations where a calculation produces a different result based
on order of evaluation.

Here is a quote:

Having the order of evaluation undefined is claimed to yield better
performing code. Compilers could warn about such examples, which are
typically subtle bugs (or potential subtle bugs). I'm disappointed that after
decades, most compilers still don't warn, leaving that job to specialized,
separate, and underused tools.

Who wrote this? I will leave that as a Google exercise for the reader. Note
how the author carefully wrote ``is claimed'' in the passive voice, to avoid
actually asserting the idea as something he knows to be a fact.
to insert sequence points as desired to force a particular ordering. And,
of course, there is the "as if" rule which mucks everything up anyways.

The way you express your understanding of the "as if" rule says a lot.
Also, remember that ANSI's job was primarily to document existing
practice, not define a perfect language.

That reasoning applies to the first revision of the standard.

Are variable-length arrays and _Bool existing practice?

The second revision of the standard invents a lot of features, and fails to go
back and re-visit basic issues like this.
 

Tim Rentsch

Kaz Kylheku said:
The reason is myopia: making it easy to write optimizing compilers,
at the cost of introducing risk into the code-base written in the programming
language.

Undefined orders of execution have no place in an imperative programming
language; i.e. one in which the majority of programs that are considered
idiomatic do their job by means of side effects in expressions.

If you were designing a C-like language from scratch today, and left
evaluation order undefined, you should be shot.


The belief that it provides speed is not a fact.

Even if evaluation order is well-defined, compilers can still rearrange the
evaluation, provided that the result of the computation is correct.

Only in cases where there are side effects does the order have to be
sufficiently constrained so that the effects are done in the proper order.

Note that C does define the order among full expressions, with the concept of
sequencing.

Any compiler that literally obeys sequence points cannot be called optimizing
by modern standards. A quality compiler must reorder computation across
sequence points! So if you write

x = a + b;
y = a + z;

even though there is a sequence point between these expressions,
they can be reordered, because they are independent. Even
these could be significantly rearranged:

a = i++;
b = i++;

The generated code could do something like

a = i;
b = i + 1;
i += 2;

Sequencing doesn't mean that the computation must literally take place as
written in every detail; it's an abstract concept tied to the ``as if'' rule.

Even if it can be shown that unspecified order of evaluation provides an
undeniable performance benefit, there is no reason why that order has to be
unspecified in every part of the program, in every expression in every function
in every source file.

Apparently several people are of the opinion that having language
semantics be more deterministic is better than being not as
deterministic, because... well, just because. To that I say,
just 'tisn't so.

Even if a clever optimizing compiler could recover (relative to C as
it is now) all the possible parallelism of a C-like language with a
defined left-to-right order of evaluation (and it can't, but never
mind that now), it still isn't automatically better to impose a
left-to-right evaluation order, or any fixed evaluation order. In
fact, doing that to C expressions would make C a worse language,
not a better one. Here's why.

If I see a piece of C code like, for example,

a = f( g(x), h(y) );

then I don't have to look at the definitions for g() and h() to know
they don't interact (by which I mean, interact in any nontrivial
way). The reason? If they did interact, the code would have been
written differently, to force a particular order of evaluation.
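(To make "interact" concrete: a hypothetical g() and h() that share a
counter. Names and numbers are illustrative, not from the original post.)

#include <stdio.h>

static int counter;                                /* shared state: the interaction */
static int g(int x) { return x + counter++; }
static int h(int y) { return y * 10 + counter++; }
static int f(int a, int b) { return a * 100 + b; }

int main(void)
{
    /* Prints 121 if g() is evaluated first, 220 if h() is evaluated
       first; C leaves the choice to the implementation. */
    printf("%d\n", f(g(1), h(2)));
    return 0;
}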

Of course, technically I don't know that g() and h() don't interact;
I know only that the person who wrote the code didn't think it was
important to force a particular order of evaluation. But knowing
the intent (or in this case, the lack of intent) of the code's
author is just as valuable here, or perhaps more valuable. I can
always go look at the definitions for g() and h() to see if they
interact, but I can't go back and look at what the code's author
was thinking.

Now consider the case where the language specifies a left-to-right
evaluation order. Let's look again at the example line. Now I have
to wonder if g() and h() interact; to find out I have to go read
their definitions. If they don't interact, I can breathe a big sigh
of relief and go back to reading the function where they were
called. But suppose they do interact; in that case I have to
wonder if the interaction was known and deliberate, or whether it
might have been unintentional. Short of going back and asking the
original author, there really isn't any way of knowing. Discovering
what the program does and what the program was intended to do has
become a lot more work.

Conversely, if I discover that g() and h() interact in C as it
is now, it's simply an error. The original programmer either
forgot something, or misunderstood something, or was confused
about something, or whatever; whichever of these is
the case, the program is simply in error, and I don't have to
wonder whether the interaction was intended or not -- it wasn't.[1]

Requiring a left-to-right evaluation order has made the job of code
reading harder rather than easier. And it encourages, even if only
indirectly, the writing of non-obvious code that depends on that
evaluation order. Both of those trends are trending in the wrong
direction.


[1] Of course, it's possible to construct situations where g() and
h() interact in a non-trivial way, yet the overall program behavior
is correct no matter which order is chosen. But the vast majority
of cases are not this way; moreover, any programmer who writes such
code without putting in a comment on that aspect deserves no better
treatment than a programmer who made an outright error, or at the
very least should be prepared for some sharp criticism during code
review.

If unspecified evaluation order is indeed an optimization tool, then it should
be recognized that it's a dangerous optimization tool, and a way should be
provided for the programmer to choose the regions of the program where the
order is unspecified. Suppose you had a reorder operator:

reorder /expression/

Everything under the reorder operator is subject to unspecified evaluation
order, other than sequencing operators. The return value and type of the
reorder operator are those of the expression.

There are two problems with this approach, one mechanical, one
behavioral.

The mechanical problem is that most expressions are reorderable, so
'reorder' would end up being used in almost every instance. That's
the wrong side of the 80/20 rule (and probably closer to 90/10 for
the situation of which expressions can be reordered).

The behavioral problem is that, because it's more work, most
programmers won't ever put in the 'reorder' directives, or will put
them in only very sparingly. But that's the opposite of what's good
in terms of reading the code -- the order /should/ be unspecified
unless there is a deliberate effort to do something different, so
that cases where the order matters stand out. If it's more work to
put in the option to reorder, in many cases that won't be done
simply out of laziness, which leaves us wondering in the cases where
the 'reorder' option wasn't specified whether that was deliberate or
not.

Furthermore, there already is a perfectly good way to force a
particular evaluation order if/when that is desired. In fact, two
ways -- assignment statements, and the comma operator. And, not
only can these be used to force an evaluation order, they offer
the choice of which evaluation order makes the most sense in
any individual circumstance.
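(A sketch of both techniques, with placeholder declarations standing in for
the f(), g(), h() of the earlier example:)

int f(int, int), g(int), h(int);      /* placeholders for the example above */

int force_with_statements(int x, int y)
{
    int gx = g(x);                    /* sequence point at each ';' ...     */
    int hy = h(y);                    /* ... so g() is done before h()      */
    return f(gx, hy);
}

int force_with_comma(int x, int y)
{
    int gx;
    /* the sequence point of the comma operator likewise orders g() before h() */
    return (gx = g(x), f(gx, h(y)));
}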

Certainly the C language has some areas of semantic weakness. But
unspecified order of expression evaluation isn't one of them.
 

Kaz Kylheku

Apparently several people are of the opinion that having language
semantics be more deterministic is better than being not as
deterministic, because... well, just because. To that I say,
just 'tisn't so.

When you say deterministic, what is the implied opposite? Do you have some
particular nondeterminism in mind, or all-out chaos?

What does it mean to be ``more'' deterministic? Either something is
deterministic or it isn't, right? Or not?

Can semantics be non-deterministic? Semantics is meaning. If the behavior of a
construct is undefined, does it still have semantics, or are semantics lacking?

Does a = i++ have semantics that is deterministic? Or does it have no
semantics at all? And what is the practical difference between the two?
Even if a clever optimizing compiler could recover (relative to C as
it is now) all the possible parallelism of a C-like language with a
defined left-to-right order of evaluation (and it can't, but never
mind that now), it still isn't automatically better to impose a
left-to-right evaluation order, or any fixed evaluation order.

It certainly is. Order is better than chaos. This is a human value.
In
fact, doing that to C expressions would make C a worse language,
not a better one. Here's why.

If I see a piece of C code like, for example,

a = f( g(x), h(y) );

then I don't have to look at the definitions for g() and h() to know
they don't interact

The logic of your argument, if it has such a property, is truly bizarre.

You do /not/ know that g and h do not interact!
(by which I mean, interact in any nontrivial
way). The reason? If they did interact, the code would have been
written differently, to force a particular order of evaluation.

Hahahaha, I see. You are assuming that nobody makes mistakes.

<div id="asshole-mode">

See, I actually work in this industry on actual software, rather than
just textbook examples for a newsgroup or class.

So let me condescend to you for a minute to give you a better informed point of
view.

:)

</div>


There is a program made up of millions of lines of code. Somewhere, it contains
the code f(g(x), h(y)). You don't know where, or whether. Now it so happens
that this g and h do interact, okay? That's because someone made a programming
error.

Now imagine we are in a parallel universe in which C has well-defined order of
evaluation. Well, there is no error, because in this universe, everyone knows
that evaluation of function arguments goes left to right. Whoever wrote the
code hopefully tested it. In testing, the code would have produced some
expected behaviors. Those behaviors would have been, in part, due to an
evaluation order, and that evaluation order is the one that is defined in the
language.

Now in the real universe that we live in, whoever wrote that mistake also
tested the code, and it produced the expected behavior, at the time.
The compiler used did in fact appear to ``define'' the behavior,
and that code passed tests.

Problem is, that code is a time bomb. If you change something about the way the
compiler is run, or port the program to a different compiler, the behavior may
change.

You are now charged with the task of making such a compiler upgrade, or
porting job, for this million line codebase.

Code like f(g(x), h(y)) is hiding from you. Happy hunting!
Of course, technically I don't know that g() and h() don't interact;

So you do know this non-technically. Wonderful!
I know only that the person who wrote the code didn't think it was
important to force a particular order of evaluation.

You know no such thing. The person who wrote the code may have been an
idiot who thought that it goes left to right. The person may have been
a C expert who made a mistake, but by fluke the expected behavior was
produced causing the mistake to recede into the depths of the history
of that code base, turning into a ticking bomb.

That's the problem with unspecified orders: getting the expected behavior,
but without that behavior being codified by the language!

See, it's a nice property of the language that if you get the behavior you
expect, the order of evaluation which produced that behavior is actually
required, and so will not change.
But knowing
the intent (or in this case, the lack of intent) of the code's
author is just as valuable here, or perhaps more valuable.

How do you know the intent of the author? What author? Remember,
we are in a million line program, and we don't even know where
these erroneous constructs are located, never mind who their author
was or what was the intent.

When a bug is found in code, was it the author's intent to write that bug?

Often you don't know what the intent is because the bug hides the intent.

You have to trace back to the functional requirement specification
which is relevant for that piece of code, and re-implement it.
Now consider the case where the language specifies a left-to-right
evaluation order. Let's look again at the example line. Now I have
to wonder if g() and h() interact; to find out I have to go read
their definitions. If they don't interact, I can breathe a big sigh
of relief and go back to reading the function where they were
called. But suppose they do interact; in that case I have to
wonder if the interaction was known and deliberate, or whether it
might have been unintentional.

Suppose they do interact! Well, you go back to the test cases and figure out
whether they actually tickle the interaction. You might be lucky and find that
it's been exposed to coverage.

In any case, the interaction is at least portable.

You ship a version of this program for five different platforms, using
different compilers and cross-compilers.

The interaction is the same on all of them.
Short of going back and asking the
original author, there really isn't any way of knowing.

There are ways, if you work for an organization that documents at least
requirements, if nothing more.

There may be a domain expert who has coverage of that area, even if the
original programmer doesn't work there.

If there is no documentation, and no expert with a firm opinion, then what you
do is enumerate the choices for that behavior, and get input from product
management people, other developers and maybe even customers of the program.

Unclear intent can be replaced with new intent.

But all of that assumes you found the problem.
Discovering
what the program does and what the program was intended to do has
become a lot more work.

If the evaluation order is unspecified, you additionally have to investigate
what it does on all the different target compilations.
Conversely, if I discover that g() and h() interact in C as it
is now, it's simply an error.

It's not ``simply'' an error! Because the construct occurs in working code
that has perhaps even shipped numerous revisions to real users.

It may be that although g and h interact, they are called in a particular order
which produced an expected behavior. So in the binary compiled program, there
is no error.

Suppose you're only looking at this g and h because the program was ported to a
different compiler, and this ported program shipped, and a problem was reported
from the field by users. Maybe you are finally looking at this g and h as a
result of days of effort trying to isolate the cause of the reported problem.
Sure, /now/ it's obvious that there is an evaluation order dependency!

I'd rather not get that bug report in the first place; I'd rather that the
ported program called g and h in the same darn order.
The original programmer either
forgot something, or misunderstood something, or was confused
about something, or whatever; whichever of these is
the case, the program is simply in error, and I don't have to
wonder whether the interaction was intended or not -- it wasn't.[1]

What if it was intended?
Requiring a left-to-right evaluation order has made the job of code
reading harder rather than easier. And it encourages, even if only
indirectly, the writing of non-obvious code that depends on that
evaluation order. Both of those trends are trending in the wrong
direction.


[1] Of course, it's possible to construct situations where g() and
h() interact in a non-trivial way, yet the overall program behavior
is correct no matter which order is chosen. But the vast majority
of cases are not this way; moreover, any programmer who writes such
code without putting in a comment on that aspect deserves no better
treatment than a programmer who made an outright error, or at the
very least should be prepared for some sharp criticism during code
review.

If unspecified evaluation order is indeed an optimization tool, then it should
be recognized that it's a dangerous optimization tool, and a way should be
provided for the programmer to choose the regions of the program where the
order is unspecified. Suppose you had a reorder operator:

reorder /expression/

Everything under the reorder operator is subject to unspecified evaluation
order, other than sequencing operators. The return value and type of the
reorder operator are those of the expression.

There are two problems with this approach, one mechanical, one
behavioral.

The mechanical problem is that most expressions are reorderable, so
'reorder' would end up being used in almost every instance.

This is precisely why it would NOT be used. Expressions are reorderable by
themselves (``as if'' rule: actual semantics is not abstract semantics!)

This reorder would have to be used with care.

Moreover, its presence would explicitly signal that something funny is
going on in that section of code.

You can grep code for ``reorder''.
The behavioral problem is that, because it's more work, most
programmers won't ever put in the 'reorder' directives, or will put

Good! They shouldn't! The less of these they put in, the better.
them in only very sparingly. But that's the opposite of what's good
in terms of reading the code -- the order /should/ be unspecified
unless there is a deliberate effort to do something different, so
that cases where the order matters stand out.

Cases where the order matters are not special. This is an imperative
language. Order of operations is king.
If it's more work to
put in the option to reorder, in many cases that won't be done
simply out of laziness, which leaves us wondering in the cases where
the 'reorder' option wasn't specified whether that was deliberate or
not.
What???


Furthermore, there already is a perfectly good way to force a
particular evaluation order if/when that is desired. In fact, two
ways -- assignment statements, and the comma operator.

So why don't we banish these from the language?

If you had strict order of evaluation in C, right from 1974 onward,
would you today be arguing for its elimination?

So why should it be the case that

puts("foo");
puts("bar");

produce output in a particular order. Let's get rid of that.

And, not
only can these be used to force an evaluation order, they offer
the choice of which evaluation order makes the most sense in
any individual circumstance.

Certainly the C language has some areas of semantic weakness. But
unspecified order of expression evaluation isn't one of them.

That leaves me wondering what you consider an actual semantic weakness.

Is there a more powerful way to duck out of a requirement than
to leave it undefined?

Doubly undefined? Undefined squared? N to the power of undefined?
 

Richard Tobin

Kaz Kylheku said:
The rest of your article seems to depend on the idea that the
abstract order of evaluation must be followed literally.

You must have completely misunderstood my article. It was showing why
it may be advantageous to not use the left-to-right order, and had
nothing to do with what aspect of the language definition allows this.

-- Richard
 

Tim Rentsch

Kaz Kylheku said:
[Kaz Kylheku's entire posting, reproduced in full above, is quoted here -- snipped.]


Based on the tone of this response, it seems clear the responder
is either unwilling or unable to process the comments he's posting
a response to, at least on this topic.

Generally I find it isn't useful to try to have a discussion with
someone who is either not willing or not able to listen. In this
particular instance, besides the discussion not being useful for
itself, I believe there would be no benefit to the group either.

If anyone else is interested in responding/replying to my comments,
either to offer a different point of view or to ask questions,
please feel free to get in touch with me by email if that offers
a more suitable venue to continue the discussion.

My apologies for including the entire earlier posting before these
short comments; I thought it important that the context that
prompted them be readily available.
 

Richard Bos

Kaz Kylheku said:
Here is 250,000 lines of code. Go find places where the order matters and fix
the code ``slightly'' so that it actually expresses what you think its
author wanted it to do!

Look here, boy, if you wrote 250,000 lines of code without a fucking
clue about the language you wrote them in, that's _your_ problem,
Buster, not mine.
All done? Great, let's recompile and ship the new version to the customers!

_Your_ customers. Mine are used to getting more solidly written code
than that. Yours, apparently, won't give a shit.
Here is a quote:

Having the order of evaluation undefined is claimed to yield better
performing code. Compilers could warn about such examples, which are
typically subtle bugs (or potential subtle bugs). I'm disappointed that after
decades, most compilers still don't warn, leaving that job to specialized,
separate, and underused tools.

Who wrote this?

It doesn't matter, because its meaning is muddled and its concept is
wrong from the start. "Such examples"? _Which_ examples, exactly, ones
which depend on the order of evaluation or ones which (correctly!) do
not?

Richard
 

Richard Bos

Kaz Kylheku said:
Even if that's the case, you pay for that with a risk. That risk is acceptable
to some, not to others. Why should the viewpoints of those others be ignored?

Because C is C, and not Ada.
It's possible to support both behaviors: the unsafe behavior can be confined to
well-delineated pockets of a program, which are being optimized for maximum
performance.

In C, _all_ parts of a program should be optimised for maximum
performance. That's what C is for. It's not what Ada and Pascal are for.

Richard
 

Kaz Kylheku

Because C is C, and not Ada.

Good thing this ``reasoning'' was not applied when function prototyping was
added to the language, or when implicit int was thrown out! Or when
member selection became scoped to the struct/union type being accessed.

Defining evaluation orders is not a move toward greater programming discipline,
under which more kinds of programs are rejected as incorrect. In fact, more
kinds of programs become correct and portable.

You gain freedom in the use of side effects in expressions, and at
the same time freedom from worry that a program will behave differently
when ported, on account of evaluation order issues hiding in that program.
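(A small illustration of that freedom -- mine, not Kaz's. Both statements
below hinge entirely on evaluation order: as C stands the first is
unspecified and the second is undefined behaviour, but under a defined
left-to-right order each would have exactly one portable meaning.)

#include <stdio.h>

int main(void)
{
    int i = 0, a[2] = { 0, 0 };

    int diff = getchar() - getchar();  /* which character is read first? unspecified */
    a[i] = i++;                        /* undefined behaviour in C as it stands      */

    /* Under a hypothetical left-to-right rule: the first getchar() reads the
       first character, a[0] is assigned 0, and i ends up as 1. */
    printf("%d %d %d\n", diff, a[0], i);
    return 0;
}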
In C, _all_ parts of a program should be optimised for maximum
performance. That's what C is for.

Serious language maintainers cannot take the point of view that C is a
language with exactly one purpose. Sometimes people write large projects in
C even though maximum performance is not required everywhere. They may
choose C due to the availability of compilers, and the relatively
lean run-time support requirements.

Maximum performance is predictably obtained only from hand-tuned machine code.

And note that, speaking of which, in many machine languages, the order is
well-defined. In general, if you write instruction X before Y, the behavior is
as if X was executed before Y.

Different implementations of the same instruction set bend over backwards to
support the abstract order, so that existing binaries run correctly.

So it's ironic that this goes out the window when you use C to target these
same machine languages.

For instance AMD can make a chip that works enough like the Intel chip to boot
an entire operating system and run applications without anything having to be
re-assembled or re-compiled.

But two C compilers for the Intel instruction set might not agree about the
behavior of C code, simply because that code used perfectly valid syntax,
designating the application of otherwise well-defined operations on correct
operands, with the only problem being ambiguity of order.

Your ``high-level assembler'' is less deterministic than some of the
target languages it is used for.
 

Keith Thompson

Because C is C, and not Ada.

Ada doesn't define order of evaluation in many of the same cases where
C doesn't, and for the same reason.
In C, _all_ parts of a program should be optimised for maximum
performance. That's what C is for. It's not what Ada and Pascal are for.

It certainly is one of the design criteria for Ada. I don't have
specific examples, but when I took a course in code generation and
optimization, I suddenly understood why a lot of Ada's rules were
written that way.

On the other hand, Ada has fewer forms of expressions with side
effects (for example, assignment is a statement, not an expression),
so there are fewer ways to shoot yourself in the foot -- there's no
direct equivalent of "i = i++;". The language design is intended to
allow both efficiency and safety. (Whether it's achieved these goals
is certainly not a question for this newsgroup.)
 

user923005

Even with a strict order of evaluation, the Rationale's "as if" rule would
still give the compiler such provisions for optimization.

In other words, given

    y = a + b + c;

with a strict order of evaluation, the compiler may still be allowed to
reorder evaluation and rewrite expressions. The standard *does* provide
*some* helpful guarantees with the order of evaluation -- namely, that if
evaluating the above as

   y = a + (b + c);

would produce a different result (such as with floating point arithmetic
in certain cases), then such "regrouping" is prohibited. This wasn't the
case with K&R C, where such "regrouping" was allowed. But ignoring those
pathological cases, we can see that a strict order of evaluation wouldn't
inhibit the compiler from evaluating c, then a, then b, then (b + c), then
a + (b + c) if it can get away with doing so.

I submit that this is impossible to achieve (IOW, it cannot be known
if a+(b+c)==(a+b)+c).
Allow me to expand:
Suppose that a,b,c are doubles and a is 2.0 and b and c are
DBL_EPSILON.
Then (a+b)+ c is 2.0, whereas a+(b+c) is larger than 2 by
DBL_EPSILON*2.
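(A short program reproducing this, assuming IEEE 754 doubles and
round-to-nearest:)

#include <stdio.h>
#include <float.h>

int main(void)
{
    double a = 2.0, b = DBL_EPSILON, c = DBL_EPSILON;

    printf("(a + b) + c = %.17g\n", (a + b) + c);  /* 2: each addition rounds away       */
    printf("a + (b + c) = %.17g\n", a + (b + c));  /* 2.0000000000000004: b + c survives */
    return 0;
}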
A strict order of evaluation would help only with expressions such
as

    f() + g()

*if* there are clashing side-effects. Without clashing side-effects, there
would still be the "as if" rule, and so the compiler is at liberty to
evaluate g() before f().

On a related note, the "as if" rule allows the compiler to jump across
sequence points as well -- many forms of optimization depend on this
ability. My point is that imposing a strict order of evaluation and/or
throwing in more sequence points won't necessarily prohibit optimization.
That leads me to believe the decision to be loose on evaluation order may
have had more to do with, as Kenny says, codifying existing practice than
anything else.

The Rationale introduces what I believe to be a more helpful term for this
kind of discussion than "sequence points": "agreement points". The
Rationale describes agreement points as those points where the state of an
object in the actual machine agrees with the state of the object in the
abstract machine. Not all sequence points need be agreement points. It's
this fact that allows aggressive optimization.

C is not deterministic. Neither are Fortran, C++, Java or any other
language that I am familiar with.
It is a terrible mistake to imagine that they are deterministic or
even that you will get the same result from a computation if you
perform it twice with the same inputs.

For instance, some Intel chips have an 80 bit float register. Some
compilers will use this register if you ask them to. But if an
interrupt occurs, the bottom precision bits will get scrubbed off and
you will be left with 64 correct bits instead of 80.

Some compilers may have a mode to maximize reproducibility. For
instance, one compiler that I use has these options:
/fp:<except[-]|fast|precise|strict>  choose floating-point model:
    except[-] - consider floating-point exceptions when generating code
    fast      - "fast" floating-point model; results are less predictable
    precise   - "precise" floating-point model; results are predictable
    strict    - "strict" floating-point model (implies /fp:except)

Here is a funny thing:
When you call a math function from your math library, it is *very*
unlikely to be correctly rounded unless you are using:
"CRlibm: a Library of Elementary Functions with Correct Rounding"

I do think that correctness is a very laudable goal. But it is also
important to consider that all floating point computations have a ball
of fuzz around them, no matter how careful we are.
It is usually a good idea to shrink the size of the ball of fuzz as
much as possible. But it may be that it costs so much to shrink the
fuzzy ball that it would be much better to leave it alone.
For instance, we could perform all computations using MPFR with 1000
digits of precision. But to do a large matrix multiplication that way
may be so expensive (or consume so much memory) that it would be
infeasible. Or it could also be that our inputs into the system are
so imprecise that the 16 digits or so of our floating point pale in
comparison to the 5 digits of precision given in the inputs and
carrying out computations with extended precision would not buy us
anything.

Ideally, we want calculations that are fast, correct, and repeatable.
Sometimes, we have to compromise a bit to be able to perform the
operations at all (in fact all floating point computations are a
compromise since we cannot carry them out in infinite precision).

IMO-YMMV.
 

user923005

On Feb 16, 2:05 pm, (e-mail address removed) (Richard Harter) wrote:
[snip]
In short, undefined order of evaluation in C is undesirable and
is not really necessary for compiling efficient code.  However C
is what it is and is not likely to change, so the issue is
academic.  It and the associated little bits of klutziness are
just something C programmers have to live with.

double a,b,c,d,e;
const double my_pi=3.1415926535897932384626433832795;
double sum = 0;
size_t i,j,k;

for (i = 0; i < 100; i++)
    for (j = 0; j < 100; j++)
        for (k = 0; k < 100; k++)
        {
            a = i;
            b = j;
            c = k;
            e = my_pi * a * sqrt(2.0) * b;
            d = e + c;
            sum += d / (a+1.0);
        }

If I am not allowed to reorder things, then my compiler will have to
do a lot more work than if it is allowed to reorder things.
Is this good or bad?
For that matter, if your compiler boiled that whole mess down into:
sum = <value>;
would you be pleased or displeased?
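(For what it's worth, this is roughly the rewrite the "as if" rule already
licenses here, written by hand. root2 is a name I introduced, sqrt(2.0) is
assumed to be free of observable side effects, and the left-to-right grouping
of the multiplies is preserved, so the numeric results are identical.)

const double root2 = sqrt(2.0);        /* the call is loop-invariant          */
for (i = 0; i < 100; i++) {
    a = i;                             /* invariant across the j and k loops  */
    for (j = 0; j < 100; j++) {
        b = j;
        e = my_pi * a * root2 * b;     /* invariant across the k loop         */
        for (k = 0; k < 100; k++) {
            c = k;
            d = e + c;
            sum += d / (a + 1.0);
        }
    }
}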
 

user923005

user923005 said:
On Feb 16, 2:05 pm, (e-mail address removed) (Richard Harter) wrote:
[snip]
In short, undefined order of evaluation in C is undesirable and
is not really necessary for compiling efficient code.  However C
is what it is and is not likely to change, so the issue is
academic.  It and the associated little bits of klutziness are
just something C programmers have to live with.
double a,b,c,d,e;
const double my_pi=3.1415926535897932384626433832795;
double sum = 0;
size_t i,j,k;
for (i = 0; i < 100; i++)
for (j = 0; j < 100; j++)
for (k = 0; k < 100; k++)
{
a = i;
b = j;
c = k;
e = my_pi * a * sqrt(2.0) * b;
d = e + c;
sum += d / (a+1.0);
}
If I am not allowed to reorder things, then my compiler will have to
do a lot more work than if it is allowed to reorder things.
Is this good or bad?
For that matter, if your compiler boiled that whole mess down into:
sum = <value>;
would you be pleased or displeased?

I'd be displeased, because it leaves a, b, c, d, e, i, j and k with the wrong
values.

That assumes that they are used in the program somewhere else.

#include <stdio.h>
#include <stdlib.h>
#include <math.h>

int main(void)
{
    double a, b, c, d, e;
    const double my_pi = 3.1415926535897932384626433832795;
    double sum = 0;
    size_t i, j, k;

    for (i = 0; i < 100; i++)
        for (j = 0; j < 100; j++)
            for (k = 0; k < 100; k++) {
                a = i;
                b = j;
                c = k;
                e = my_pi * a * sqrt(2.0) * b;
                d = e + c;
                sum += d / (a + 1.0);
            }
printf("sum = %g\n", sum);
return 0;
}
 

user923005

On Feb 16, 2:05 pm, (e-mail address removed) (Richard Harter) wrote:
[snip]
In short, undefined order of evaluation in C is undesirable and
is not really necessary for compiling efficient code.  However C
is what it is and is not likely to change, so the issue is
academic.  It and the associated little bits of klutziness are
just something C programmers have to live with.
double a,b,c,d,e;
const double my_pi=3.1415926535897932384626433832795;
double sum = 0;
size_t i,j,k;
for (i = 0; i < 100; i++)
    for (j = 0; j < 100; j++)
        for (k = 0; k < 100; k++)
        {
            a = i;
            b = j;
            c = k;
            e = my_pi * a * sqrt(2.0) * b;
            d = e + c;
            sum += d / (a+1.0);
        }
If I am not allowed to reorder things, then my compiler will have to
do a lot more work than if it is allowed to reorder things.
Is this good or bad?

I'm sorry, but I can't make heads or tails of what you are saying
here.  Is this what you mean?

If the *compiler* is not allowed to reorder things then *the
compiled code* will have to do a lot more work than if *the
compiler* is allowed to reorder things?

If that is the case it's not relevant to the discussion.  Obvious
things such as hoisting outside loops is covered by the "as if"
rule. The topic is undefined order of evaluation that is not
covered and cannot be covered by the "as if" rule.

Example 1:
double a,b,c,d,e;
....
a = b * c + b * d + b * e;

/* can the compiler do this: */
a = b * (c + d + e);

Example 2:
a = b + c + d - b + b - c + c -d + d + d -d;
/* can the compiler do this: */
a = b + c + d;

Example 3:
b = 2.0;
c = DBL_EPSILON;
d = DBL_EPSILON;
a = b + c + d;
/* can the compiler do this: */
a = b + (c + d);
/* which produces a more accurate answer? */

Example 4:
Suppose that we have a vector;
double A[1024 * 1024];

A is populated with data containing very large and very small values.
If the code does this:

size_t index;
double sum = 0;
for (index = 0; index < sizeof A / sizeof A[0]; index++)
    sum += A[index];   /* Is it OK if the compiler sorts the data first?
                          It will produce a far better answer. */
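(The reason the question is interesting: the order of the additions visibly
changes the sum, so sorting first would in general change the result and is
not covered by the "as if" rule. A tiny illustration, assuming IEEE 754
doubles:)

#include <stdio.h>

int main(void)
{
    double big = 1.0e16, small = 1.0;

    printf("%.1f\n", big + small + small);  /* 10000000000000000.0: the smalls are lost */
    printf("%.1f\n", small + small + big);  /* 10000000000000002.0: they survive        */
    return 0;
}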

What is a case where you think reordering should not be allowed and
yet where it might happen (e.g. we have not declared variables as
volatile)?

I guess that I really cannot see what the fuss is about because I do
not seem to grasp the actual issue at hand.
 

Kaz Kylheku

On Feb 16, 2:05 pm, (e-mail address removed) (Richard Harter) wrote:
[snip]
In short, undefined order of evaluation in C is undesirable and
is not really necessary for compiling efficient code.  However C
is what it is and is not likely to change, so the issue is
academic.  It and the associated little bits of klutziness are
just something C programmers have to live with.

double a,b,c,d,e;
const double my_pi=3.1415926535897932384626433832795;
double sum = 0;
size_t i,j,k;

for (i = 0; i < 100; i++)
for (j = 0; j < 100; j++)
for (k = 0; k < 100; k++)
{
a = i;
b = j;
c = k;
e = my_pi * a * sqrt(2.0) * b;
d = e + c;
sum += d / (a+1.0);
}

This is not that pertinent to the discussion. You have no expressions
with side effects, and liberal use of sequence points.

Note that the final values for a, b, c, d, e and sum, as well as i, j and k
can in fact be determined at compile time, provided that the translator can
precisely replicate the floating point arithmetic of the execution environment.

What constrains the order of the computation is that floating point operations
produce a different outcome if they are reordered. For instance

my_pi * a * sqrt(2.0) * b

has to produce the result that is obtained by first multiplying my_pi by a,
then that quantity by sqrt(2.0), and finally by b.

This is not what we mean by constraining the unspecified order of evaluation in
C. The order in question is that of evaluating the operands themselves, not
which floating point multiplication to do first.

If a and b were macros expanding to expressions with side effects, it is not
specified which happen first.
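(A sketch of that distinction; "traced" is a helper I made up. The grouping
of the product is fixed as ((a*b)*c), but the three calls producing the
operands may run in any order, and C does not say which.)

#include <stdio.h>

static double traced(const char *name, double v)
{
    printf("evaluating %s\n", name);   /* the side effect exposes the order */
    return v;
}

int main(void)
{
    double r = traced("a", 2.0) * traced("b", 3.0) * traced("c", 4.0);
    printf("%g\n", r);                 /* always 24; the trace lines may appear in any order */
    return 0;
}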
 

user923005

On Feb 16, 2:05 pm, (e-mail address removed) (Richard Harter) wrote:
[snip]
In short, undefined order of evaluation in C is undesirable and
is not really necessary for compiling efficient code.  However C
is what it is and is not likely to change, so the issue is
academic.  It and the associated little bits of klutziness are
just something C programmers have to live with.
double a,b,c,d,e;
const double my_pi=3.1415926535897932384626433832795;
double sum = 0;
size_t i,j,k;
for (i = 0; i < 100; i++)
for (j = 0; j < 100; j++)
for (k = 0; k < 100; k++)
{
a = i;
b = j;
c = k;
e = my_pi * a * sqrt(2.0) * b;
d = e + c;
sum += d / (a+1.0);
}

This is not that pertinent to the discussion.  You have no expressions
with side effects, and liberal use of sequence points.

Note that the final values for a, b, c, d, e and sum, as well as i, j and k
can in fact be determined at compile time, provided that the translator can
precisely replicate the floating point arithmetic of the execution environment.

What constrains the order of the computation is that floating point operations
produce a different outcome if they are reordered.  For instance

  my_pi * a * sqrt(2.0) * b

has to produce the result that is obtained by first multiplying my_pi by a,
then that quantity by sqrt(2.0), and finally by b.

Which would rule out forming the constant c = pi * sqrt_2 and lifting it
out of the loop. There is no way to guarantee that reordering won't change
the result. It may be that with IEEE semantics there is some sort of
formal guarantee for some operations, but I would be hesitant to
believe even a promise like that.
Consider:
a = tiny * tiny * tiny * tiny * big * big * big * big;
versus
a = tiny * big * tiny * big * tiny * big * tiny * big;
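(To put numbers on it, assuming IEEE 754 doubles: with DBL_MIN and DBL_MAX
the first ordering underflows to zero along the way, while the second stays
in range throughout.)

#include <stdio.h>
#include <float.h>

int main(void)
{
    double tiny = DBL_MIN, big = DBL_MAX;

    double a1 = tiny * tiny * tiny * tiny * big * big * big * big;
    double a2 = tiny * big * tiny * big * tiny * big * tiny * big;

    printf("%g\n%g\n", a1, a2);        /* 0 versus roughly 256 */
    return 0;
}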
This is not what we mean by constraining the unspecified order of evaluation in
C. The order in question is that of evaluating the operands themselves, not
which floating point multiplication to do first.

If a and b were macros expanding to expressions with side effects, it is not
specified which happen first.

Aren't the macros red herrings, though? A macro must expand into some
code or expression, which is subsequently compiled. Macros with side
effects are bad juju anyway. Be that as it may, if the result of the
macro expansion is not something that leads to undefined behavior, it
seems to me that the behavior would not be different than if you had
typed in the macro contents inline instead of using macros.
 
