The undefinedness of a common expression.

André Gillibert · Jan 26, 2010

Wojtek Lerch said:
It's the other way around: a lvalue-to-value conversion involves a read
access, but that doesn't mean that every read access is part of an
lvalue-to-value conversion.

Indeed, you're right.
However, the standard doesn't specify where there is no read access...
e.g.

int y=0;
volatile int x;
x=y;

The standard doesn't specify if y is to be read 153 times or only once.

Twice? This is about whether an assignment to a volatile variable reads
it *once* or not at all. I don't think anybody has claimed that it reads
it twice.

Excuse me... I had in mind the case of i = i + 1 where i is a volatile
variable.

It's undefined in C++? Why?

This has even been mentioned in a C++ DR (DR #222).
From http://std.dkuug.dk/JTC1/SC22/WG21/docs/cwg_active.html#222:

One could argue that as the C++ standard currently stands, the effect of x
= y = 0; is undefined. The reason is that it both fetches and stores the
value of y, and does not fetch the value of y in order to compute its new
value.

This DR also deals with volatile variables in assignments, in the context
of C++.

The standard doesn't specify that all "reads" are due to lvalue-to-rvalue
conversions, but I think one can state that all lvalue-to-rvalue conversions
count as "reads".
In C++, x = y = 0; performs an lvalue-to-rvalue conversion on the expression (y
= 0), and thus, "reads" the y object. This read is not done to compute its
new value. This is UB.

In C, y=0 is an rvalue, and there's nothing stating the value
is read after the assignment (so, a fetch is either "unspecified" or
forbidden).

n1124> An assignment expression has the value of the left operand after the
assignment, but is not an lvalue.
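As a small illustration of the n1124 sentence quoted above (a sketch, not taken from the thread): the value of an assignment expression is the value the left operand holds after the assignment, so it has already been converted to the left operand's type. The function name chained_narrowing is hypothetical, the sketch assumes an 8-bit unsigned char, and it deliberately uses non-volatile objects so that the read-or-not question does not arise.

```c
/* Hypothetical helper: pushes v through a chained assignment whose
   inner left operand is narrower than the outer one. */
int chained_narrowing(int v)
{
    unsigned char c;  /* inner left operand */
    int x;            /* outer left operand */

    /* (c = v) converts v to unsigned char first; x then receives the
       converted value held by c, not the original v. */
    x = c = v;
    return x;
}
```

With an 8-bit unsigned char, chained_narrowing(300) yields 44 rather than 300, because the value passed outward is that of c after the assignment.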

I was curious enough to test existing practice.

#include <stdio.h>
volatile int x,y;
int main(void) {
    printf("hello");
    x=y=0;
    printf("hello");
}

(The printf statements are useful as delimiters around the relevant compiled
code).

GCC on GNU/Linux/i386 fetches y, in both C99 and C++98 modes.
TCC on GNU/Linux/i386 doesn't fetch y.

Consequently, existing behavior differs between C99 implementations. I think
that's bad. It may be due to the ambiguous/defective standard wording, or a
GCC bug, perhaps because GCC uses the same back-end for C and C++.

Christopher Dearlove · Jan 26, 2010

André Gillibert said:
This has even been mentioned in a C++ DR (DR #222).
From http://std.dkuug.dk/JTC1/SC22/WG21/docs/cwg_active.html#222:

It's a long way from "One could argue" in 1999 to "undefined in C++"
in 2010, given that there's been a standard update in 2003 where had
it been seriously thought there was a problem it could have been fixed.
(I'm referring to the non-volatile case.)

lawrence.jones · Jan 26, 2010

In comp.std.c André Gillibert said:
I was curious enough to test existing practice.

#include <stdio.h>
volatile int x,y;
int main(void) {
    printf("hello");
    x=y=0;
    printf("hello");
}

(The printf statements are useful as delimiters around the relevant compiled
code).

GCC on GNU/Linux/i386 fetches y, in both C99 and C++98 modes.
TCC on GNU/Linux/i386 doesn't fetch y.

Consequently, existing behavior differs between C99 implementations. I think
that's bad. It may be due to the ambiguous/defective standard wording, or a
GCC bug, perhaps because GCC uses the same back-end for C and C++.

I maintain that you deserve whatever behavior you get when you write
code like that. Instead of the compound assignment, you should be
explicit about the behavior you want. Either:

y = 0;
x = 0;

or:

y = 0;
x = y;
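The two alternatives above can be wrapped into functions to make the difference in volatile accesses explicit. This is only a sketch: x and y are the volatile objects from the test program, and the function names are made up for illustration.

```c
volatile int x, y;

/* Alternative 1: two writes, and no read of y at all. */
void store_both(void)
{
    y = 0;
    x = 0;
}

/* Alternative 2: write y, then explicitly read y back and write x. */
void store_then_copy(void)
{
    y = 0;
    x = y;   /* the lvalue y is converted to its value: one read of y */
}
```

Either function leaves both objects holding 0; they differ only in whether y is read, which is exactly the access that the chained form leaves in doubt.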

Wojtek Lerch · Jan 26, 2010

André Gillibert said:
Indeed, you're right.
However, the standard doesn't specify where there is no read access...
e.g.

int y=0;
volatile int x;
x=y;

The standard doesn't specify if y is to be read 153 times or only once.

The standard doesn't say that y is read; it says that the lvalue expression
"y", in a context such as the above, is converted to the value stored in the
designated object. A read access is only implied, not stated explicitly;
but it seems to be clear enough that the conversion is done by a single read
access to the object.

The standard also says that the value of the expression "x=y" is the value
of the object after the assignment. Whether those words also imply a read
access to x or not is much less clear, at least to some of us, and that's
what we were debating here.

This has even been mentioned in a C++ DR (DR #222).
From http://std.dkuug.dk/JTC1/SC22/WG21/docs/cwg_active.html#222:

That's interesting reading. Thanks for the pointer.

But the C++ standard, just like the C standard, only forbids fetching the
*prior* value for purposes other than to compute the new value. I think
you'll agree that people who write things like x=y=0 expect the *new* value
of y to be stored in x, not the old value -- but indeed, it seems that the
C++ standard does not guarantee that clearly enough.

André Gillibert · Jan 26, 2010

Christopher Dearlove said:
It's a long way from "One could argue" in 1999 to "undefined in C++"
in 2010, given that there's been a standard update in 2003 where had
it been seriously thought there was a problem it could have been fixed.
(I'm referring to the non-volatile case.)

That's undefined in C++03, but, the new sequencing rules introduced in 2008
(Oxford) fix this issue. C++1x will include these new rules.

André Gillibert · Jan 26, 2010

Wojtek Lerch said:
The standard doesn't say that y is read; it says that the lvalue
expression "y", in a context such as the above, is converted to the value
stored in the designated object. A read access is only implied, not
stated explicitly; but it seems to be clear enough that the conversion is
done by a single read access to the object.

The standard also says that the value of the expression "x=y" is the value
of the object after the assignment. Whether those words also imply a read
access to x or not is much less clear, at least to some of us, and that's
what we were debating here.

That's interesting reading. Thanks for the pointer.

But the C++ standard, just like the C standard, only forbids fetching the
*prior* value for purposes other than to compute the new value.

Ok.

:Between the previous and next sequence point an object shall have its stored value
:modified at most once by the evaluation of an expression. Furthermore, the prior value
:shall be read only to determine the value to be stored.

The glitch is that the assignment y = 0 may take effect (as a side effect) after
the y value is fetched (in C++03) for the x= assignment.
:the order of evaluation of subexpressions and the order in which side effects
:take place are both unspecified.

I think you'll agree that people who write things like x=y=0 expect the
*new* value of y to be stored in x, not the old value --

But, a C++03 implementation might give the old value....
In C++03, x=y=0 is equivalent to (y=0,x=y), but, without the sequence point
of the comma...
If x and y are numeric, it's equivalent to (y=0)+(x=y).

but indeed, it seems that the C++ standard does not guarantee that
clearly enough.

Anyway, this is an old issue that C never had and C++1x fixes.
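For comparison, the sequenced form mentioned above, (y=0,x=y), has the comma operator's sequence point between the two assignments, so the new value of y is what reaches x. A minimal sketch with a hypothetical function name, using plain (non-volatile) ints:

```c
/* Hypothetical helper: y starts out holding an "old" value; the comma
   operator sequences the store to y before the read of y. */
int sequenced_copy(int old_y)
{
    int x = -1;
    int y = old_y;

    (y = 0, x = y);   /* sequence point at the comma */
    return x;         /* the new value of y, never old_y */
}
```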

Christopher Dearlove · Jan 26, 2010

André Gillibert said:
That's undefined in C++03, but, the new sequencing rules introduced in
2008 (Oxford) fix this issue. C++1x will include these new rules.

Clearly no one thought the undefinedness was more than a theoretical issue
for non-volatiles even in 2003. It's still a jump from "One could argue" to
your unqualified assertion that it is undefined.

Christopher Dearlove · Jan 26, 2010

André Gillibert said:
But, a C++03 implementation might give the old value....

Is this anything more than speculation? Is there any evidence of any C++
compiler actually not doing what everyone expects? (Reminder, this is
the non-volatile case.)

André Gillibert · Jan 26, 2010

Christopher Dearlove said:
Is this anything more than speculation? Is there any evidence of any C++
compiler actually not doing what everyone expects? (Reminder, this is
the non-volatile case.)

This is speculation.
The standard committee never intended to make x=y=0 undefined. It's
very widely used, and supported by all compilers I've ever seen.

Wojtek Lerch · Jan 26, 2010

I maintain that you deserve whatever behavior you get when you write
code like that.

Why? Because the C standard is unclear on what that code does?

André Gillibert · Jan 26, 2010

Wojtek Lerch said:
Why? Because the C standard is unclear on what that code does?

Yes, if x and y are volatile, the C standard is unclear and existing
implementations differ.

lawrence.jones · Jan 27, 2010

In comp.std.c Wojtek Lerch said:
Why? Because the C standard is unclear on what that code does?

No, because the meaning of the code is inherently fuzzy. If you're
concerned about exactly what accesses get made to an object (which you
probably are if the object is volatile), then using it in a complex
expression is just asking for trouble. Writing simple code that clearly
expresses your intent is a much better strategy.

Wojtek Lerch · Jan 27, 2010

André Gillibert said:
Yes, if x and y are volatile, the C standard is unclear and existing
implementations differ.

And whose fault is that -- the programmer's or the standard's?

Besides, even if x and y are not volatile, there still is a problem.

The standard attempts to define the semantics of C by describing the
behaviour of the abstract machine, and then, separately, telling us what
aspects of that behaviour must be reflected by the implementation. Before
we start talking about how the behaviour of the abstract machine maps to the
real hardware, we need to know what it can possibly be. In the case of the
assignment operators, it's the standard's job to tell us whether the
abstract machine obtains the value by reading it back from the object being
assigned to, by returning a copy of the value being assigned, or whether
it's up to the implementation. And the standard fails to do that job.

At the first glance, it seems that the answer doesn't matter unless the
object is volatile; but are you certain that it's not possible to have
situations where the presence of a read access in the abstract machine may
trigger undefined behaviour by violating some seemingly unrelated rule --
maybe something about the effective type, or about the restrict qualifier,
or perhaps something else? If such cases are possible (for non-volatile
objects), would you say that the programmer deserves whatever behaviour he
gets there as well?
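The two candidate behaviours of the abstract machine described above can be spelled out as ordinary functions. This is a sketch of the two readings, not a statement of what the standard requires; both function names are hypothetical.

```c
/* Reading 1: the value of the assignment is obtained by reading the
   object back after the store (one write, then one read). */
int assign_read_back(volatile int *obj, int value)
{
    *obj = value;
    return *obj;
}

/* Reading 2: the value of the assignment is a copy of the value that
   was stored (one write, no read). */
int assign_copy(volatile int *obj, int value)
{
    *obj = value;
    return value;
}
```

For an ordinary int in memory both functions return the same result; they can only be told apart where the extra read access is itself observable.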

Antoine Leca · Jan 27, 2010

Wojtek said:
André Gillibert wrote in message

And whose fault is that -- the programmer's or the standard's?

Perhaps existing implementations, hence implementers?

Since they pick up different interpretations, and both appear to make
sense, and furthermore different programmers might have learned the
different behaviours, you do not have easy solution to that situation,
hence the "problem"; seeking "whose fault" won't help, by the way;
trying to define a consensus about how to make the Standard (or the
Standard's reading) clearer, on the other hand, might.

So it might be comp.*.c gurus' fault, which are failing to cast clear
interpretations? ;-)

Besides, even if x and y are not volatile, there still is a problem.

If there is a practical problem with int x,y; x=y=0; I fail to see which
one. I believe this code is given for granted if you read K&R1.

In the case of the assignment operators, it's the standard's job to tell us
whether the abstract machine obtains the value by reading it back from
the object being assigned to, by returning a copy of the value being
assigned, or whether it's up to the implementation. And the standard
fails to do that job.

I fail to understand the result of your past discussions. Where does it
prevent the implementer to choose?
If it does not prevent it, where does it fail to fall back in the last
case of your enumeration?

but are you certain that it's not possible to have situations where

<snip>

There is a general answer to that rhetoric: the Standard explicitly
forecasts such cases to exist, and even forges a name to this effect:
unspecified behaviour.

Antoine

Daniel Giaimo · Jan 27, 2010

And whose fault is that -- the programmer's or the standard's?

The programmer's. It is the programmer's job to do things the way the
standard specifies. If a programmer attempts to do something that the
standard is unclear on, then it is the programmer's fault if it
doesn't work the way they expect it to.

jameskuyper · Jan 27, 2010

Wojtek said:
And whose fault is that -- the programmer's or the standard's?

Both. The standard should not be unclear, and programmers should avoid
writing code that depends upon the precise interpretation of parts of
the standard that are not clear.

Wojtek Lerch · Jan 28, 2010

No, because the meaning of the code is inherently fuzzy.

No, the meaning of the code wouldn't be fuzzy if the words in the standard
that define it weren't ambiguous. Nothing inherent about it.

If you're
concerned about exactly what accesses get made to an object (which you
probably are if the object is volatile), then using it in a complex
expression is just asking for trouble. Writing simple code that clearly
expresses your intent is a much better strategy.

It seems to me that what constitutes "simple code" is relative -- to me,
something like

x = y = complicated_expression;

is simpler and more clearly expresses my intent than

tmp = complicated_expression;
x = tmp;
y = tmp;

But maybe that's just because I've spent many years under the apparently
mistaken impression that my interpretation of that paragraph in the standard
was the intended one, and that those two ways of writing code are
equivalent. If the real intention was to make volatile so "flexible" that a
conforming implementation is free to declare that the expression a+b
constitutes a write access to the volatile variable named c, then of course I
should not rely on what my instincts tell me about the simplicity or clarity
of C code.
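For non-volatile objects the two spellings do store the same values. A sketch under assumptions: complicated_expression here is a stand-in for the placeholder in the post, given a visible side effect (a call counter) so that one can check it is evaluated once in either form; all names are hypothetical.

```c
int calls;   /* counts evaluations of the stand-in expression */

/* Stand-in for "complicated_expression", with a visible side effect. */
int complicated_expression(void)
{
    ++calls;
    return 6 * 7;
}

/* Chained form: x = y = complicated_expression; */
int chained_sum(void)
{
    int x, y;
    x = y = complicated_expression();
    return x + y;
}

/* Temporary form: tmp = ...; x = tmp; y = tmp; */
int temp_sum(void)
{
    int tmp, x, y;
    tmp = complicated_expression();
    x = tmp;
    y = tmp;
    return x + y;
}
```

Both functions evaluate the expression once and leave x and y equal. The sketch says nothing about the volatile case, where the number of accesses is the point under debate.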

Wojtek Lerch · Jan 28, 2010

Antoine Leca said:
Perhaps existing implementations, hence implementers?

It's the implementers' fault that the words of the standard allow different
conflicting interpretations that all seem reasonable? Or is it their fault
because they just picked their own interpretations without complaining to
the committee about the ambiguity in the standard? I have to admit that I
don't blame them for that -- to me, this particular ambiguity is of the sort
that makes it easy to think that the other interpretation is just plain
silly and therefore obviously wrong. Regardless of which of the two
interpretations you pick.

Since they pick up different interpretations, and both appear to make
sense, and furthermore different programmers might have learned the
different behaviours, you do not have easy solution to that situation,

Sending a DR is fairly easy (for some people), and it would solve the
situation even if the answer was that it's unspecified.

hence the "problem"; seeking "whose fault" won't help, by the way;
trying to define a consensus about how to make the Standard (or the
Standard's reading) clearer, on the other hand, might.

The first step would be for the committee to define a consensus about their
intent.

If there is a practical problem with int x,y; x=y=0; I fail to see which
one. I believe this code is given for granted if you read K&R1.

Sure, but I was thinking about an assignment being a subexpression in a big
and complicated expression, in a context where a read access to x triggers
undefined behaviour by violating some rule in a subtle way. I don't know of
such a situation and doubt it would be a practical problem if it existed; but
here in comp.std.c there's nothing wrong about being concerned about
theoretical problems too.

I fail to understand the result of your past discussions. Where does it
prevent the implementer to choose?

But it's the purpose of the standard to prevent implementers from having too
much choice!!! In the case in question, my complaint is not that the
standard does or does not allow implementers to choose -- it's that it forces
them to choose between different possible interpretations of the text.
That's the kind of choice that the standard should avoid giving to people.
If the standard wants to give implementers choice, it should just say what
they can choose from, rather than say something unclear and make them guess
what it was supposed to mean.

If it does not prevent it, where does it fail to fall back in the last
case of your enumeration?

The last case sounds like the least reasonable interpretation of the words
to me -- the words just don't sound like they're meant to let implementation
choose. The words sound like they say that a particular thing happens.
They just fail to clearly describe what that thing is.

Think about the other place that talks about obtaining the value of an
object -- 6.3.2.1#2. It says that lvalues are "converted to the value
stored in the designated object". Note that this one doesn't clearly say
that this "conversion" is done *by* accessing the object -- should we think
that the standard means to say that it's unspecified whether an access
occurs here as well?

<snip>

There is a general answer to that rhetoric: the Standard explicitly
forecasts such cases to exist, and even forges a name to this effect:
unspecified behaviour.

No, unspecified behaviour is when the standard PROVIDES two or more choices
and allows the implementation to choose. That's not the same as providing
some unclear words that can be interpreted in two or more different ways.

Richard Bos · Jan 28, 2010

Wojtek Lerch said:
It's implementers fault that the words of the standard allow different
conflicting interpretations that all seem reasonable?

Who says it's anybody's fault? Perhaps there really are several ways of
handling the situations which are reasonable on different systems.

But it's the purpose of the standard to prevent implementers from having too
much choice!!!

Define "too much". (In the case of exclamation marks, it's 2...)

Richard

Wojtek Lerch · Jan 28, 2010

Richard Bos said:
Who says it's anybody's fault? Perhaps there really are several ways of
handling the situations which are reasonable on different systems.

Perhaps, but if the standard is ambiguous about which of them are allowed,
that's a defect in the standard, isn't it?

Define "too much". (In the case of exclamation marks, it's 2...)

I won't tell you where the line lies exactly, but for sure having a standard
that doesn't forbid anything, or not having a standard at all, is definitely
on the side of too much choice. Anyway, my point was that I never
complained about the standard preventing the implementer to choose -- what I
complained about is the standard being unclear about what the allowed
choices are.
