Why is this expression detected as undefined by GCC, Splint?


Tim Rentsch

Eric Sosman said:
Undefined behavior: `x' is modified twice without an intervening
sequence point.

No, the sequence point of the comma operator must precede the
evaluation of the assignment operation.

Yes, there's a sequence point between `x++' and `p'. But there's
no sequence point between `x=' and `x++'.

Of course there is, just as there is for

x = f( x++ );
 

Tim Rentsch

amuro said:
Hi, I wonder why the following expression is detected as undefined
expression. In my opinion, this is a "defined" expression.

x = *(x++, p); // line 21

$ gcc -Wsequence-point test.c
test.c:21: warning: operation on 'x' may be undefined

I suspect this comes from 'x' being incremented before being
assigned a value. Was 'x' initialized?

$ splint test.c
test.c:21:6: Expression has undefined behavior (value of left operand
x is modified by right operand *(x++, p)): x = *(x++, p)

Splint is broken. The 'x++' being before a comma operator of a RHS
subexpression of the assignment, the increment of 'x' must be complete
before the assignment operation is initiated.

Another expression, similar to the previous one, is accepted by GCC
but not by Splint. That is, GCC does not consider it undefined, but
Splint does.

x = (x++, y); // line 19

$ splint test.c
test.c:19:6: Expression has undefined behavior (value of left operand
x is modified by right operand (x++, y)): x = (x++, y)

Splint is broken here also.
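
For readers who just want code that neither tool complains about, the effect that the sequence-point argument predicts for `x = *(x++, p);` can be written as two statements, the form mentioned elsewhere in this thread ("x++; x = *p;"). A minimal sketch, assuming `x` is an `int` and `p` is an `int *` that may or may not point at `x`; the function name `increment_then_load` is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: the two-statement form of `x = *(x++, p);`.
 * Each statement ends at a sequence point, so the increment is
 * complete before the load through p and the store to *x begin. */
static void increment_then_load(int *x, const int *p)
{
    assert(x != NULL && p != NULL);
    (*x)++;     /* side effect finished at the end of this statement */
    *x = *p;    /* p may alias x; it then sees the incremented value */
}
```

When `p` points at `x`, `x` ends up one larger than before; when `p` points elsewhere, the increment is simply overwritten.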
 

Ike Naar

This statement is not correct. The expression 'lhs x' is
not affected by 'x++' so it doesn't matter whether 'x++'
is evaluated before or after 'lhs x'.

You're correct, x's value is irrelevant for ``lhs x'' because
lhs x is only written to. My bad.
 

Seebs

Except in this case the point made clear _is_ true in the old
document. And C89 as well.

Is it? I was under the impression that consensus was that, even though
it's obvious what "should" happen, an assignment to something which is
also modified during evaluation of the value to be assigned was undefined
behavior because the same object could in theory be modified twice between
sequence points, because assignment wasn't formally specified to imply
sequencing.

-s
 

Tom St Denis

The comma operator introduces a sequence point between the evaluations
of x++ and p, and the assignment cannot be evaluated until after p is
evaluated, so how is the behavior undefined? AFAICT, that expression
must have identical behavior to the well-defined "x++; x = *p;".

I thought the order of eval of comma expressions is not defined. So
it could be x = *p; x++; as well.

Tom
 

David Resnick

I thought the order of eval of comma expressions is not defined.  So
it could be x = *p; x++; as well.

Tom

[#2] The left operand of a comma operator is evaluated as a
void expression; there is a sequence point after its
evaluation. Then the right operand is evaluated; the result
has its type and value.

-David
 

Marcin Grzegorczyk

Tim said:
Except in this case the point made clear _is_ true in the old
document. And C89 as well.

I'd say the point was ambiguous in C89 and C99.

IIRC, Larry Jones said on comp.std.c something to the effect of "the new
sequencing specification is supposed to say what we always meant the
Standard to say" (sorry if I grossly misquote you, Larry) ;-)
 

Marcin Grzegorczyk

amuro wrote:
[x = *(x++, p);]
Hm.. Is the assignment to x sequenced after post-increment of x?
[snip quotes of the N1539 draft]
So in the expression "x = *(x++, p)", the assignment to x is sequenced
after the value computation of the right operand "*(x++, p)". That
alone does not mean the assignment to x is sequenced after the side
effect of the right operand. But the value computation itself contains
a sequence point, so after all the assignment to x is sequenced after
the side effect of the post-increment of x.
That means those two side effects do not fall between the same two
sequence points.

Yes, the side effect of the assignment is sequenced after the side
effect of the post-increment (because any side effects of the
sub-expression `x++` are sequenced before the value computation of `p`
(6.5.17p2 and the definition of a sequence point in 5.1.2.3p3), which is
sequenced before the value computation of the indirection operator
(6.5p1), which is sequenced before the side effect of the assignment
(6.5.16p3). That is my reading of N1539, anyway.)

However, the side effects of `x++` are unsequenced relative to the value
computation of the left-hand side of the assignment operator (the lvalue
`x`). And since value computation of an lvalue involves "determining
the identity of the designated object" (5.1.2.3p2), it appears the
behaviour is still undefined (6.5p2).
 

Barry Schwarz

I thought the order of eval of comma expressions is not defined. So
it could be x = *p; x++; as well.

When the comma is used to separate function arguments, the order of
evaluation is unspecified. When the comma operator is used, it
specifies a sequence point which requires the left operand to be
evaluated and all side effects to be completed prior to evaluating the
right operand. Not the only example of a token having multiple
meanings depending on context.
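
The two meanings of the comma can be put side by side. The sketch below is only an illustration: `struct trace`, `record`, `sum2` and the two wrapper functions are invented names. It logs the order in which calls happen, once with the comma operator and once with the comma as an argument separator.

```c
#include <assert.h>

/* Hypothetical log of the order in which record() was called. */
struct trace { int order[4]; int count; };

static int record(struct trace *t, int id)
{
    assert(t->count < 4);
    t->order[t->count++] = id;
    return id;
}

static int sum2(int a, int b) { return a + b; }

/* Comma operator: record(t, 1) is evaluated first, its side effects
 * complete, then record(t, 2) supplies the value of the expression. */
static int with_comma_operator(struct trace *t)
{
    return (record(t, 1), record(t, 2));
}

/* Argument separator: both calls happen before sum2 is entered, but
 * which one runs first is unspecified. */
static int with_argument_separator(struct trace *t)
{
    return sum2(record(t, 1), record(t, 2));
}
```

After `with_comma_operator` the log is always 1 then 2. After `with_argument_separator` the log holds both entries, but a program must not depend on which comes first.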
 

Tim Rentsch

pete said:
You are implying that the right operand of an assignment operator
must be evaluated prior to assignment,

It is true that in the abstract machine the right operand (and
also the (lvalue of the) left operand) of an assignment statement
must be evaluated before the assignment is performed.

However, it is also important to distinguish between the time
that an expression is evaluated and the time that any side
effects of that expression complete (which is the following
sequence point). Evaluating the right-hand-side subexpression

*(x++,p)

causes the sub-subexpression 'x++' to be evaluated, and its side
effects to be complete, _before_ the evaluation of 'p' yields the
pointer value that is dereferenced to produce the value assigned.

which implies that there is nothing wrong with
(x = x++);
which is not the case.

No, because this expression has no sequence point after the
'x++' and before producing the value to be assigned. The
corresponding case would be

x = (x++,x-1);

which is well-defined, not undefined.

Evaluation of an expression has two parts
1 determining the value of the expression
2 the side effects of the expression

Merely determining the value of an expression,
is only a partial evaluation,
if the expression also has side effects.

The comma operator has a sequence point after the
evaluation of the first operand, which means that
any of its side effects are complete before the second
operand is evaluated to produce the overall value.

It is not necessary to evaluate (x++, p)
in order to determine its value.

In the abstract machine it _is_ necessary, because in the
abstract machine evaluating an expression is the only thing that
can produce a value.
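
The contrast between `x = x++` and `x = (x++, x-1)` can be made visible by spelling out the steps described above for the second form. This is a sketch, not a ruling on the one-line expression: a temporary (the name `value` is an assumption of the example) holds the value to be assigned, so the code does not depend on how any particular compiler or checker treats the original.

```c
#include <assert.h>

/* Hypothetical illustration of the steps of `x = (x++, x-1);`. */
static int comma_then_assign(int x)
{
    int value;

    /* The comma operator's sequence point guarantees the increment
     * of x is complete before x - 1 is computed. */
    value = (x++, x - 1);

    x = value;  /* the assignment proper: stores the original value */
    return x;
}
```

For any `x` below `INT_MAX` the function returns its argument unchanged, which is the result the sequence-point reading predicts for the one-line form.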
 

Tim Rentsch

Seebs said:
Is it? I was under the impression that consensus was that, even though
it's obvious what "should" happen, an assignment to something which is
also modified during evaluation of the value to be assigned was undefined
behavior because the same object could in theory be modified twice between
sequence points, because assignment wasn't formally specified to imply
sequencing.

What you're saying is right if there is no intervening sequence
point, but here there is. The right-hand-side subexpression
'(x++,p)' must be evaluated (in the abstract machine) prior to
the assignment, to get the value to be assigned. The side
effects of 'x++' must be complete before (starting) the
evaluation of 'p', which in turn must be done to produce the
value for the subsequent assignment.
 

Tim Rentsch

Marcin Grzegorczyk said:
I'd say the point was ambiguous in C89 and C99.

It isn't. There are some ambiguous cases in those wordings,
but this isn't one of them.

IIRC, Larry Jones said on comp.std.c something to the effect of "the
new sequencing specification is supposed to say what we always meant
the Standard to say" (sorry if I grossly misquote you, Larry) ;-)

He did but this case isn't what he was talking about. The old
rules (1) were somewhat unclear about how function calls worked,
and (2) arguably made undefined cases like 'a[a[0]] = 2;', where
initially a[0] == 0. Expressions like the example with sequence
points in them have always been unambiguous. Consider for
example this (somewhat silly) assignment:

x = ++x && y;

The update of x in '++x' must be completed before the '&&'
operator yields its value, and therefore there is no
conflict in assigning to x.
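
The same steps can be written out for the `&&` example. Again this is a sketch with invented names (`increment_and_test`, `result`), keeping the value of `++x && y` in a temporary so that the store to x is a separate statement:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical illustration of `x = ++x && y;` as explicit steps. */
static int increment_and_test(int *x, int y)
{
    int result;

    assert(x != NULL);

    /* `&&` has a sequence point after its left operand: *x has been
     * updated before y is examined, and y is not examined at all
     * when the incremented value is zero. */
    result = (++*x && y);

    *x = result;    /* always 0 or 1 */
    return result;
}
```

Starting from x == 3 and y == 7 the result is 1; starting from x == -1 the increment yields 0, so the result is 0 regardless of y.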
 

Tim Rentsch

Marcin Grzegorczyk said:
amuro wrote:
[x = *(x++, p);]
Hm.. Is the assignment to x sequenced after post-increment of x?
[snip quotes of the N1539 draft]
So in the expression "x = *(x++, p)", the assignment to x is sequenced
after the value computation of the right operand "*(x++, p)". That
alone does not mean the assignment to x is sequenced after the side
effect of the right operand. But the value computation itself contains
a sequence point, so after all the assignment to x is sequenced after
the side effect of the post-increment of x.
That means those two side effects do not fall between the same two
sequence points.

Yes, the side effect of the assignment is sequenced after the side
effect of the post-increment (because any side effects of the
sub-expression `x++` are sequenced before the value computation of `p`
(6.5.17p2 and the definition of a sequence point in 5.1.2.3p3), which
is sequenced before the value computation of the indirection operator
(6.5p1), which is sequenced before the side effect of the assignment
(6.5.16p3). That is my reading of N1539, anyway.)

However, the side effects of `x++` are unsequenced relative to the
value computation of the left-hand side of the assignment operator
(the lvalue `x`). And since value computation of an lvalue involves
"determining the identity of the designated object" (5.1.2.3p2), it
appears the behaviour is still undefined (6.5p2).

I already explained this in my response to Ike Naar. Evaluating 'x++'
doesn't change the address of 'x', so it doesn't matter whether the
computation of 'lvalue x' is done before or after (or during) the
evaluation, including completion of side effects, of 'x++'.
 

Marcin Grzegorczyk

Tim said:
Marcin Grzegorczyk said:
[x = *(x++, p);]
Yes, the side effect of the assignment is sequenced after the side
effect of the post-increment (because any side effects of the
sub-expression `x++` are sequenced before the value computation of `p`
(6.5.17p2 and the definition of a sequence point in 5.1.2.3p3), which
is sequenced before the value computation of the indirection operator
(6.5p1), which is sequenced before the side effect of the assignment
(6.5.16p3). That is my reading of N1539, anyway.)

However, the side effects of `x++` are unsequenced relative to the
value computation of the left-hand side of the assignment operator
(the lvalue `x`). And since value computation of an lvalue involves
"determining the identity of the designated object" (5.1.2.3p2), it
appears the behaviour is still undefined (6.5p2).

I already explained this in my response to Ike Naar. Evaluating 'x++'
doesn't change the address of 'x', so it doesn't matter whether the
computation of 'lvalue x' is done before or after (or during) the
evaluation, including completion of side effects, of 'x++'.

That's one possible interpretation. I would be wary of relying on it,
though.

Consider this example:

struct { unsigned x:5, y:6, z:7; } s;
unsigned u;
/* ... */
s.x = (s.x++, u);

In a typical implementation, either update of `s.x` will require reading
the container object first. Now the question is: is that read access
considered a part of the side effect, or may it be considered a part of
value computation (as defined in 5.1.2.3)? If the latter, then the
behaviour is undefined.
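
To make "reading the container object first" concrete, here is roughly what an update of a 5-bit field looks like when written by hand with masks. This is only a sketch: the layout (field `x` in the low five bits of an `unsigned`) and the helper names are assumptions, since the Standard leaves bit-field layout to the implementation.

```c
#include <assert.h>

#define X_BITS 5u
#define X_MASK ((1u << X_BITS) - 1u)    /* 0x1F: low five bits */

/* Read the assumed field x out of the container word. */
static unsigned load_x(unsigned container)
{
    return container & X_MASK;
}

/* Store into x: the container is read so that the neighbouring
 * bits (y and z in the example) survive the write. */
static unsigned store_x(unsigned container, unsigned value)
{
    unsigned kept = container & ~X_MASK;
    return kept | (value & X_MASK);
}

/* s.x++ under this layout: a load, an add, and a masked store. */
static unsigned increment_x(unsigned container)
{
    unsigned updated = store_x(container, load_x(container) + 1u);
    assert((updated & ~X_MASK) == (container & ~X_MASK)); /* neighbours intact */
    return updated;
}
```

Both the increment and the later assignment go through `store_x`, so each of them reads the container; the open question in the post is how the Standard classifies that read.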
 

lawrence.jones

Marcin Grzegorczyk said:
N1539 and N1547 are supposed to be (and, I believe, are) the same,
except that N1547 has differences to N1516 marked on the margins.

And sometime in the not too distant future there will be N1548 which is
also the same text but with diff marks from N1256 (the final version of
C99 with all three TCs applied). (Note that the Abstract on the first
page explains the diff marks and notes which document they're based on.)
 

Tim Rentsch

Marcin Grzegorczyk said:
Tim said:
Marcin Grzegorczyk said:
[x = *(x++, p);]
Yes, the side effect of the assignment is sequenced after the side
effect of the post-increment (because any side effects of the
sub-expression `x++` are sequenced before the value computation of `p`
(6.5.17p2 and the definition of a sequence point in 5.1.2.3p3), which
is sequenced before the value computation of the indirection operator
(6.5p1), which is sequenced before the side effect of the assignment
(6.5.16p3). That is my reading of N1539, anyway.)

However, the side effects of `x++` are unsequenced relative to the
value computation of the left-hand side of the assignment operator
(the lvalue `x`). And since value computation of an lvalue involves
"determining the identity of the designated object" (5.1.2.3p2), it
appears the behaviour is still undefined (6.5p2).

I already explained this in my response to Ike Naar. Evaluating 'x++'
doesn't change the address of 'x', so it doesn't matter whether the
computation of 'lvalue x' is done before or after (or during) the
evaluation, including completion of side effects, of 'x++'.

That's one possible interpretation. I would be wary of relying on it,
though.

There is no other interpretation (up to isomorphism) that is
consistent with how the Standard describes these operations.

Consider this example:

struct { unsigned x:5, y:6, z:7; } s;
unsigned u;
/* ... */
s.x = (s.x++, u);

In a typical implementation, either update of `s.x` will require
reading the container object first. Now the question is: is that read
access considered a part of the side effect, or may it be considered a
part of value computation (as defined in 5.1.2.3)? If the latter,
then the behaviour is undefined.

That may be true but it's irrelevant to the semantics defined for
the abstract machine. In the abstract machine, assigning to an
lvalue commences at the point of evaluating the assignment operator,
not before. Since a C program must behave exactly as if executed
in the abstract machine, whatever an implementation may do to
effect these actions must not change the defined semantics.
 

Luca Forlizzi

IIRC, Larry Jones said on comp.std.c something to the effect of "the
new sequencing specification is supposed to say what we always meant
the Standard to say" (sorry if I grossly misquote you, Larry) ;-)

He did but this case isn't what he was talking about.  The old
rules (1) were somewhat unclear about how function calls worked,
and (2) arguably made undefined cases like 'a[a[0]] = 2;', where
initially a[0] == 0.  Expressions like the example with sequence

I am aware of issue (2) ('a[a[0]] = 2;' where initially a[0] == 0); it
was discussed in depth in a thread some months ago. Which aspects of
function calls are clarified by the new wording in the drafts of the
future standard?
 

Marcin Grzegorczyk

Tim said:
That may be true but it's irrelevant to the semantics defined for
the abstract machine. In the abstract machine, assigning to an
lvalue commences at the point of evaluating the assignment operator,
not before.

That's your interpretation. :)
I think this issue deserves a clarification from the WG14.
(How does one go about filing a Defect Report, anyway?)
 

amuro

Tim said:
Marcin Grzegorczyk said:
[x = *(x++, p);]

In short, your example is also a defined expression.
My position, which differs from yours, is this:

an update or read of `s.x` does not require reading the container
object 's'

Anyway, I'd like to answer your question as far as I can.

Now the question is: is that read access
considered a part of the side effect, or may it be considered a part of
value computation (as defined in 5.1.2.3)? If the latter, then the
behaviour is undefined.

I think read access is a part of value computation. The list of side
effects in 5.1.2.3p2 ("Accessing a volatile object, modifying an object,
modifying a file, or calling a function ... changes in the state of
the execution environment") does not include a read of a non-volatile
object.

But value computation may involve a side effect, e.g.
*(p + x++) = e // value computation of the lvalue involves
               // a side effect on x.

In your example, too, the behaviour is defined, not undefined.

s.x = (s.x++, u);
^^^                 (1)
       ^^^^^        (2)
      ^^^^^^^^^^    (3)

(1) and (3) are unsequenced relative to each other, as are (1) and (2).
The numbering does not indicate the order of evaluation. :)

Value computation of (1) is the same as determining the identity of
the designated object (that is, the lvalue).
Lvalue(s.x) equals the address of s plus the offset of field x.
IOW, Lvalue(s.x) = Address(s) + offset(x)
There's no need to read the contents of s, so there is no read of s.
(But when pointer dereferencing is used on the LHS, it will require
reading the operand of the dereference operator,
e.g. *(s.pointer_field) = e; // reads s.pointer_field)

Evaluation of (2) involves a side effect as well as value computation.
i) value computation of (2): Rvalue(s.x++)
It also does not require reading the variable s. Instead, it requires
reading the memory at Address(s)+offset(x).
ii) side effect of (2): increment by one of the memory at
Lvalue(s.x).

After evaluation of (1) and (3), the assignment operator will commence.
side effect of (1): nothing.
side effect of (3):
write to `address(s) + offset(x)' which occurred at `previous SP'.

The assignment writes to the lvalue of (1), which is
Address(s)+offset(x). So the write to Address(s)+offset(x) does not
conflict with the side effects of (1) and (3).
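
The identity Lvalue(s.x) = Address(s) + offset(x) can be checked directly for ordinary members using `offsetof`. This is a sketch under one important assumption: the members here are plain `unsigned` objects rather than bit-fields, because a bit-field has no address and `offsetof` cannot be applied to it. The struct and function names are invented for the example.

```c
#include <assert.h>
#include <stddef.h>

struct plain { unsigned x, y, z; };     /* ordinary members, not bit-fields */

/* Compute the address of member y from the address of the struct,
 * without reading any of the struct's contents. */
static unsigned *address_of_y(struct plain *s)
{
    assert(s != NULL);
    return (unsigned *)((char *)s + offsetof(struct plain, y));
}
```

The computed pointer compares equal to `&s->y`, and writing through it changes only `y`. Whether the same reasoning carries over to bit-field members is exactly what the posts above disagree about.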
 
