The comma operator, and assigning twice between sequence points

Discussion in 'C Programming' started by ais523, Feb 8, 2008.

  1. ais523

    ais523 Guest

    I've been wondering more about Undefined Behaviour, and the way in
    which (i=i++)-like examples can be 'corrected' so they mean something
    defined. This was particularly inspired by a line in some computer-
    generated code, whose essence was as follows:

    void func()
    {
        int a, b, *p;
        a=b=0;
        p=&b;
        *p=1+((p=&a),2);
    }

    Here, the variable actually being assigned to depends on the RHS of
    the assignment; but the comma introduces a sequence point, so I think
    this is a defined unambiguous assignment to a. (I'm not sure, though:
    this is why I'm asking c.l.c.) Some more examples along similar lines
    for statements for which I'm not clear about defined/undefined/
    unspecified:

    c=(a++,b)+(b++,a);

    The question here is whether the implementation is forced to evaluate
    the two parenthesised groups sequentially due to the sequence points
    in them. I think that this line might be UB, because of the
    possibility of incrementing both variables first and then adding the
    new values of a and b.

    a=(a++,a);

    My guess about this one is that it isn't UB because the comma forces
    the increment to happen before the assignment, leaving the line
    equivalent to ++a;.

    So my question is: which of these examples are UB, and why?
    --
    ais523
     
    ais523, Feb 8, 2008
    #1

  2. ais523

    Guest

    On Feb 8, 11:09 am, ais523 wrote:
    > I've been wondering more about Undefined Behaviour, and the way in
    > which (i=i++)-like examples can be 'corrected' so they mean something
    > defined. This was particularly inspired by a line in some computer-
    > generated code, whose essence was as follows:
    >
    > void func()
    > {
    > int a, b, *p;
    > a=b=0;
    > p=&b;
    > *p=1+((p=&a),2);
    >
    > }
    >
    > Here, the variable actually being assigned to depends on the RHS of
    > the assignment; but the comma introduces a sequence point, so I think
    > this is a defined unambiguous assignment to a. (I'm not sure, though:
    > this is why I'm asking c.l.c.) Some more examples along similar lines
    > for statements for which I'm not clear about defined/undefined/
    > unspecified:
    >
    > c=(a++,b)+(b++,a);
    >
    > The question here is whether the implementation is forced to evaluate
    > the two parenthesised groups sequentially due to the sequence points
    > in them. I think that this line might be UB, because of the
    > possibility of incrementing both variables first and then adding the
    > new values of a and b.
    >
    > a=(a++,a);
    >
    > My guess about this one is that it isn't UB because the comma forces
    > the increment to happen before the assignment, leaving the line
    > equivalent to ++a;.
    >
    > So my question is: which of these examples are UB, and why?


    First of all, I would fire anyone who actually
    wrote such a line of code.

    Sequence points and the order of execution
    are not the same thing.

    a=5;
    b=6;

    For the above, there are definitely sequence points
    for each statement. However, after optimization the compiler
    may set b=6 before it sets a=5 - as long as it will not
    affect the outcome.
    --
    Fred Kleinschmidt
     
    , Feb 8, 2008
    #2

  3. ais523

    Eric Sosman Guest

    ais523 wrote:
    > I've been wondering more about Undefined Behaviour, and the way in
    > which (i=i++)-like examples can be 'corrected' so they mean something
    > defined. This was particularly inspired by a line in some computer-
    > generated code, whose essence was as follows:
    >
    > void func()
    > {
    > int a, b, *p;
    > a=b=0;
    > p=&b;
    > *p=1+((p=&a),2);
    > }
    >
    > Here, the variable actually being assigned to depends on the RHS of
    > the assignment; but the comma introduces a sequence point, so I think
    > this is a defined unambiguous assignment to a. (I'm not sure, though:
    > this is why I'm asking c.l.c.)


    Undefined. There's no sequence point between the
    assignment to p in the RHS and the use of p's value in
    the LHS. The compiler is not obliged to delay reading
    p on the LHS until after the RHS is evaluated.

    > Some more examples along similar lines
    > for statements for which I'm not clear about defined/undefined/
    > unspecified:
    >
    > c=(a++,b)+(b++,a);
    >
    > The question here is whether the implementation is forced to evaluate
    > the two parenthesised groups sequentially due to the sequence points
    > in them. I think that this line might be UB, because of the
    > possibility of incrementing both variables first and then adding the
    > new values of a and b.


    Undefined. There are sequence points at the comma
    operators, but no sequence point associated with the `+'.
    Hence, there is no sequence point between `a++' and `a',
    nor between `b' and `b++'.

    > a=(a++,a);
    >
    > My guess about this one is that it isn't UB because the comma forces
    > the increment to happen before the assignment, leaving the line
    > equivalent to ++a;.


    I think this one is all right -- but it's so nauseating
    I'd be delighted to be wrong ...

    --
    Eric Sosman
     
    Eric Sosman, Feb 8, 2008
    #3
  4. Eric Sosman wrote:
    >
    > ais523 wrote:
    > > I've been wondering more about Undefined Behaviour, and the way in
    > > which (i=i++)-like examples can be 'corrected' so they mean something
    > > defined. This was particularly inspired by a line in some computer-
    > > generated code, whose essence was as follows:

    [...]
    > > c=(a++,b)+(b++,a);
    > >
    > > The question here is whether the implementation is forced to evaluate
    > > the two parenthesised groups sequentially due to the sequence points
    > > in them. I think that this line might be UB, because of the
    > > possibility of incrementing both variables first and then adding the
    > > new values of a and b.

    >
    > Undefined. There are sequence points at the comma
    > operators, but no sequence point associated with the `+'.
    > Hence, there is no sequence point between `a++' and `a',
    > nor between `b' and `b++'.


    Actually, there are sequence points between them, AFAICS. But
    the order of execution is still unspecified. There is no
    guarantee in "a = foo() + bar()" that foo() would be called
    before bar(). I think this one falls under "unspecified" rather
    than "undefined".
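
    A minimal sketch of the foo()/bar() point, assuming two hypothetical
    functions that record when they run (call_order, the function bodies
    and sum_and_log are made up for illustration). Function calls are not
    interleaved with each other, so either order yields the same sum and
    only the log can differ:

```c
#include <string.h>

/* Hypothetical call log: each function appends one letter when it runs. */
char call_order[8];

int foo(void) { strcat(call_order, "f"); return 1; }
int bar(void) { strcat(call_order, "b"); return 2; }

/* Returns foo() + bar() and leaves the observed call order in call_order.
   The language does not say which call happens first. */
int sum_and_log(void)
{
    call_order[0] = '\0';
    return foo() + bar();
}
```

    After sum_and_log() returns 3, call_order may hold either "fb" or
    "bf"; portable code should not rely on one of them.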

    Of course, I could be wrong. :)

    On further reflection, the unspecified order makes it look to
    me like there is either a sequence point between "a++" and "a",
    but not between "b" and "b++", _or_ there is a sequence point
    between "b++" and "b", but not "a" and "a++". (Eww... Is that
    even possible?)

    In other words:

    a++
    sequence point
    b
    b++
    sequence point
    a
    +

    or

    b++
    sequence point
    a
    a++
    sequence point
    b
    +


    Given:

    (w,x) + (y,z)

    we are guaranteed that "w" will be evaluated before "x", and
    that "y" will be evaluated before "z". But, are we guaranteed
    that "w" and "x" will be evaluated separately from "y" and "z"?

    In other words, can the evaluation order be w, y, x, z, with a
    sequence point between "y" and "x"? I don't see why not, given
    that the result is "as if" they were done in w/x/y/z order, and
    the sequence points are respected.
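
    One way to probe this is to turn w, x, y and z into hypothetical
    logging functions (trace, trace_pos and comma_sum are illustrative
    names, not anything from the original code). The sketch checks only
    what the comma operator promises, namely w before x and y before z,
    and deliberately assumes nothing about how the two groups are
    ordered relative to each other:

```c
#include <string.h>

char trace[8];  /* hypothetical log of the order in which the calls ran */

int w(void) { strcat(trace, "w"); return 1; }
int x(void) { strcat(trace, "x"); return 2; }
int y(void) { strcat(trace, "y"); return 3; }
int z(void) { strcat(trace, "z"); return 4; }

/* Evaluates (w(), x()) + (y(), z()); the value is x() + z() in any order. */
int comma_sum(void)
{
    trace[0] = '\0';
    return (w(), x()) + (y(), z());
}

/* Position of call c in the trace, or -1 if it never ran. */
int trace_pos(char c)
{
    const char *p = strchr(trace, c);
    return p ? (int)(p - trace) : -1;
}
```

    A given compiler will typically show just one order in trace; that
    says what that compiler did, not what the language requires.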

    > > a=(a++,a);
    > >
    > > My guess about this one is that it isn't UB because the comma forces
    > > the increment to happen before the assignment, leaving the line
    > > equivalent to ++a;.

    >
    > I think this one is all right -- but it's so nauseating
    > I'd be delighted to be wrong ...


    I think the same holds true for all of the OP's examples. :)

    --
    +-------------------------+--------------------+-----------------------+
    | Kenneth J. Brody | www.hvcomputer.com | #include |
    | kenbrody/at\spamcop.net | www.fptech.com | <std_disclaimer.h> |
    +-------------------------+--------------------+-----------------------+
     
    Kenneth Brody, Feb 8, 2008
    #4
  5. ais523

    Eric Sosman Guest

    Kenneth Brody wrote:
    > Eric Sosman wrote:
    >> ais523 wrote:
    >>> I've been wondering more about Undefined Behaviour, and the way in
    >>> which (i=i++)-like examples can be 'corrected' so they mean something
    >>> defined. This was particularly inspired by a line in some computer-
    >>> generated code, whose essence was as follows:

    > [...]
    >>> c=(a++,b)+(b++,a);
    >>>
    >>> The question here is whether the implementation is forced to evaluate
    >>> the two parenthesised groups sequentially due to the sequence points
    >>> in them. I think that this line might be UB, because of the
    >>> possibility of incrementing both variables first and then adding the
    >>> new values of a and b.

    >> Undefined. There are sequence points at the comma
    >> operators, but no sequence point associated with the `+'.
    >> Hence, there is no sequence point between `a++' and `a',
    >> nor between `b' and `b++'.

    >
    > Actually, there are sequence points between them, AFAICS. But,
    > the order of execution is still unspecified. There is no
    > guarantee that in "a = foo() + bar()" that foo() would be called
    > before bar(). I think this one falls under "unspecified" rather
    > than "undefined".
    >
    > Of course, I could be wrong. :)
    >
    > On further reflection, the unspecified order makes it look to
    > me like there is either a sequence point between "a++" and "a",
    > but not between "b" and "b++", _or_ there is a sequence point
    > between "b++" and "b", but not "a" and "a++". (Eww... Is that
    > even possible?)
    >
    > In other words:
    >
    > a++
    > sequence point
    > b
    > b++
    > sequence point
    > a
    > +
    >
    > or
    >
    > b++
    > sequence point
    > a
    > a++
    > sequence point
    > b
    > +


    I think the sequence points impose only a partial
    ordering. Any arrangement of the sub-expressions `a++',
    `b', `b++', `a' is allowed, provided `a++' precedes `b'
    and `b++' precedes `a':

    a++ b++ a b
    a++ b++ b a
    a++ b b++ a
    b++ a a++ b
    b++ a++ a b
    b++ a++ b a

    ... and there may be further possibilities involving
    overlapped evaluation. The situation seems similar to
    that of

    f(g(x=1), h(x=2))

    ... where there are sequence points a-plenty, but none
    that separate the two assignments to `x'.
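
    The six orderings listed above can be reproduced mechanically. This
    is a sketch that models only strictly sequential evaluation (it says
    nothing about overlapped evaluation); count_orderings and the enum
    names are hypothetical helpers introduced for illustration:

```c
#include <stdio.h>

/* Indices for the four sub-expressions of c=(a++,b)+(b++,a). */
enum { A_INC, B_READ, B_INC, A_READ, N_OPS };

static const char *const op_name[N_OPS] = { "a++", "b", "b++", "a" };

/* Counts (and, if print is nonzero, prints) every sequential ordering in
   which a++ precedes b and b++ precedes a. */
int count_orderings(int print)
{
    int slot_of[N_OPS];   /* slot_of[op] = position (0..3) at which op runs */
    int count = 0;

    for (slot_of[0] = 0; slot_of[0] < N_OPS; slot_of[0]++)
    for (slot_of[1] = 0; slot_of[1] < N_OPS; slot_of[1]++)
    for (slot_of[2] = 0; slot_of[2] < N_OPS; slot_of[2]++)
    for (slot_of[3] = 0; slot_of[3] < N_OPS; slot_of[3]++) {
        int used = 0;     /* bitmask of occupied slots */
        for (int op = 0; op < N_OPS; op++)
            used |= 1 << slot_of[op];
        if (used != (1 << N_OPS) - 1)
            continue;     /* two operations share a slot: not a permutation */
        if (slot_of[A_INC] > slot_of[B_READ] || slot_of[B_INC] > slot_of[A_READ])
            continue;     /* violates one of the comma sequence points */
        count++;
        if (print) {
            for (int slot = 0; slot < N_OPS; slot++)
                for (int op = 0; op < N_OPS; op++)
                    if (slot_of[op] == slot)
                        printf("%s ", op_name[op]);
            putchar('\n');
        }
    }
    return count;
}
```

    Of the 24 permutations, the two constraints leave 6. In several of
    them `a++' and `a' (or `b++' and `b') are adjacent with no sequence
    point in between, which is the problem described above.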

    --
    Eric Sosman
     
    Eric Sosman, Feb 8, 2008
    #5
  6. ais523

    Kaz Kylheku Guest

    On Feb 8, 11:09 am, ais523 wrote:
    > I've been wondering more about Undefined Behaviour, and the way in
    > which (i=i++)-like examples can be 'corrected' so they mean something
    > defined. This was particularly inspired by a line in some computer-
    > generated code, whose essence was as follows:
    >
    > void func()
    > {
    > int a, b, *p;
    > a=b=0;
    > p=&b;
    > *p=1+((p=&a),2);


    This is well-defined behavior because of the sequencing. The
    assignment to *p cannot take place until the right hand side is
    evaluated, and that evaluation is divided into two phases: before the
    comma and after.

    Without that comma, it would be undefined, because then p, in the same
    expression where it is being modified, would be accessed for a purpose
    other than determining the new value to be stored back into p.

    > Here, the variable actually being assigned to depends on the RHS of
    > the assignment;


    Right. The answer to the question /which/ variable is assigned to
    depends on the modification of p in the right hand side.

    > but the comma introduces a sequence point, so I think
    > this is a defined unambiguous assignment to a. (I'm not sure, though:
    > this is why I'm asking c.l.c.) Some more examples along similar lines
    > for statements for which I'm not clear about defined/undefined/
    > unspecified:
    >
    > c=(a++,b)+(b++,a);


    The problem here is that although within the two constituent clauses
    of the + operator, there is sequencing going on, the two clauses
    themselves are not sequenced relative to each other.

    That is to say, given (x,y)+(z,w) there is a sequence point between
    x and y, and between z and w, so these pairs are ordered. But it
    cannot be deduced that there is a sequence point between x and z,
    between x and w, between y and z and between y and w. You know nothing
    about their relative ordering.

    > The question here is whether the implementation is forced to evaluate
    > the two parenthesised groups sequentially due to the sequence points
    > in them.


    Nope. The two subexpressions of the + could be sent to different
    processor pipelines to be done concurrently.

    Even if the two subexpressions are sequenced with respect to each
    other, and fully evaluated, you don't know in which order: left then
    right, or right then left? No evaluation order is specified for the +
    operator.

    > a=(a++,a);
    > My guess about this one is that it isn't UB because the comma forces
    > the increment to happen before the assignment, leaving the line
    > equivalent to ++a;


    That's right.
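
    A small sketch of that equivalence, assuming the reading given in
    this thread is correct (comma_increment and plain_increment are
    hypothetical names used only for the comparison):

```c
/* Applies the third example, a=(a++,a), to a copy of start. */
int comma_increment(int start)
{
    int a = start;
    a = (a++, a);   /* the comma's sequence point completes a++ before a is read */
    return a;
}

/* The plain increment that the line is said to be equivalent to. */
int plain_increment(int start)
{
    int a = start;
    ++a;
    return a;
}
```

    Under that reading both functions return start + 1. The second is
    still the one to write by hand.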
     
    Kaz Kylheku, Feb 9, 2008
    #6
  7. ais523

    Kaz Kylheku Guest

    On Feb 8, 11:24 am, wrote:
    > Sequence points and the order of execution
    > are not the same thing.
    >
    > a=5;
    > b=6;
    >
    > For the above, there are definitely sequence points
    > for each statement. However, after optimization the compiler
    > may set b=6 before it sets a=5 - as long as it will not
    > affect the outcome.


    I think what you're trying to say, rather, is that the actual semantics
    (where optimization takes place) are not the same as the abstract
    semantics.
     
    Kaz Kylheku, Feb 9, 2008
    #7
  8. ais523

    Eric Sosman Guest

    Kaz Kylheku wrote:
    > On Feb 8, 11:09 am, ais523 wrote:
    >> I've been wondering more about Undefined Behaviour, and the way in
    >> which (i=i++)-like examples can be 'corrected' so they mean something
    >> defined. This was particularly inspired by a line in some computer-
    >> generated code, whose essence was as follows:
    >>
    >> void func()
    >> {
    >> int a, b, *p;
    >> a=b=0;
    >> p=&b;
    >> *p=1+((p=&a),2);

    >
    > This is well-defined behavior because of the sequencing. The
    > assignment to *p cannot take place until the right hand side is
    > evaluated, and that evaluation is divided into two phases: before the
    > comma and after.


    Nothing can be stored at *p until after the RHS is
    evaluated, but cannot the LHS' p be evaluated earlier?

    load R0,p ; get LHS' p
    load R1,1
    load R2,&a
    store R2,p ; p = &a on RHS
    load R2,2
    add R1,R2
    store R1,*R0 ; *p (stale) = RHS
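
    The same two readings can be written out as well-defined C, with the
    order of the pointer read made explicit. This is only a sketch of two
    plausible outcomes (if the original line is undefined, nothing limits
    a compiler to these two); the function names and the target variable
    are hypothetical:

```c
/* Reading 1: the LHS pointer is fetched before the RHS updates p, as in the
   pseudo-assembly above, so the store lands in *b. */
void store_via_stale_p(int *a, int *b)
{
    int *p = b;
    int *target = p;      /* LHS: read p early */
    p = a;                /* RHS: p = &a */
    *target = 1 + 2;      /* store through the stale pointer */
    (void)p;              /* the updated p is never used for the store */
}

/* Reading 2: the RHS updates p first and the LHS reads it afterwards,
   so the store lands in *a. */
void store_via_fresh_p(int *a, int *b)
{
    int *p = b;
    p = a;                /* RHS: p = &a */
    *p = 1 + 2;           /* LHS read happens after the update */
}
```

    Writing the intended order out as separate statements like this
    removes the question entirely.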

    --
    Eric Sosman
     
    Eric Sosman, Feb 9, 2008
    #8
  9. On Feb 9, 5:37 am, Kaz Kylheku wrote:
    > On Feb 8, 11:09 am, ais523 wrote:
    >
    > > I've been wondering more about Undefined Behaviour, and the way in
    > > which (i=i++)-like examples can be 'corrected' so they mean something
    > > defined. This was particularly inspired by a line in some computer-
    > > generated code, whose essence was as follows:

    >
    > > void func()
    > > {
    > > int a, b, *p;
    > > a=b=0;
    > > p=&b;
    > > *p=1+((p=&a),2);

    >
    > This is well-defined behavior because of the sequencing. The
    > assignment to *p cannot take place until the right hand side is
    > evaluated, and that evaluation is divided into two phases: before the
    > comma and after.


    The problem is not the assignment to *p, the problem is the assignment
    to p. p is used on the left side to get the address for the store to
    *p, and on the right side it is changed to &a, without intervening
    sequence point. So this is undefined behaviour.
     
    christian.bau, Feb 9, 2008
    #9
  10. ais523

    Kaz Kylheku Guest

    On Feb 9, 5:48 am, Eric Sosman wrote:
    > Kaz Kylheku wrote:
    > > On Feb 8, 11:09 am, ais523 wrote:
    > >> void func()
    > >> {
    > >> int a, b, *p;
    > >> a=b=0;
    > >> p=&b;
    > >> *p=1+((p=&a),2);

    >
    > > This is well-defined behavior because of the sequencing. The
    > > assignment to *p cannot take place until the right hand side is
    > > evaluated, and that evaluation is divided into two phases: before the
    > > comma and after.

    >
    >      Nothing can be stored at *p until after the RHS is
    > evaluated, but cannot the LHS' p be evaluated earlier?


    Ah shit, you're right of course. No, this is undefined. The value p
    can of course be used to calculate the lvalue at any time.

    Given

    L = (A, B)

    the timing of the calculation of lvalue L is not sequenced with
    respect to A or B. Only the storage into that L value (which cannot
    take place until B is evaluated).

    We can illustrate it also like this:

    *(l()) = (f(), g())

    The function f must be called before g. But the call to l can be
    interleaved arbitrarily, so any of these three orders is possible:

    l(); f(); g();
    f(); l(); g();
    f(); g(); l();

    In all three cases, the store cannot take place until g and l are
    called since it depends on both of them.
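
    A runnable sketch of the l/f/g illustration, assuming hypothetical
    bodies that log each call (seq, cell and run_once are made-up names).
    Because these are function calls rather than bare reads and writes of
    p, the behaviour here is merely unspecified, not undefined:

```c
#include <string.h>

char seq[8];   /* hypothetical log of the call order */
int  cell;     /* the object that l() designates */

int *l(void) { strcat(seq, "l"); return &cell; }
int  f(void) { strcat(seq, "f"); return 10; }
int  g(void) { strcat(seq, "g"); return 20; }

/* Runs *(l()) = (f(), g()) once and returns the value that was stored. */
int run_once(void)
{
    seq[0] = '\0';
    cell = 0;
    *(l()) = (f(), g());
    return cell;
}
```

    run_once() always stores g's value, 20; seq may read "lfg", "flg" or
    "fgl" depending on the compiler.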
     
    Kaz Kylheku, Feb 9, 2008
    #10
