Question regarding access to struct members

Discussion in 'C Programming' started by Rui Maciel, May 18, 2013.

  1. Rui Maciel

    Rui Maciel Guest

    Consider the following code:

    <code>
    struct Foo
    {
        int a;
        int b;
    };

    struct Bar
    {
        int a;
        int b;
        float c;
    };

    void test(void)
    {
        struct Foo *foo;
        struct Bar bar;

        foo = &bar;
    }
    </code>


    According to the standard, is there any guarantee that foo->b == bar.b?


    Thanks in advance,
    Rui Maciel
     
    Rui Maciel, May 18, 2013
    #1

  2. Eric Sosman

    Eric Sosman Guest

    On 5/18/2013 4:07 PM, Rui Maciel wrote:
    > Consider the following code:
    >
    > <code>
    > struct Foo
    > {
    > int a;
    > int b;
    > };
    >
    > struct Bar
    > {
    > int a;
    > int b;
    > float c;
    > };
    >
    > void test(void)
    > {
    > struct Foo *foo;
    > struct Bar bar;
    >
    > foo = &bar;


    Aside: This assignment must elicit a diagnostic message.
    Add a cast.

    > }
    > </code>
    >
    >
    > According to the standard, is there any guarantee that foo->b == bar.b?


    No. The "special guarantee" of 6.5.2.3p4 applies only to
    structs with a "common initial sequence" that inhabit a union,
    not to apparently identical free-standing structs.

    Even if `offsetof(struct Foo, b) == offsetof(struct Bar, b)',
    which seems highly probable, I think the absence of a union leaves
    you vulnerable to aliasing problems. A sufficiently aggressive
    optimizer might think `foo->b = 42;' leaves `bar.b' untouched
    (because `foo' points to things that are not of `bar's type),
    and thus might keep using a stale `bar.b' value. If you were
    to write

    union { struct Foo foo; struct Bar bar; } carbide;
    struct Foo *pfoo = &carbide.foo;
    carbide.bar.b = 56;
    pfoo->b = 42;

    ... I think you'd be safe: The presence of the union "warns" the
    compiler that assignments to the `foo' or `bar' element may
    affect the other element's value.
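
    A minimal sketch of the union approach described above, assuming a
    C11 compiler (for `_Static_assert`). The names `union FooBar` and
    `read_b_through_foo` are made up for illustration. The assertion only
    documents the expected layout; it is the union, not the matching
    offsets, that makes the cross-member read acceptable.

```c
#include <assert.h>
#include <stddef.h>

struct Foo { int a; int b; };
struct Bar { int a; int b; float c; };

/* Hypothetical union type; it must be visible wherever the pun happens. */
union FooBar {
    struct Foo foo;
    struct Bar bar;
};

/* Documents the layout expectation; this alone does not make a pun valid. */
_Static_assert(offsetof(struct Foo, b) == offsetof(struct Bar, b),
               "members of the common initial sequence should share offsets");

/* Stores a struct Bar in the union, then inspects b through struct Foo. */
int read_b_through_foo(int value)
{
    union FooBar carbide;

    carbide.bar.a = 0;
    carbide.bar.b = value;
    carbide.bar.c = 0.0f;

    return carbide.foo.b;   /* inspects the common initial part */
}
```

    Calling `read_b_through_foo(56)` should hand back the value that was
    stored through the `bar` member.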

    --
    Eric Sosman
     
    Eric Sosman, May 18, 2013
    #2

  3. Rui Maciel

    Rui Maciel Guest

    Eric Sosman wrote:

    > No. The "special guarantee" of 6.5.2.3p4 applies only to
    > structs with a "common initial sequence" that inhabit a union,
    > not to apparently identical free-standing structs.
    >
    > Even if `offsetof(struct Foo, b) == offsetof(struct Bar, b)',
    > which seems highly probable, I think the absence of a union leaves
    > you vulnerable to aliasing problems. A sufficiently aggressive
    > optimizer might think `foo->b = 42;' leaves `bar.b' untouched
    > (because `foo' points to things that are not of `bar's type),
    > and thus might keep using a stale `bar.b' value. If you were
    > to write
    >
    > union { struct Foo foo; struct Bar bar; } carbide;
    > struct Foo *pfoo = &carbide.foo;
    > carbide.bar.b = 56;
    > pfoo->b = 42;
    >
    > ... I think you'd be safe: The presence of the union "warns" the
    > compiler that assignments to the `foo' or `bar' element may
    > affect the other element's value.



    It appears you're right. Nevertheless, won't this mean that, indirectly,
    the standard guarantees that, when different structs have a common initial
    sequence, the members that represent a common initial sequence can be
    accessed no matter which type the object is cast to?

    For example, consider the following:


    <code>
    #include <stdlib.h>
    #include <stdio.h>

    struct Foo
    {
        int a;
        int b;
    };

    struct Bar
    {
        int a;
        int b;
        float c;
    };

    void processFoo(struct Foo *foo)
    {
        printf("foo.a: %d\tfoo.b: %d\n", foo->a, foo->b);
    }

    void processBar(struct Bar *bar)
    {
        printf("bar.a: %d\tbar.b: %d\n", bar->a, bar->b);
    }

    int main(void)
    {
        union
        {
            struct Foo foo;
            struct Bar bar;
        } baz;
        struct Foo foo2;
        struct Bar bar2;

        baz.foo.a = 1;
        baz.foo.b = 2;

        foo2.a = 1;
        foo2.b = 2;

        bar2.a = 1;
        bar2.b = 2;

        processFoo(&baz.foo);
        processBar(&baz.bar);

        processFoo(&foo2);
        processFoo(&bar2);

        return EXIT_SUCCESS;
    }
    </code>


    If the layout of baz.foo and foo2, as well as baz.bar and bar2, is
    guaranteed to be the same, and both foo2 and bar2 are stand-alone objects
    which weren't defined in a union type, doesn't this guarantee the access to
    the "common initial sequence" whether an object is cast to struct Foo or
    struct Bar?


    Rui Maciel
     
    Rui Maciel, May 19, 2013
    #3
  4. Eric Sosman

    Eric Sosman Guest

    On 5/19/2013 5:54 AM, Rui Maciel wrote:
    > Eric Sosman wrote:
    >
    >> No. The "special guarantee" of 6.5.2.3p4 applies only to
    >> structs with a "common initial sequence" that inhabit a union,
    >> not to apparently identical free-standing structs.
    >>
    >> Even if `offsetof(struct Foo, b) == offsetof(struct Bar, b)',
    >> which seems highly probable, I think the absence of a union leaves
    >> you vulnerable to aliasing problems. A sufficiently aggressive
    >> optimizer might think `foo->b = 42;' leaves `bar.b' untouched
    >> (because `foo' points to things that are not of `bar's type),
    >> and thus might keep using a stale `bar.b' value. If you were
    >> to write
    >>
    >> union { struct Foo foo; struct Bar bar; } carbide;
    >> struct Foo *pfoo = &carbide.foo;
    >> carbide.bar.b = 56;
    >> pfoo->b = 42;
    >>
    >> ... I think you'd be safe: The presence of the union "warns" the
    >> compiler that assignments to the `foo' or `bar' element may
    >> affect the other element's value.

    >
    >
    > It appears you're right. Nevertheless, won't this mean that, indirectly,
    > the standard guarantees that, when different structs have a common initial
    > sequence, the members that represent a common initial sequence can be
    > accessed no matter which type the object is casted to?
    >
    > For example, consider the following:
    >
    >
    > <code>
    > #include <stdlib.h>
    > #include <stdio.h>
    >
    > struct Foo
    > {
    > int a;
    > int b;
    > };
    >
    > struct Bar
    > {
    > int a;
    > int b;
    > float c;
    > };
    >
    > void processFoo(struct Foo *foo)
    > {
    > printf("foo.a: %d\tfoo.b: %d\n",foo->a, foo->b);
    > }
    >
    > void processBar(struct Bar *bar)
    > {
    > printf("bar.a: %d\tbar.b: %d\n",bar->a, bar->b);
    > }
    >
    > int main(void)
    > {
    > union
    > {
    > struct Foo foo;
    > struct Bar bar;
    > } baz;
    > struct Foo foo2;
    > struct Bar bar2;
    >
    > baz.foo.a = 1;
    > baz.foo.b = 2;
    >
    > foo2.a = 1;
    > foo2.b = 2;
    >
    > bar2.a = 1;
    > bar2.b = 2;
    >
    >
    > processFoo(&baz.foo);
    > processBar(&baz.bar);
    >
    > processFoo(&foo2);
    > processFoo(&bar2);


    Diagnostic required here, as in a similar case from your earlier
    post. Do these diagnostics make no impression on you, not even so
    much as to raise a teeny-tiny doubt about the validity of what
    you're trying to do?

    > return EXIT_SUCCESS;
    > }
    > </code>
    >
    >
    > If the layout of baz.foo and foo2, as well as baz.bar and bar2, is
    > guaranteed to be the same, and both foo2 and bar2 are stand-alone objects
    > which weren't defined in a union type, doesn't this guarantee the access to
    > the "common initial sequence" whether an object is casted to struct Foo or
    > struct Bar?


    Let's start by dismissing the layout issue (I think we can do
    this). Argument: Suppose struct Foo and struct Bar are declared
    identically in modules x.c and y.c, but only in x.c do they appear
    in a union. Since corresponding structs in x and y are compatible
    (6.2.7p1) they must be arranged identically: An x function can
    pass a pointer to an instance of x's struct Foo to a y function,
    where the representation must be the same as in y. Therefore the
    union membership in x cannot influence the compiler's choice of
    how to lay out a struct Foo, because it must end up with the same
    layout as is used in union-free y. Layout is not the issue.

    I'm no code-generation and optimization expert, but I think the
    "special guarantee" is about aliasing, not about representation. In
    the absence of a union containing both struct Foo and struct Bar,
    the compiler can assume that the elements in instances of the two
    are distinct: The bytes in a certain memory area represent the value
    of a struct Foo *or* of a struct Bar, not both. (This is just like
    other types: Some batch of bytes belongs to an int *or* to a double,
    and unless there's a union in the picture they cannot belong to both.)
    If you use type-punning to access the bytes via a "foreign" type, the
    compiler is not obliged to notice or respect the pun (6.5p7; some
    specific puns are permitted, but not all).
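
    To make the aliasing point concrete, here is a small sketch. The
    function names `store_both` and `demo_distinct_objects` are
    hypothetical. With no union in the picture, a compiler may assume
    that `f` and `b` never refer to overlapping storage, so it is
    entitled to return 56 without rereading `b->b`. Called with two
    distinct objects, as in the demo, the behavior is well defined and
    that assumption is harmless.

```c
#include <assert.h>

struct Foo { int a; int b; };
struct Bar { int a; int b; float c; };

/* The optimizer may treat the store through f as unable to change *b. */
int store_both(struct Foo *f, struct Bar *b)
{
    b->b = 56;
    f->b = 42;
    return b->b;    /* may be folded to the constant 56 */
}

/* Well-defined use: the two pointers refer to separate objects. */
int demo_distinct_objects(void)
{
    struct Foo foo = { 1, 2 };
    struct Bar bar = { 1, 2, 0.0f };

    return store_both(&foo, &bar);
}
```

    Passing a cast `(struct Foo *)&bar` as the first argument is the
    case under discussion: the function could then legitimately return
    either value, depending on what the optimizer assumed.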

    The code in your post would be, I think, entirely well-behaved
    and well-defined if the third processFoo() call were removed. With
    the third call in place, it runs afoul of 6.5.2.2p2, violating a
    "shall" in a Constraints clause. If you were to add a cast you'd
    avoid the 6.5.2.2p2 issue, but 6.5p7 still operates.

    What you're doing is "likely to work" in simple cases and with
    compilers that don't optimize aggressively. But remember: Memory
    is s-l-o-w compared to CPUs, so compiler writers have a large and
    growing incentive to find clever ways to avoid accesses. (I write,
    by the way, from experience: More than fifteen years ago a compiler
    of my acquaintance optimized a pun not unlike yours, producing code
    that caught intermittent and hard-to-reproduce SIGSEGVs; it took
    three engineers a week and a half to track down the trouble.)

    --
    Eric Sosman
     
    Eric Sosman, May 19, 2013
    #4
  5. Seebs

    Seebs Guest

    On 2013-05-19, Rui Maciel wrote:
    > It appears you're right. Nevertheless, won't this mean that, indirectly,
    > the standard guarantees that, when different structs have a common initial
    > sequence, the members that represent a common initial sequence can be
    > accessed no matter which type the object is casted to?


    I thought that for quite a while, but someone pointed out a thing I had
    not considered adequately:

    Since the behavior is undefined, the compiler is allowed to act with
    absolute certainty that this never happens. So, for instance, it can
    optimize things away like mad.

    So, say:

    #include <stdio.h>

    struct x { int x; };
    struct y { int y; };

    void foo(struct y *bptr) {
        bptr->y = 2;
    }

    int main(void) {
        struct x a = { 1 };
        foo((void *) &a);
        printf("%d\n", a.x);
        return 0;
    }

    So far as I can tell, the compiler is welcome to print 1, because it
    can be quite certain that foo() can't have modified an object of type
    "struct x".

    Not sure whether it actually happens much, but I believe this is a real
    issue.

    Basically, the common initial sequence rule isn't important just because it
    implies that the common sequences must have the same layout; it's important
    because it implies that the compiler has to be aware of the possibility
    that a modification to one member of a union might affect the other in a
    predictable way which is required to work.

    -s
    --
    Copyright 2013, all wrongs reversed. Peter Seebach /
    http://www.seebs.net/log/ <-- lawsuits, religion, and funny pictures
    Autism Speaks does not speak for me. http://autisticadvocacy.org/
    I am not speaking for my employer, although they do rent some of my opinions.
     
    Seebs, May 23, 2013
    #5
  6. Tim Rentsch

    Tim Rentsch Guest

    Rui Maciel writes:

    > Eric Sosman wrote:
    >
    >> No. The "special guarantee" of 6.5.2.3p4 applies only to
    >> structs with a "common initial sequence" that inhabit a union,
    >> not to apparently identical free-standing structs.
    >>
    >> Even if `offsetof(struct Foo, b) == offsetof(struct Bar, b)',
    >> which seems highly probable, I think the absence of a union leaves
    >> you vulnerable to aliasing problems. A sufficiently aggressive
    >> optimizer might think `foo->b = 42;' leaves `bar.b' untouched
    >> (because `foo' points to things that are not of `bar's type),
    >> and thus might keep using a stale `bar.b' value. If you were
    >> to write
    >>
    >> union { struct Foo foo; struct Bar bar; } carbide;
    >> struct Foo *pfoo = &carbide.foo;
    >> carbide.bar.b = 56;
    >> pfoo->b = 42;
    >>
    >> ... I think you'd be safe: The presence of the union "warns" the
    >> compiler that assignments to the `foo' or `bar' element may
    >> affect the other element's value.

    >
    > It appears you're right. Nevertheless, won't this mean that,
    > indirectly, the standard guarantees that, when different
    > structs have a common initial sequence, the members that
    > represent a common initial sequence can be accessed no matter
    > which type the object is casted to? [snip elaboration]


    No. To access (a member of) one kind of struct object
    using (a . or -> selector for a member of) another kind
    of struct type, four conditions must be met:

      1. The members in question must be corresponding
         members in the common initial sequence between
         the two struct types;
      2. There must be a union type containing both of
         the struct types in question;
      3. There must be an actual union object holding
         a struct value for a struct type suitable for
         this inter-struct-type access; and
      4. The completed union type must be visible at
         the point of inter-struct-type access.

    If any of these four conditions does not hold, the
    behavior in such cases is undefined.

    (Note: the term "another kind of struct type" is understood
    not to include the case of compatible types. If the two
    struct types are compatible, they are the same type as
    far as member access is concerned.)
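
    As a rough illustration of the four conditions, consider the sketch
    below. The names `union FooBar`, `sum_common` and `demo_sum` are
    invented for this example; the comments mark where each condition is
    meant to be satisfied.

```c
#include <assert.h>

struct Foo { int a; int b; };
struct Bar { int a; int b; float c; };

/* Condition 2: a union type containing both struct types. Declaring it
   ahead of the accessing function keeps the completed type visible at
   the point of access (condition 4). */
union FooBar {
    struct Foo foo;
    struct Bar bar;
};

/* Condition 1: a and b are corresponding members of the common initial
   sequence, so they are read here through struct Foo even when the
   union currently holds a struct Bar. */
int sum_common(const union FooBar *u)
{
    return u->foo.a + u->foo.b;
}

int demo_sum(void)
{
    union FooBar u;     /* condition 3: an actual union object */

    u.bar.a = 1;
    u.bar.b = 2;
    u.bar.c = 3.0f;

    return sum_common(&u);
}
```

    Dropping the union object and passing a cast pointer to a
    free-standing struct Bar would break conditions 2 through 4, which is
    the situation in the original question.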
     
    Tim Rentsch, Jun 12, 2013
    #6
  7. Tim Rentsch

    Tim Rentsch Guest

    Eric Sosman writes:

    > On 5/19/2013 5:54 AM, Rui Maciel wrote:
    > [example snipped]
    >
    > I'm no code-generation and optimization expert, but I think the
    > "special guarantee" is about aliasing, not about representation. In
    > the absence of a union containing both struct Foo and struct Bar,
    > the compiler can assume that the elements in instances of the two
    > are distinct: The bytes in a certain memory area represent the value
    > of a struct Foo *or* of a struct Bar, not both. (This is just like
    > other types: Some batch of bytes belongs to an int *or* to a double,
    > and unless there's a union in the picture they cannot belong to
    > both.) If you use type-punning to access the bytes via a "foreign"
    > type, the compiler is not obliged to notice or respect the pun
    > (6.5p7; some specific puns are permitted, but not all).
    >
    > The code in your post would be, I think, entirely well-behaved
    > and well-defined if the third processFoo() call were removed. With
    > the third call in place, it runs afoul of 6.5.2.2p2, violating a
    > "shall" in a Constraints clause. If you were to add a cast you'd
    > avoid the 6.5.2.2p2 issue, but 6.5p7 still operates. [comments on
    > practical considerations snipped]


    I agree with the conclusions but not all of the reasoning.
    Assuming the argument/parameter type mismatch has been fixed
    (eg, by adding a cast, as suggested), AFAICS the requirements
    of 6.5p7 are not violated. The accessing expression has type
    int, and the object being accessed has type int. There is
    a sub-expression that has a (wrong) struct type, but that
    sub-expression doesn't do any accessing; it's only the
    larger expression, ie, including the part after the member
    selection operator, that does any accessing, and the type
    of that expression is consistent with the effective type
    of the object being accessed.

    I think it's right that the concerns here are about aliasing,
    not representation. However I think the undefinedness occurs
    not as a result of 6.5p7 but just from the description of the
    member selection operators. These operators (. and ->) select
    a member of a struct (or union) object, of the kind of the
    left operand. If there is no such object, then there is no
    way to select one of its members. Hence the behavior is
    undefined, by virtue of having no definition.
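
    One way to picture the distinction drawn above, offered only as a
    sketch of that reading: when the member is selected from the object's
    real struct type and then accessed through a plain `int *`, no member
    selection ever names the wrong struct type. The names `set_b` and
    `demo_int_access` are hypothetical.

```c
#include <assert.h>

struct Bar { int a; int b; float c; };

/* Writes through an int lvalue; the object accessed has type int. */
void set_b(int *pb, int value)
{
    *pb = value;
}

int demo_int_access(void)
{
    struct Bar bar = { 1, 2, 0.0f };

    /* &bar.b selects a member of a genuine struct Bar object. */
    set_b(&bar.b, 42);

    return bar.b;
}
```

    Here `set_b` neither knows nor cares which struct the int lives in,
    so the question of selecting a member of a nonexistent struct Foo
    object never arises.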

    Obviously I would agree that how the Standard describes this
    could be improved, and probably should be. But I don't think
    there is any serious doubt about what was intended.
     
    Tim Rentsch, Jun 12, 2013
    #7
