Question regarding access to struct members


Rui Maciel

Consider the following code:

<code>
struct Foo
{
    int a;
    int b;
};

struct Bar
{
    int a;
    int b;
    float c;
};

void test(void)
{
    struct Foo *foo;
    struct Bar bar;

    foo = &bar;
}
</code>


According to the standard, is there any guarantee that foo->b == bar.b?


Thanks in advance,
Rui Maciel
 
Eric Sosman

Consider the following code:

<code>
struct Foo
{
int a;
int b;
};

struct Bar
{
int a;
int b;
float c;
};

void test(void)
{
struct Foo *foo;
struct Bar bar;

foo = &bar;

/* Aside: This assignment must elicit a diagnostic message.
   Add a cast. */
}
</code>


According to the standard, is there any guarantee that foo->b == bar.b?

No. The "special guarantee" of 6.5.2.3p4 applies only to
structs with a "common initial sequence" that inhabit a union,
not to apparently identical free-standing structs.

Even if `offsetof(struct Foo, b) == offsetof(struct Bar, b)',
which seems highly probable, I think the absence of a union leaves
you vulnerable to aliasing problems. A sufficiently aggressive
optimizer might think `foo->b = 42;' leaves `bar.b' untouched
(because `foo' points to things that are not of `bar's type),
and thus might keep using a stale `bar.b' value. If you were
to write

union { struct Foo foo; struct Bar bar; } carbide;
struct Foo *pfoo = &carbide.foo;
carbide.bar.b = 56;
pfoo->b = 42;

... I think you'd be safe: The presence of the union "warns" the
compiler that assignments to the `foo' or `bar' element may
affect the other element's value.
 
Rui Maciel

Eric said:
No. The "special guarantee" of 6.5.2.3p4 applies only to
structs with a "common initial sequence" that inhabit a union,
not to apparently identical free-standing structs.

Even if `offsetof(struct Foo, b) == offsetof(struct Bar, b)',
which seems highly probable, I think the absence of a union leaves
you vulnerable to aliasing problems. A sufficiently aggressive
optimizer might think `foo->b = 42;' leaves `bar.b' untouched
(because `foo' points to things that are not of `bar's type),
and thus might keep using a stale `bar.b' value. If you were
to write

union { struct Foo foo; struct Bar bar; } carbide;
struct Foo *pfoo = &carbide.foo;
carbide.bar.b = 56;
pfoo->b = 42;

... I think you'd be safe: The presence of the union "warns" the
compiler that assignments to the `foo' or `bar' element may
affect the other element's value.


It appears you're right. Nevertheless, won't this mean that, indirectly,
the standard guarantees that, when different structs have a common initial
sequence, the members of that common initial sequence can be accessed
no matter which type the object is cast to?

For example, consider the following:


<code>
#include <stdlib.h>
#include <stdio.h>

struct Foo
{
    int a;
    int b;
};

struct Bar
{
    int a;
    int b;
    float c;
};

void processFoo(struct Foo *foo)
{
    printf("foo.a: %d\tfoo.b: %d\n", foo->a, foo->b);
}

void processBar(struct Bar *bar)
{
    printf("bar.a: %d\tbar.b: %d\n", bar->a, bar->b);
}

int main(void)
{
    union
    {
        struct Foo foo;
        struct Bar bar;
    } baz;
    struct Foo foo2;
    struct Bar bar2;

    baz.foo.a = 1;
    baz.foo.b = 2;

    foo2.a = 1;
    foo2.b = 2;

    bar2.a = 1;
    bar2.b = 2;

    processFoo(&baz.foo);
    processBar(&baz.bar);

    processFoo(&foo2);
    processFoo(&bar2);

    return EXIT_SUCCESS;
}
</code>


If the layout of baz.foo and foo2, as well as baz.bar and bar2, is
guaranteed to be the same, and both foo2 and bar2 are stand-alone objects
which weren't defined in a union type, doesn't this guarantee access to
the "common initial sequence" whether an object is cast to struct Foo or
struct Bar?


Rui Maciel
 
Eric Sosman

It appears you're right. Nevertheless, won't this mean that, indirectly,
the standard guarantees that, when different structs have a common initial
sequence, the members that represent a common initial sequence can be
accessed no matter which type the object is casted to?

For example, consider the following:


<code>
#include <stdlib.h>
#include <stdio.h>

struct Foo
{
int a;
int b;
};

struct Bar
{
int a;
int b;
float c;
};

void processFoo(struct Foo *foo)
{
printf("foo.a: %d\tfoo.b: %d\n",foo->a, foo->b);
}

void processBar(struct Bar *bar)
{
printf("bar.a: %d\tbar.b: %d\n",bar->a, bar->b);
}

int main(void)
{
union
{
struct Foo foo;
struct Bar bar;
} baz;
struct Foo foo2;
struct Bar bar2;

baz.foo.a = 1;
baz.foo.b = 2;

foo2.a = 1;
foo2.b = 2;

bar2.a = 1;
bar2.b = 2;


processFoo(&baz.foo);
processBar(&baz.bar);

processFoo(&foo2);
processFoo(&bar2);

/* Diagnostic required here, as in a similar case from your earlier
   post. Do these diagnostics make no impression on you, not even so
   much as to raise a teeny-tiny doubt about the validity of what
   you're trying to do? */
return EXIT_SUCCESS;
}
</code>


If the layout of baz.foo and foo2, as well as baz.bar and bar2, is
guaranteed to be the same, and both foo2 and bar2 are stand-alone objects
which weren't defined in a union type, doesn't this guarantee the access to
the "common initial sequence" whether an object is casted to struct Foo or
struct Bar?

Let's start by dismissing the layout issue (I think we can do
this). Argument: Suppose struct Foo and struct Bar are declared
identically in modules x.c and y.c, but only in x.c do they appear
in a union. Since corresponding structs in x and y are compatible
(6.2.7p1) they must be arranged identically: An x function can
pass a pointer to an instance of x's struct Foo to a y function,
where the representation must be the same as in y. Therefore the
union membership in x cannot influence the compiler's choice of
how to lay out a struct Foo, because it must end up with the same
layout as is used in union-free y. Layout is not the issue.
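
The layout half of the argument can be spot-checked on a given implementation. The following is a minimal sketch, assuming the struct definitions from the original post; the helper names `common_offsets_match` and `check_common_layout` are hypothetical. It checks offsets only and says nothing about whether access through the "wrong" struct type is defined.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */

struct Foo { int a; int b; };
struct Bar { int a; int b; float c; };

/* Hypothetical helper: returns 1 when the members of the common
   initial sequence sit at the same offsets in both struct types. */
int common_offsets_match(void)
{
    return offsetof(struct Foo, a) == offsetof(struct Bar, a)
        && offsetof(struct Foo, b) == offsetof(struct Bar, b);
}

/* Aborts (unless NDEBUG is defined) if the offsets ever differ. */
void check_common_layout(void)
{
    assert(common_offsets_match());
}
```

Matching offsets settle the representation question only; the aliasing question is separate.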

I'm no code-generation and optimization expert, but I think the
"special guarantee" is about aliasing, not about representation. In
the absence of a union containing both struct Foo and struct Bar,
the compiler can assume that the elements in instances of the two
are distinct: The bytes in a certain memory area represent the value
of a struct Foo *or* of a struct Bar, not both. (This is just like
other types: Some batch of bytes belongs to an int *or* to a double,
and unless there's a union in the picture they cannot belong to both.)
If you use type-punning to access the bytes via a "foreign" type, the
compiler is not obliged to notice or respect the pun (6.5p7; some
specific puns are permitted, but not all).

The code in your post would be, I think, entirely well-behaved
and well-defined if the third processFoo() call were removed. With
the third call in place, it runs afoul of 6.5.2.2p2, violating a
"shall" in a Constraints clause. If you were to add a cast you'd
avoid the 6.5.2.2p2 issue, but 6.5p7 still operates.
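
One way to stay clear of both clauses is to avoid the pun altogether and copy the common members into an object of the right type. A minimal sketch, assuming the struct definitions from the post; `foo_from_bar` is a hypothetical name, not something proposed in the thread:

```c
#include <assert.h>
#include <stddef.h>  /* NULL */

struct Foo { int a; int b; };
struct Bar { int a; int b; float c; };

/* Hypothetical helper: builds a genuine struct Foo from the common
   members of a struct Bar. Every access goes through the object's
   own type, so no pointer conversion or pun is involved. */
struct Foo foo_from_bar(const struct Bar *bar)
{
    struct Foo foo;

    assert(bar != NULL);
    foo.a = bar->a;
    foo.b = bar->b;
    return foo;
}
```

The third call could then pass the address of a temporary holding `foo_from_bar(&bar2)` to processFoo(). The cost is a copy, and changes made to the copy do not reach bar2.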

What you're doing is "likely to work" in simple cases and with
compilers that don't optimize aggressively. But remember: Memory
is s-l-o-w compared to CPUs, so compiler writers have a large and
growing incentive to find clever ways to avoid accesses. (I write,
by the way, from experience: More than fifteen years ago a compiler
of my acquaintance optimized a pun not unlike yours, producing code
that caught intermittent and hard-to-reproduce SIGSEGVs; it took
three engineers a week and a half to track down the trouble.)
 
Seebs

It appears you're right. Nevertheless, won't this mean that, indirectly,
the standard guarantees that, when different structs have a common initial
sequence, the members that represent a common initial sequence can be
accessed no matter which type the object is casted to?

I thought that for quite a while, but someone pointed out a thing I had
not considered adequately:

Since the behavior is undefined, the compiler is allowed to act with
absolute certainty that this never happens. So, for instance, it can
optimize things away like mad.

So, say:

#include <stdio.h>

struct x { int x; };
struct y { int y; };

void foo(struct y *bptr) {
    bptr->y = 2;
}

int main(void) {
    struct x a = { 1 };
    foo((void *) &a);
    printf("%d\n", a.x);
    return 0;
}

So far as I can tell, the compiler is welcome to print 1, because it
can be quite certain that foo() can't have modified an object of type
"struct x".

Not sure whether it actually happens much, but I believe this is a real
issue.

Basically, the common initial sequence rule isn't important just because it
implies that the common sequences must have the same layout; it's important
because it implies that the compiler has to be aware of the possibility
that a modification to one member of a union might affect the other in a
predictable way which is required to work.
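
For contrast, here is a sketch of the same experiment with the two structs placed in a union, which is the case the common initial sequence rule covers. The names `union xy`, `set_y` and `x_after_set_y` are made up for illustration. With the completed union type visible, the store has to be seen by the later read.

```c
#include <assert.h>
#include <stddef.h>  /* NULL */

struct x { int x; };
struct y { int y; };

/* The completed union type is visible wherever the access happens. */
union xy { struct x x; struct y y; };

/* Stores through the struct y member of the union. */
void set_y(union xy *u)
{
    assert(u != NULL);
    u->y.y = 2;
}

/* Writes via one member, then inspects the common initial part via
   the other; the compiler may not assume the two are unrelated. */
int x_after_set_y(void)
{
    union xy u;

    u.x.x = 1;
    set_y(&u);
    return u.x.x;
}
```

Here x_after_set_y() returns 2, whereas the union-free version above leaves the compiler free to print 1.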

-s
 
Tim Rentsch

Rui Maciel said:
Eric said:
No. The "special guarantee" of 6.5.2.3p4 applies only to
structs with a "common initial sequence" that inhabit a union,
not to apparently identical free-standing structs.

Even if `offsetof(struct Foo, b) == offsetof(struct Bar, b)',
which seems highly probable, I think the absence of a union leaves
you vulnerable to aliasing problems. A sufficiently aggressive
optimizer might think `foo->b = 42;' leaves `bar.b' untouched
(because `foo' points to things that are not of `bar's type),
and thus might keep using a stale `bar.b' value. If you were
to write

union { struct Foo foo; struct Bar bar; } carbide;
struct Foo *pfoo = &carbide.foo;
carbide.bar.b = 56;
pfoo->b = 42;

... I think you'd be safe: The presence of the union "warns" the
compiler that assignments to the `foo' or `bar' element may
affect the other element's value.

It appears you're right. Nevertheless, won't this mean that,
indirectly, the standard guarantees that, when different
structs have a common initial sequence, the members that
represent a common initial sequence can be accessed no matter
which type the object is casted to? [snip elaboration]

No. To access (a member of) one kind of struct object
using (a . or -> selector for a member of) another kind
of struct type, four conditions must be met:

1. The members in question must be corresponding
members in the common initial sequence between
the two struct types;
2. There must be a union type containing both of
the struct types in question;
3. There must be an actual union object holding
a struct value for a struct type suitable for
this inter-struct-type access; and
4. The completed union type must be visible at
the point of inter-struct-type access.

If any of these four conditions does not hold, the
behavior in such cases is undefined.

(Note: the term "another kind of struct type" is understood
not to include the case of compatible types. If the two
struct types are compatible, they are the same type as
far as member access is concerned.)
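
The four conditions can be sketched with the structs from the original post. This is an illustration under stated assumptions rather than text from the standard; `union FooBar`, `read_b_as_foo` and `read_b_demo` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>  /* NULL */

struct Foo { int a; int b; };
struct Bar { int a; int b; float c; };

/* Condition 2: a union type containing both struct types.
   Condition 4: it is complete and visible before the access below. */
union FooBar { struct Foo foo; struct Bar bar; };

/* Reads b through the struct Foo member even though the union
   currently holds a struct Bar. */
int read_b_as_foo(const union FooBar *u)
{
    assert(u != NULL);
    return u->foo.b;  /* condition 1: b is in the common initial sequence */
}

int read_b_demo(void)
{
    union FooBar u;   /* condition 3: an actual union object */

    u.bar.a = 1;
    u.bar.b = 2;
    u.bar.c = 3.0f;
    return read_b_as_foo(&u);
}
```

read_b_demo() returns 2. Reading `c` this way is not covered, because `c` is outside the common initial sequence.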
 
Tim Rentsch

Eric Sosman said:
On 5/19/2013 5:54 AM, Rui Maciel wrote:
[example snipped]

I'm no code-generation and optimization expert, but I think the
"special guarantee" is about aliasing, not about representation. In
the absence of a union containing both struct Foo and struct Bar,
the compiler can assume that the elements in instances of the two
are distinct: The bytes in a certain memory area represent the value
of a struct Foo *or* of a struct Bar, not both. (This is just like
other types: Some batch of bytes belongs to an int *or* to a double,
and unless there's a union in the picture they cannot belong to
both.) If you use type-punning to access the bytes via a "foreign"
type, the compiler is not obliged to notice or respect the pun
(6.5p7; some specific puns are permitted, but not all).

The code in your post would be, I think, entirely well-behaved
and well-defined if the third processFoo() call were removed. With
the third call in place, it runs afoul of 6.5.2.2p2, violating a
"shall" in a Constraints clause. If you were to add a cast you'd
avoid the 6.5.2.2p2 issue, but 6.5p7 still operates. [comments on
practical considerations snipped]

I agree with the conclusions but not all of the reasoning.
Assuming the argument/parameter type mismatch has been fixed
(eg, by adding a cast, as suggested), AFAICS the requirements
of 6.5p7 are not violated. The accessing expression has type
int, and the object being accessed has type int. There is
a sub-expression that has a (wrong) struct type, but that
sub-expression doesn't do any accessing; it's only the
larger expression, ie, including the part after the member
selection operator, that does any accessing, and the type
of that expression is consistent with the effective type
of the object being accessed.

I think it's right that the concerns here are about aliasing,
not representation. However I think the undefinedness occurs
not as a result of 6.5p7 but just from the description of the
member selection operators. These operators (. and ->) select
a member of a struct (or union) object, of the kind of the
left operand. If there is no such object, then there is no
way to select one of its members. Hence the behavior is
undefined, by virtue of having no definition.
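
For comparison, a sketch of an access that involves no member selection on a struct of the wrong kind: the pointer designates the member itself, so the lvalue and the object both have type int. The names `set_int` and `bar_b_after_set` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>  /* NULL */

struct Bar { int a; int b; float c; };

/* Hypothetical helper: stores through an lvalue of type int. */
void set_int(int *p, int value)
{
    assert(p != NULL);
    *p = value;
}

int bar_b_after_set(void)
{
    struct Bar bar = { 1, 2, 3.0f };

    set_int(&bar.b, 42);  /* &bar.b selects a member of a real struct Bar */
    return bar.b;
}
```

Here `.` is applied to an actual struct Bar object, and bar_b_after_set() returns 42.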

Obviously I would agree that how the Standard describes this
could be improved, and probably should be. But I don't think
there is any serious doubt about what was intended.
 
