Struct interchangeability

sandeep

Say we have two types defined as follows:

struct foo { int a; int b; };

struct bar { int x; int y; };

Note how foo and bar have identical members. To what extent are they
interchangeable, e.g. as function arguments, in assignments, and so on?

Say we have a function like

void myfunc(struct foo *fooptr);

Can we then write code like

struct bar *barptr;
....
myfunc(barptr);
myfunc((struct foo *)barptr); // same except with a cast

The ISO Standard seems to be silent/ambiguous on this point.
 
Eric Sosman

Say we have two types defined as follows:

struct foo { int a; int b; };

struct bar { int x; int y; };

Note how foo and bar have identical members. To what extent are they
interchangeable, e.g. as function arguments, in assignments, and so on?

To no extent at all.

Say we have a function like

void myfunc(struct foo *fooptr);

Can we then write code like

struct bar *barptr;
...
myfunc(barptr);
myfunc((struct foo *)barptr); // same except with a cast

You can write it, but you can't require a compiler to accept
it. A conforming compiler must issue a diagnostic message for the
first myfunc() call. No diagnostic is required for the second, but
the behavior is undefined.
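
If the goal is simply to hand the contents of a struct bar to myfunc(), one well-defined route is to copy the members into a genuine struct foo, call the function, and copy the results back. The following is only a sketch: the body of myfunc() and the wrapper name myfunc_on_bar() are assumptions made for illustration, since the thread shows nothing but the prototype.

```c
struct foo { int a; int b; };
struct bar { int x; int y; };

/* Hypothetical body for myfunc(); the thread only gives its prototype. */
void myfunc(struct foo *fooptr)
{
    fooptr->a += 1;
    fooptr->b += 2;
}

/* Hypothetical wrapper: operate on a struct bar without aliasing it.
   The members are copied by name into a real struct foo and back. */
void myfunc_on_bar(struct bar *barptr)
{
    struct foo tmp = { .a = barptr->x, .b = barptr->y };

    myfunc(&tmp);
    barptr->x = tmp.a;
    barptr->y = tmp.b;
}
```

Because the copy names each member explicitly, no pointer to one struct type is ever used to access an object of the other.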
The ISO Standard seems to be silent/ambiguous on this point.

My copy of the Standard is silent on all points, since I lack
a text-to-speech program that will read it aloud to me. If I had
such a program, though, I'd request it to perform 6.2.7p1.
 
Nick

Eric Sosman said:
My copy of the Standard is silent on all points, since I lack
a text-to-speech program that will read it aloud to me. If I had
such a program, though, I'd request it to perform 6.2.7p1.

Grossly unfair. That's a perfectly normal (slightly legalistic) term
for "the document doesn't express[1] anything about this"

Try this Google search for example:
<URL:http://www.google.co.uk/search?hl=en&q=constitution+is+silent>

[1] the obvious word there is "say", which surely is just as bad as
being "silent". It's hard to see how you can say "the Standard says"[2]
and yet object sarcastically to "the Standard is silent".

[2] and you do
 
sandeep

Nick said:
Eric Sosman said:
My copy of the Standard is silent on all points, since I lack
a text-to-speech program that will read it aloud to me. If I had such
a program, though, I'd request it to perform 6.2.7p1.

Grossly unfair. That's a perfectly normal (slightly legalistic) term
for "the document doesn't express[1] anything about this"

I believe Mr Sosman was only making a little joke; I find he has a dry
wit that can sometimes pass you by.

However, this reference is interesting. Let's think about these two
translation units.

/* begin TU1 */

struct foo { int a; int b; } foo;
extern void fn2(void *);

void fn1(void)
{
    fn2(&foo);
}

/* end TU1 */


/* begin TU2 */

struct bar { int x; int y; };
void fn2(void *vp)
{
    struct bar *bp = vp;
    bp->x++;
}

/* end TU2 */

As I read that paragraph, this code is not Standard-conforming: to make
it conforming, struct bar would have to be called struct foo, and the
fields would also need to have the same names.

This doesn't make much sense to me because TUs are compiled
independently. So when TU2 is compiled, how could the compiler ever know
what struct bar or its fields happened to be called in TU1? Maybe by that
point the source file to TU1 has been deleted and only a stripped object
file remains!
 
Eric Sosman

Nick said:
Eric Sosman said:
My copy of the Standard is silent on all points, since I lack
a text-to-speech program that will read it aloud to me. If I had such
a program, though, I'd request it to perform 6.2.7p1.

Grossly unfair. That's a perfectly normal (slightly legalistic) term
for "the document doesn't express[1] anything about this"

I believe Mr Sosman was only making a little joke; I find he has a dry
wit that can sometimes pass you by.

However, this reference is interesting. Let's think about these two
translation units.

/* begin TU1 */

struct foo { int a; int b; } foo;
extern void fn2(void *);

void fn1(void)
{
fn2(&foo);
}

/* end TU1 */


/* begin TU2 */

struct bar { int x; int y; };
void fn2(void *vp)
{
struct bar *bp = vp;
bp->x++;
}

/* end TU2 */

As I read that paragraph, this code is not Standard-conforming: to make
it conforming, struct bar would have to be called struct foo, and the
fields would also need to have the same names.

Right.

This doesn't make much sense to me because TUs are compiled
independently. So when TU2 is compiled, how could the compiler ever know
what struct bar or its fields happened to be called in TU1? Maybe by that
point the source file to TU1 has been deleted and only a stripped object
file remains!

As you say, the compiler doesn't "know" about the content of one
TU while translating another. That's why the behavior is "undefined,"
rather than an error the compiler is required to diagnose.

Why should it be an error? Because the two struct types are not
the same, even though they look the same. This happens with other
types, too: `char*' and `void*' look the same but are different types;
on many systems `int' and `long' look the same but are different, on
many systems `double' and `long double' look the same but are different,
and so on. If a programmer writes `char *cp; void *vp;' he states his
intent to create two variables of two distinct types, even though they
happen to share the same representation. If he writes `struct foo u;
struct bar v;', he similarly asks for the variables to have different
types even if they happen to look the same.

Strictly speaking, it is not even required that `struct foo' and
`struct bar' have the same representation! No sane compiler will make
them different, but the Standard doesn't forbid it. There's just no
pressing reason to require identical representations for types that
the programmer obviously intends to be different (if he'd wanted them
to be the same, he'd have created only one type).

Now let's look at optimization and efficiency. Take the two struct
types as given, and ponder this (silly) function:

int f(struct foo *pf, struct bar *pb) {
    int n = 0;
    for (pf->a = 0; pf->a < 10; pf->a++) {
        pb->x++;
        n++;
    }
    return n;
}

Can the compiler optimize this into the equivalent of

int f(struct foo *pf, struct bar *pb) {
    pf->a = 10;
    pb->x += 10;
    return 10;
}

? Yes, it can. Since the parameters point to different types (and
since there's no union in sight that could hold both), the compiler
is allowed to assume that they point to distinct objects, and that
modifications to `*pf' and `*pb' don't interfere with each other
(see 6.5p7). If the parameters could point at the same object, the
optimization would not be valid (and the returned value would not
necessarily be ten).
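
To make that concrete, here is a sketch of a well-defined call: f() as above, plus a hypothetical helper f_on_distinct_objects() (not from the thread) that passes two separate objects. Since the objects really are distinct, the loop and its optimized form have to agree on the result.

```c
struct foo { int a; int b; };
struct bar { int x; int y; };

int f(struct foo *pf, struct bar *pb)
{
    int n = 0;
    for (pf->a = 0; pf->a < 10; pf->a++) {
        pb->x++;   /* the compiler may assume this does not change pf->a */
        n++;
    }
    return n;
}

/* Hypothetical helper: returns 1 if f() produced the values that the
   optimized version would produce for two separate objects. */
int f_on_distinct_objects(void)
{
    struct foo u = { 0, 0 };
    struct bar v = { 5, 0 };
    int n = f(&u, &v);

    return n == 10 && u.a == 10 && v.x == 15;
}
```

Passing a cast pointer so that both parameters refer to one object would break the assumption the compiler is entitled to make, which is why that call has undefined behavior.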

Another reason it "should" be an error to mix types that just
happen to look alike is that their resemblance may be transitory.
The programmer has gone to the trouble of creating two distinct types,
suggesting two distinct purposes. They happen to look alike today,
but tomorrow the programmer may decide to add a `z' element to
`struct bar' -- and suddenly all the code that blithely assumed the
two structs were interchangeable stops compiling. More subtly, he
might leave the structs with the same elements, but (for some reason)
decide to switch the order in one of them:

struct foo { int a; int b; };
struct bar { int y; int x; }; // y is now first

If the structs were interchangeable (and interchanged), the program
would still compile -- but it would now have an entirely different
meaning: Altering the `a' element of a `struct foo' would now change
the `y' of an aliased `struct bar' instead of the `x'. Is this an
outcome you think would be a useful language feature?

When you think about it, the real anomaly in C is that different
TUs can utter independent declarations of the same type and somehow
have those independent declarations agree. This is weird! It's at
odds, in a way, with good software engineering practice: In a big
program, there should be one and only one "authoritative" description
for each type, object, and function, some kind of "data dictionary"
and not a bunch of free-floating independent declarations that you
sort of hope you'll keep synchronized as the program mutates. But
C doesn't have the meta-linguistic machinery to support such things,
so we're forced to rely on textual similarities. 6.2.7 describes how
much we can and cannot rely on.
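
In practice the nearest thing C offers to a single authoritative description is a shared header that every TU includes. Below is a sketch of the earlier two-TU example reworked that way. The file names shared.h, tu1.c and tu2.c are assumptions for illustration, and the three pieces are shown in one listing so that it compiles as a single file.

```c
/* ---- shared.h (hypothetical): the one declaration both TUs include ---- */
struct foo { int a; int b; };
void fn1(void);
void fn2(void *vp);

/* ---- tu1.c (hypothetical): would begin with #include "shared.h" ---- */
struct foo foo;   /* file-scope object, zero-initialized */

void fn1(void)
{
    fn2(&foo);
}

/* ---- tu2.c (hypothetical): would begin with #include "shared.h" ---- */
void fn2(void *vp)
{
    struct foo *fp = vp;   /* same type as the object passed in */
    fp->a++;
}
```

Both TUs now see the same tag and the same member names, so the types are compatible across the TU boundary.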

A final thought: More than the users of other languages I know
of, C programmers seem obsessed with the representations of the values
and objects their programs manipulate. There are in truth times when
representations must be dealt with directly, but much C code would be
improved if the programmers could just forget about representation for
a moment or two and think about the values instead. I almost never
care how many bits are in a `double'; instead, I care about range and
precision and accumulated round-off error. Same thing with structs:
I almost never care where the padding is or isn't, nor even about the
order of their elements; I care about what the elements are, what they
mean, and what values are stored in them. You'll be a better architect
if you think more about the building and less about the bricks.
 
Billy Mays

/* clip */

Just out of curiosity, how do some projects use Polymorphic Structs to
pass data around? I looked through the code of the ffmpeg project and
noticed that they had a Preprocessor define for the first common fields
of two structs.


Example:

/************************/
#include <stdio.h>

#define COMMON_FIELDS \
int a; \
int b; \
int c;

typedef struct Parent {
    COMMON_FIELDS
} Parent;


typedef struct Child {
    COMMON_FIELDS
    int d;
} Child;


void printer(Parent * p) {
    printf("A is %d\n", p->a);
}

/************************/

That way this would still work:

/************************/
Child * c;

c = child_setter();


printer( (Parent *)c );



/************************/



Is this code non-conforming? If so, why is it used a lot?
 
Eric Sosman


The Standard does not guarantee that the COMMON_FIELDS will
be arranged identically in all structs that contain them. But
any sane compiler will use identical arrangements anyhow! For
one thing, it's simple. For another, it makes it easy to handle
a case where identical layout *is* guaranteed, namely, the case
where all these structs inhabit the same union. If they're all
in a union and they have a "common initial subsequence" of elements,
those initial elements will be arranged identically.
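
A sketch of that guaranteed case follows. Parent and Child are written out with the macro already expanded; the union type Node and the function node_a() are hypothetical names, not taken from ffmpeg or from the post above. The guarantee (see 6.5.2.3 on the common initial sequence) applies where the union declaration is visible and the access goes through the union.

```c
typedef struct Parent { int a; int b; int c; } Parent;
typedef struct Child  { int a; int b; int c; int d; } Child;

/* Hypothetical union holding both variants. */
typedef union Node {
    Parent parent;
    Child  child;
} Node;

/* Parent and Child share the initial sequence a, b, c, so the common
   part may be inspected through either member of the union. */
int node_a(const Node *n)
{
    return n->parent.a;
}
```

Here node_a() is allowed to read parent.a even when the value was last stored through the child member.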

In the sample you've shown, the structs are not in a union so
the arrangement could, in principle, be different. But consider
that C allows separate compilation, and some other module (which
we can't see at the moment) might put these structs into a union
and thereby require identical layout. But a Parent in this module
must look like a Parent in the other module (otherwise we couldn't
pass them back and forth as function arguments and so on), so it's
easiest just to leave them identically aligned in the first place.

Layout isn't the only issue, by the way. I once worked with
some code that used a relative of this scheme, in which a function
did a big `switch' on one of the common fields to determine the
"real" type of the struct it had, then referred to the type-
specific fields only in the appropriate cases. Unfortunately, the
compiler preceded the `switch' with a speculative pre-fetch of an
element not present in all variants, and beyond the size of some
of the shorter ones. When the function encountered a short struct
instance located right at the end of addressable memory, BOOM!

For an utterly squeaky-clean implementation, one could use

typedef struct {int a,b,c; } CommonFields;

typedef struct { CommonFields cf; } Parent;

typedef struct { CommonFields cf; int d; } Child;

This cleanliness comes with a slight cost in verbosity: You can
no longer refer to elements a,b,c of a Parent or Child, but must
instead write cf.a, cf.b, cf.c. But if you really aspire to be
a complete Goody Two-Shoes coder (or to avoid strange problems not
directly due to layout), you'll pay the price.
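
With that arrangement the shared code can take a pointer to CommonFields, and no cast between Parent and Child is needed. A sketch, in which printer() is adapted from the earlier example and child_common() and parent_common() are hypothetical helper names:

```c
#include <stdio.h>

typedef struct { int a, b, c; } CommonFields;
typedef struct { CommonFields cf; } Parent;
typedef struct { CommonFields cf; int d; } Child;

/* Works on the shared part only. */
void printer(const CommonFields *cf)
{
    printf("A is %d\n", cf->a);
}

/* Hypothetical helpers: hand out the embedded common part. */
const CommonFields *child_common(const Child *c)
{
    return &c->cf;
}

const CommonFields *parent_common(const Parent *p)
{
    return &p->cf;
}
```

A call then looks like printer(child_common(&some_child)). Since cf is the first member, its address is also the address of the enclosing struct.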
 
Keith Thompson

Eric Sosman said:
For an utterly squeaky-clean implementation, one could use

typedef struct {int a,b,c; } CommonFields;

typedef struct { CommonFields cf; } Parent;

typedef struct { CommonFields cf; int d; } Child;

This cleanliness comes with a slight cost in verbosity: You can
no longer refer to elements a,b,c of a Parent or Child, but must
instead write cf.a, cf.b, cf.c. But if you really aspire to be
a complete Goody Two-Shoes coder (or to avoid strange problems not
directly due to layout), you'll pay the price.

You could also define macros:

#define a cf.a
#define b cf.b
#define c cf.c

This is a very bad idea if there's any chance that the names a, b,
and c could be used for other purposes (say, for local variables).
Careful name choice (e.g., don't use "a", "b", and "c") can avoid
this.
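
A sketch of the careful-name variant, assuming a made-up cm_ prefix for the macros (the prefix and the function child_sum() are illustrative only):

```c
typedef struct { int a, b, c; } CommonFields;
typedef struct { CommonFields cf; int d; } Child;

/* Hypothetical prefixed names, unlikely to collide with ordinary
   identifiers such as a, b, and c. */
#define cm_a cf.a
#define cm_b cf.b
#define cm_c cf.c

int child_sum(const Child *ch)
{
    int a = ch->cm_a;   /* a local named a is left alone by the macros */
    int b = ch->cm_b;

    return a + b + ch->cm_c + ch->d;
}
```

Here ch->cm_a expands to ch->cf.a, while the locals a and b are untouched.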
 
Tim Rentsch

Eric Sosman said:
Nick said:
My copy of the Standard is silent on all points, since I lack
a text-to-speech program that will read it aloud to me. If I had such
a program, though, I'd request it to perform 6.2.7p1.

Grossly unfair. That's a perfectly normal (slightly legalistic) term
for "the document doesn't express[1] anything about this"

I believe Mr Sosman was only making a little joke; I find he has a dry
wit that can sometimes pass you by.

However, this reference is interesting. Let's think about these two
translation units.

/* begin TU1 */

struct foo { int a; int b; } foo;
extern void fn2(void *);

void fn1(void)
{
fn2(&foo);
}

/* end TU1 */


/* begin TU2 */

struct bar { int x; int y; };
void fn2(void *vp)
{
struct bar *bp = vp;
bp->x++;
}

/* end TU2 */
...snip...

Strictly speaking, it is not even required that `struct foo' and
`struct bar' have the same representation! No sane compiler will make
them different, but the Standard doesn't forbid it. [snip]

There's no practical way for the respective member offsets
to be different. But they could have different amounts of
trailing padding, or different alignment requirements.
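
One way to see what a particular implementation does is to compare the offsets, sizes, and alignments directly. This is a sketch assuming a C11 compiler (for _Alignof); same_layout() is a hypothetical helper name.

```c
#include <stddef.h>

struct foo { int a; int b; };
struct bar { int x; int y; };

/* Returns 1 if this implementation happens to give both types the same
   member offsets, total size (hence trailing padding), and alignment. */
int same_layout(void)
{
    return offsetof(struct foo, a) == offsetof(struct bar, x)
        && offsetof(struct foo, b) == offsetof(struct bar, y)
        && sizeof(struct foo) == sizeof(struct bar)
        && _Alignof(struct foo) == _Alignof(struct bar);
}
```

A result of 1 only describes the implementation at hand. It does not make the two types compatible, and it does not make aliasing one through the other well defined.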
 
