Is it well-defined and portable to do something like:
typedef struct
{
    int type;
    char c;
} S1;
typedef struct
{
    int type;
    float f;
} S2;
void f(void* p)
{
    S1* p1 = (S1*) p;
    S2* p2 = (S2*) p;
Someone else already posted section references so I will skip all
that (I assume they are correct).
Practically speaking, you can run into problems here as well. Eric
Sosman noted an example of a compiler in which problems did occur
with a similar construct. To make the problem "more apparent" let
me suggest a slightly different version of S1 and S2:
/* typedef struct S1 S1; */ /* purely for those who like typedefs */
struct S1 { int type; char c; };
/* typedef struct S2 S2; */
struct S2 { int type; char pad[10000]; float f; };
Now in f() we begin with this, which is the same but with the ugly
casts removed, and the beautiful "struct" keyword inserted:
void f(void *p) {
    struct S1 *p1 = p;
    struct S2 *p2 = p;
Clearly, the goal here is that some caller will call f() with
an actual instance of an "S1" object, and f() will be able to
tell that "p" really pointed to an "S1", rather than an "S2", by
the type field. So here is the caller:
void g(void) {
    struct S1 *q = xmalloc(sizeof *q);
    /* xmalloc() just calls malloc and exits if it fails */
    q->type = 1; /* tell f() that this is an S1 */
    f(q);
    ...
}
So now here we are in f(), and we have set both p1 and p2 equal to
p (which is g()'s "q"). Our compiler optimizes, though. We have
suggested to it that p1 and p2 are both valid. It peeks ahead at
the rest of our code and determines that neither p1 nor p2 may be
NULL at this point -- we use p1 without checking for NULL, and if
p1 != NULL then p2 != NULL -- and that there could be a use of
p2->f, which is going to be in a different cache line than p1->type.
So, in order to make the code run fast, it issues a prefetch load
of p2->f *right now* (with signalling NaN traps suppressed, of
course). Then it goes on to compile the rest of the function:
    if (p1->type == 1)
    {
        printf("%c\n", p1->c);
    }
    else
    {
        printf("%f\n", p2->f);
    }
}
Now, when we call f() from g(), we give it a pointer to the last
(newly-allocated) page in our virtual memory space. This page is
8192 bytes long, but p2->f lives at offset 10004 (assuming a 4-byte
int), well past the end of the page. The prefetch causes f() to trap
with an invalid address, and the program crashes.
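The arithmetic behind that trap can be checked with offsetof. What
follows is only a sketch: the 8192-byte page size is the figure from
the example (not something queried from the system), and the helper
name s2_f_is_outside_s1() is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>   /* offsetof, size_t */

struct S1 { int type; char c; };
struct S2 { int type; char pad[10000]; float f; };

/* Page size from the example above; assumed, not queried from the OS. */
#define EXAMPLE_PAGE_SIZE 8192u

/* Hypothetical helper: returns 1 if the bytes of p2->f fall outside
   an object that is only sizeof(struct S1) bytes long, else 0. */
int s2_f_is_outside_s1(void)
{
    size_t start = offsetof(struct S2, f);  /* typically 10004 with 4-byte int */
    size_t end = start + sizeof(float);     /* one past the last byte of f */

    /* Both structs begin with the type member at offset 0. */
    assert(offsetof(struct S1, type) == 0);
    assert(offsetof(struct S2, type) == 0);

    /* f sits after pad[10000], so it is beyond the 8192-byte page. */
    assert(start > EXAMPLE_PAGE_SIZE);

    return end > sizeof(struct S1);
}
```

Only the offset-0 "type" member is shared between the two layouts.
Everything after it in an S2 lies far outside the storage that g()
actually allocated.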
What went wrong is pretty straightforward: we lied to the compiler,
claiming that p was *both* an S1 *and* an S2. It got its revenge.
What we need to do is avoid lying, and rewrite f() as, say:
void f(void *p) {
    int *ptype = p;
    if (*ptype == 1) {
        struct S1 *p1 = p;
        printf("%c\n", p1->c);
    } else {
        struct S2 *p2 = p;
        printf("%f\n", p2->f);
    }
}
In other words, resist the temptation to use one of the various
types of structures "as if" it were also all the others. Use the
first-element rule to access *only* the first element -- in this
case, the type-selecting integer -- and pick out the correct
structure type. Once you have the correct type, *then* establish
a pointer to the entire struct.
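For anyone who wants to try the pattern end to end, here is one way
it might be fleshed out. This is a sketch under assumptions, not the
only way to do it: the tag names TYPE_S1 and TYPE_S2, the constructors
make_s1() and make_s2(), and describe() (a variant of f() that formats
into a buffer rather than printing, so the result can be inspected)
are all invented for illustration. xmalloc() follows the description
given earlier: call malloc and exit on failure.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical tag values; the text above only fixes 1 for an S1. */
enum { TYPE_S1 = 1, TYPE_S2 = 2 };

struct S1 { int type; char c; };
struct S2 { int type; char pad[10000]; float f; };

/* malloc() wrapper that exits instead of returning NULL. */
static void *xmalloc(size_t size)
{
    void *p = malloc(size);
    if (p == NULL) {
        fprintf(stderr, "xmalloc: out of memory\n");
        exit(EXIT_FAILURE);
    }
    return p;
}

/* Hypothetical constructors: each sets the tag that matches the
   object actually allocated. */
struct S1 *make_s1(char c)
{
    struct S1 *q = xmalloc(sizeof *q);
    q->type = TYPE_S1;
    q->c = c;
    return q;
}

struct S2 *make_s2(float f)
{
    struct S2 *q = xmalloc(sizeof *q);
    q->type = TYPE_S2;
    q->f = f;
    return q;
}

/* Like f(), but writes into buf and returns snprintf()'s result.
   As in f(), any tag other than TYPE_S1 is treated as an S2. */
int describe(const void *p, char *buf, size_t size)
{
    const int *ptype;

    assert(p != NULL);
    ptype = p;                      /* touch only the first member */
    if (*ptype == TYPE_S1) {
        const struct S1 *p1 = p;    /* now we know it is an S1 */
        return snprintf(buf, size, "%c", p1->c);
    } else {
        const struct S2 *p2 = p;    /* now we know it is an S2 */
        return snprintf(buf, size, "%f", p2->f);
    }
}
```

A caller would do something like "struct S1 *a = make_s1('x');
describe(a, buf, sizeof buf);" and later free(a). The point is that
no struct S2 pointer ever exists for an object that is really an S1,
so the compiler is never told anything false.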