Alexander Klauer
Hi,
Suppose I allocate space for a structure. Can I safely interpret
the allocated object as a union, even if the size of the space
allocated is smaller than the size of the union type?
This question appears to have come up before; at least I found
the (very old) threads
"struct pointer casting", c.std.c, 1993/03/22
http://groups.google.com/group/comp.lang.c/browse_thread/thread/1c12a4c6afb312a4
"Union and malloc", c.l.c, 1998/08/22
http://groups.google.com/group/comp.std.c/browse_thread/thread/960a336931f63f02
and the answer appears to lean towards "No". Does the C99
perspective change anything (I have the N1256 draft)? In order
to create some practical ground for discussion, consider the
following C99 program:
-----> start <-----
#include <stdio.h>
#include <stdlib.h>

enum Type {
    T_SCALAR,
    T_VECTOR
};

struct S {
    enum Type type;
};

struct S1 {
    enum Type type;
    int scalar;
};

struct S2 {
    enum Type type;
    int vector[3];
};

union U {
    struct S s;
    struct S1 s1;
    struct S2 s2;
};

void print_u(const union U * u) {
    switch (u->s.type) {
    case T_SCALAR:
        printf("%d\n", u->s1.scalar);
        break;
    case T_VECTOR:
        printf("(%d,%d,%d)\n",
               u->s2.vector[0],
               u->s2.vector[1],
               u->s2.vector[2]);
        break;
    }
}

int main(void) {
    struct S1 * s1 = malloc(sizeof(*s1));
    if (s1 == NULL)
        exit(EXIT_FAILURE);
    *s1 = (struct S1) { .type = T_SCALAR, .scalar = 42 };
    print_u((union U *) s1);
}
----> end <-----
There are several issues with this program.
* The cast "(union U *) s1": 6.3.2.3p7 allows this cast,
provided that the resulting pointer is correctly aligned for
the union. One should think this requirement to be fulfilled
because the value of s1 was returned by a successful call to
malloc. However, as Mark Brader has pointed out in
<[email protected]>, the wording of
7.20.3p1, "The pointer returned if the allocation succeeds is
suitably aligned so that it may be assigned to a pointer to any
type of object and then used to access such an object or an
array of such objects in the space allocated (until the space
is explicitly deallocated)", may be construed to imply that
malloc may return pointers not suitably aligned for types whose
size exceeds the allocated size. Is this still an accepted
interpretation of the wording of the standard?
* Strict aliasing and the access to u->s: the strict aliasing
rule laid down in 6.5p7, next-to-last item, allows the access
to u->s1 after the cast discussed above. Furthermore,
the "special guarantee" from 6.5.2.3p5 allows the access of
struct S in an object of type union U containing a struct S1.
Do 6.5p7 and 6.5.2.3p5 combine, making the access to u->s
legal?
* u is under-allocated for its type (assume sizeof(*u) >
sizeof(struct S1)). Does this, in itself, invoke UB? Clearly, an
assignment to u->s2 would be UB because of the under-allocation.
(In the present case, the type of *u is const-qualified, so
this assignment is not possible. However, const-qualification
is not recursive, so with slightly more complicated structure
types, a UB assignment is possible.) But in the absence of
such explicit violations, may the compiler assume that the
non-NULL pointer u points to at least sizeof(*u) bytes, so that
UB may ensue anyway?
The reason I ask this question is that I have the following
situation (which I think is fairly common, but I may be wrong):
I have a list of pointers to objects of different sizes. When I
retrieve a pointer, I want to know what type of data it points
to, and then operate on that data accordingly. The natural
solution appears to be using a union type. But allocating an
entire union for each object is wasteful.
There is, of course, a simple workaround. Just replace each

    struct SomeStruct {
        enum Type type;
        /* lots of members */
    };

with

    struct SomeStructReal {
        /* lots of members */
    };

    struct SomeStruct {
        enum Type type;
        struct SomeStructReal * p;
    };

and then allocate struct SomeStructReal separately, next to a
full-size union whose members are struct SomeStruct and its
siblings.
But isn't this a little unnatural? In other words: if
under-allocating unions leads to undefined behaviour, are there
any actual implementations exhibiting unintended behaviour in
such a case? If not, the standard should IMHO be fixed to make
such use of unions well-defined. Or is there any compelling
reason the standard makes under-allocating unions undefined (if
it does)?
Finally, if I am right in surmising that my situation is common,
maybe this question should go into the FAQ?
Alexander