Memory layout in unions

qarnos · Jan 10, 2009

Hi, people.

I just have a quick question for people more familiar with the C
standards than myself.

If I have a union with an anonymous struct, as follows:

union my_union
{
    unsigned int ccount[2];

    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };
};

Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

Thanks for any assistance.

just.a.garbageman · Jan 10, 2009

qarnos said:
Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

The compiler is not allowed to reorder the struct members if they're
not bitfields.

Tomás Ó hÉilidhe · Jan 10, 2009

qarnos said:
union my_union
{
    unsigned int ccount[2];

    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };
};

Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

You'll never have padding between elements in an array; however, you
/might/ have padding between members in a struct.

In your example, which uses "int", it's extremely unlikely that
there'll be padding between the struct members (in fact there might
not be a single compiler on the planet that would put in padding
between ints).

Tomás Ó hÉilidhe · Jan 10, 2009

just.a.garbageman said:
The compiler is not allowed to reorder the struct members if they're
not bitfields.

I'm open to correction here, but I was certain that struct members had
to be laid out in the order you specify. That is to say, if you have:

struct Type { char a; int b; double c; void *d; };

Then the following is true:

offsetof(struct Type, d) > offsetof(struct Type, c)
    > offsetof(struct Type, b) > offsetof(struct Type, a)

Again I'm open to correction.
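
For anyone who wants to check that ordering claim on their own compiler, here is a minimal sketch. The function name offsets_increase is made up for illustration; the struct is the one from the post, and each pair is compared separately because C does not chain comparisons the way the informal notation above does.

```c
#include <assert.h>
#include <stddef.h>

struct Type { char a; int b; double c; void *d; };

/* Returns 1 if the member offsets strictly increase in declaration
   order, starting from offset 0 for the first member. */
int offsets_increase(void)
{
    return offsetof(struct Type, a) == 0
        && offsetof(struct Type, a) < offsetof(struct Type, b)
        && offsetof(struct Type, b) < offsetof(struct Type, c)
        && offsetof(struct Type, c) < offsetof(struct Type, d);
}
```

Calling assert(offsets_increase()) from main should never fire on a conforming implementation, since none of the members are bit-fields.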

qarnos · Jan 10, 2009

Tomás Ó hÉilidhe said:
You'll never have padding between elements in an array; however, you
/might/ have padding between members in a struct.

In your example, which uses "int", it's extremely unlikely that
there'll be padding between the struct members (in fact there might
not be a single compiler on the planet that would put in padding
between ints).

So basically what you are saying is that in the "real world", no
compiler will insert padding, but at the technical level it is not
100% guaranteed by the standard?

Tomás Ó hÉilidhe · Jan 10, 2009

qarnos said:
So basically what you are saying is that in the "real world", no
compiler will insert padding, but at the technical level it is not
100% guaranteed by the standard?

Correct, that's what I'm saying. The definition of "mammal" allows for
a mammalian species to have an odd number of legs, but I've yet to see
such a species.

If you were to mix something like "char" with "int" in the struct,
though, you'll find a lot of compilers will put in padding.
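
Padding is easy enough to measure rather than guess at. The sketch below uses two hypothetical structs (int_int and char_int, names invented here) and offsetof to report the gap between the first and second member. The results are implementation-specific, so treat it as a way to inspect your own compiler, not as a statement about all of them.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structs, for illustration only. */
struct int_int  { int first; int second; };
struct char_int { char tag;  int value;  };

/* Bytes of padding between the two members of struct int_int. */
size_t gap_int_int(void)
{
    return offsetof(struct int_int, second) - sizeof(int);
}

/* Bytes of padding between the two members of struct char_int. */
size_t gap_char_int(void)
{
    return offsetof(struct char_int, value) - sizeof(char);
}
```

On a typical implementation where int is 4 bytes and 4-byte aligned, gap_int_int() would come out as 0 and gap_char_int() as 3, but the standard promises neither number.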

Keith Thompson · Jan 10, 2009

qarnos said:
I just have a quick question for people more familiar with the C
standards than myself.

If I have a union with an anonymous struct, as follows:

union my_union
{
    unsigned int ccount[2];

    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };
};

Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

If you use a conforming compiler, your only guarantee is that you'll
get a diagnostic message. If you don't get one, your compiler is not
conforming, or at least you didn't run it in a conforming mode. (gcc
is not conforming by default; try "-ansi -pedantic".)

Standard C (C90 and C99) does not allow anonymous struct members.

But let's make it non-anonymous (that wasn't your main point anyway):

union my_union {
    unsigned int ccount[2];
    struct {
        unsigned int rcount;
        unsigned int lcount;
    } foo;
};

union my_union obj;

The standard guarantees that members of a struct (other than bit
fields) are laid out in the order in which they're declared, that the
first member of a struct is at offset 0, and that each member of a
union is at offset 0. Thus obj.ccount[0] and obj.foo.rcount are
guaranteed to occupy the same location.

Compilers are allowed to insert arbitrary padding between struct
members and/or after the last member. Normally this is done for
alignment purposes, but the standard doesn't restrict it; a perverse
compiler could insert as much padding as it likes. I don't think it's
possible for padding between rcount and lcount to be necessary for
alignment purposes, so obj.ccount[1] and obj.foo.lcount almost
certainly occupy the same location, but the standard doesn't actually
guarantee it.
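
Since the second overlap is "almost certainly" there but not guaranteed, one option is to have the program check it. The following is only a sketch: the function name lcount_overlays_ccount1 is invented, and it compares member addresses inside a single object instead of relying on offsetof with a nested member designator.

```c
#include <assert.h>
#include <stddef.h>

union my_union {
    unsigned int ccount[2];
    struct {
        unsigned int rcount;
        unsigned int lcount;
    } foo;
};

/* Returns 1 when foo.lcount occupies the same bytes as ccount[1]. */
int lcount_overlays_ccount1(void)
{
    union my_union obj;

    /* Guaranteed: every union member, and the first member of the
       struct, start at the beginning of the object. */
    assert((void *)&obj.ccount[0] == (void *)&obj);
    assert((void *)&obj.foo.rcount == (void *)&obj);

    /* Not guaranteed: true only if there is no padding after rcount. */
    return (unsigned char *)&obj.foo.lcount
        == (unsigned char *)&obj.ccount[1];
}
```

A program that depends on the overlap could call this once at startup and refuse to run if it returns 0. That confirms the layout only; it does nothing about the separate question of reading a member other than the one last stored.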

Furthermore, though unions are commonly used to treat a given chunk of
memory as if it were of two different types, the standard doesn't
actually support this usage except in a few cases. Storing a value in
one member of a union and then reading a value from another member is,
in most cases, undefined behavior. It's a common enough usage that
any compiler will probably let you get away with it, but even if
obj.ccount[0] and obj.foo.rcount occupy the same location, an
optimizing compiler could theoretically rearrange the code so that it
doesn't behave that way. For example:

int n = 42;
printf("%d\n", n);
/* The generated code could use a literal 42 rather than
   re-loading the value of n */

/* declarations as above */
obj.foo.rcount = 42;
obj.ccount[0] = 137;
printf("%u\n", obj.foo.rcount);  /* %u, since rcount is unsigned int */
/* The generated code could use a literal 42 rather than
   re-loading the value of obj.foo.rcount.  Since the value must
   be 42 unless you've done something that invokes undefined
   behavior, this is a valid optimization. */

*But* there's a lot of code out there that does this kind of thing,
even though the standard doesn't support it, and it's unlikely that a
compiler vendor is going to break such code.

Having said all that, there is a way to do what you want that's fully
supported by the standard:

struct my_struct {
    unsigned int ccount[2];
};
#define rcount ccount[0]
#define lcount ccount[1]
struct my_struct obj;

Now obj.rcount actually *means* obj.ccount[0], and obj.lcount means
obj.ccount[1].
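
A short sketch of the macro approach in use; sum_counts is a hypothetical helper added only to exercise it. The thing to watch is that the macros rewrite every later occurrence of the identifiers rcount and lcount in the translation unit, so those names can no longer be used for anything else.

```c
#include <assert.h>

struct my_struct {
    unsigned int ccount[2];
};
#define rcount ccount[0]
#define lcount ccount[1]

/* Stores through the macro names, then reads back through the array. */
unsigned int sum_counts(void)
{
    struct my_struct obj = { { 0, 0 } };

    obj.rcount = 42;    /* expands to obj.ccount[0] = 42  */
    obj.lcount = 137;   /* expands to obj.ccount[1] = 137 */

    assert(obj.ccount[0] == 42);
    assert(obj.ccount[1] == 137);
    return obj.ccount[0] + obj.ccount[1];
}
```

Because both names refer to the same array elements by definition, there is no padding question and no reading of an inactive union member.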
