Memory layout in unions

qarnos

Hi, people.

I just have a quick question for people more familiar with the C
standards than myself.

If I have a union with an anonymous struct, as follows:

union my_union
{
    unsigned int ccount[2];

    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };
};

Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

Thanks for any assistance.
 
just.a.garbageman

qarnos said:
Hi, people.

I just have a quick question for people more familiar with the C
standards than myself.

If I have a union with an anonymous struct, as follows:

union my_union
{
    unsigned int ccount[2];

    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };

};

Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

The compiler is not allowed to reorder the struct members if they're
not bitfields.
 
Tomás Ó hÉilidhe

qarnos said:
union my_union
{
    unsigned int ccount[2];

    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };
};

Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?


You'll never have padding between elements in an array; however, you
/might/ have padding between members in a struct.

In your example, which uses "int", it's extremely unlikely that
there'll be padding between the struct members (in fact there might
not be a single compiler on the planet that would put in padding
between ints).
 
Tomás Ó hÉilidhe

just.a.garbageman said:
The compiler is not allowed to reorder the struct members if they're
not bitfields.


I'm open to correction here, but I was certain that struct members had
to be laid out in the order you specify. That is to say, if you have:

struct Type { char a; int b; double c; void *d; };

Then the following is true:

offsetof(struct Type, d) > offsetof(struct Type, c)
    > offsetof(struct Type, b) > offsetof(struct Type, a)

Again I'm open to correction.
 
qarnos

qarnos said:
union my_union
{
    unsigned int ccount[2];
    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };
};
Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

Tomás Ó hÉilidhe said:
You'll never have padding between elements in an array; however, you
/might/ have padding between members in a struct.

In your example, which uses "int", it's extremely unlikely that
there'll be padding between the struct members (in fact there might
not be a single compiler on the planet that would put in padding
between ints).

So basically what you are saying is that in the "real world", no
compiler will insert padding, but at the technical level it is not
100% guaranteed by the standard?
 
Tomás Ó hÉilidhe

qarnos said:
So basically what you are saying is that in the "real world", no
compiler will insert padding, but at the technical level it is not
100% guaranteed by the standard?


Correct, that's what I'm saying. The definition of "mammal" allows for
a mammalian species to have an odd number of legs, but I've yet to see
such a species.

If you were to have something like "char" though instead of "int",
you'll find a lot of compilers will put in padding.
 
Keith Thompson

qarnos said:
I just have a quick question for people more familiar with the C
standards than myself.

If I have a union with an anonymous struct, as follows:

union my_union
{
    unsigned int ccount[2];

    struct
    {
        unsigned int rcount;
        unsigned int lcount;
    };
};

Am I guaranteed that ccount[0] will map to rcount and ccount[1] to
lcount? Or is the compiler allowed to re-order the struct members?

If you use a compiler that conforms to C90 or C99, your only guarantee
is that you'll get a diagnostic message. If you don't get one, your
compiler is not conforming, or at least you didn't run it in a
conforming mode. (gcc is not conforming by default; try
"-ansi -pedantic".)

Standard C, as of C99, does not allow anonymous struct members (C11
later added them).

But let's make it non-anonymous (that wasn't your main point anyway):

union my_union {
    unsigned int ccount[2];
    struct {
        unsigned int rcount;
        unsigned int lcount;
    } foo;
};

union my_union obj;

The standard guarantees that members of a struct (other than bit
fields) are laid out in the order in which they're declared, that the
first member of a struct is at offset 0, and that each member of a
union is at offset 0. Thus obj.ccount[0] and obj.foo.rcount are
guaranteed to occupy the same location.

Compilers are allowed to insert arbitrary padding between struct
members and/or after the last member. Normally this is done for
alignment purposes, but the standard doesn't restrict it; a perverse
compiler could insert as much padding as it likes. I don't think it's
possible for padding between rcount and lcount to be necessary for
alignment purposes, so obj.ccount[1] and obj.foo.lcount almost
certainly occupy the same location, but the standard doesn't actually
guarantee it.

Furthermore, though unions are commonly used to treat a given chunk of
memory as if it were of two different types, the standard doesn't
actually support this usage except in a few cases. Storing a value in
one member of a union and then reading a value from another member is,
in most cases, undefined behavior. It's a common enough usage that
any compiler will probably let you get away with it, but even if
obj.ccount[0] and obj.foo.rcount occupy the same location, an
optimizing compiler could theoretically rearrange the code so that it
doesn't behave that way. For example:

int n = 42;
printf("%d\n", n);
/* The generated code could use a literal 42 rather than
   re-loading the value of n */

/* declarations as above */
obj.foo.rcount = 42;
obj.ccount[0] = 137;
printf("%u\n", obj.foo.rcount);
/* The generated code could use a literal 42 rather than
   re-loading the value of obj.foo.rcount. Since the value must
   be 42 unless you've done something that invokes undefined
   behavior, this is a valid optimization. */

*But* there's a lot of code out there that does this kind of thing,
even though the standard doesn't support it, and it's unlikely that a
compiler vendor is going to break such code.

Having said all that, there is a way to do what you want that's fully
supported by the standard:

struct my_struct {
    unsigned int ccount[2];
};
#define rcount ccount[0]
#define lcount ccount[1]
struct my_struct obj;

Now obj.rcount actually *means* obj.ccount[0], and obj.lcount means
obj.ccount[1].
 
