Byte alignment in union

Ares Lagae

Suppose I do something like this:

template <typename T>
class Foo
{

    union
    {
        T c[4];
        struct
        {
            T c0, c1, c2, c3;
        };
    };

};

with T a built-in type, e.g. float.

Does the standard guarantee that c[i] and ci (i in 0...3) occupy the
same bytes in memory?

best regards,
Ares Lagae
 
Gianni Mariani

Ares said:
Suppose I do something like this:

template <typename T>
class Foo
{

    union
    {
        T c[4];
        struct
        {
            T c0, c1, c2, c3;
        };
    };

};

with T a built-in type, e.g. float.

Does the standard guarantee that c[i] and ci (i in 0...3) occupy the
same bytes in memory?


That is endian dependent.

Big endian
c[0] == c3

Little endian
c[0] == c0
 
Karl Heinz Buchegger

Gianni said:
Ares said:
Suppose I do something like this:

template <typename T>
class Foo
{

    union
    {
        T c[4];
        struct
        {
            T c0, c1, c2, c3;
        };
    };

};

with T a built-in type, e.g. float.

Does the standard guarantee that c[i] and ci (i in 0...3) occupy the
same bytes in memory?


That is endian dependent.

Big endian
c[0] == c3

Little endian
c[0] == c0


Endianness has nothing to do with it.
But the compiler may or may not insert padding bytes
in the structure, thus ruining the intent.
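
To make the padding point concrete, here is a minimal sketch rather than anything taken from the standard's text. The `Mixed` struct and the two helper functions are hypothetical names introduced only for illustration. The code merely reports what the compiler at hand does: the standard fixes the order of the members, not the size of the gaps between them.

```cpp
#include <cstddef>   // offsetof, std::size_t

// Hypothetical struct with members of different types, where padding
// between members is common (though not required).
struct Mixed {
    char tag;
    int  value;
};

// Number of padding bytes this compiler placed between tag and value.
inline std::size_t padding_before_value() {
    return offsetof(Mixed, value) - sizeof(char);
}

// Total padding bytes in Mixed, including any after the last member.
inline std::size_t total_padding() {
    return sizeof(Mixed) - sizeof(char) - sizeof(int);
}
```

Whatever numbers these functions return are a property of the implementation and its options, which is exactly why the layout cannot be assumed from the source alone.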
 
John Harrison

Gianni Mariani said:
Ares said:
Suppose I do something like this:

template <typename T>
class Foo
{

    union
    {
        T c[4];
        struct
        {
            T c0, c1, c2, c3;
        };
    };

};

with T a built-in type, e.g. float.

Does the standard guarantee that c[i] and ci (i in 0...3) occupy the
same bytes in memory?


That is endian dependent.

Big endian
c[0] == c3

Little endian
c[0] == c0


I wouldn't describe it as endian dependent, but it's right that the order of
c0 .. c3 is not defined. The OP would have more luck with

union
{
    T c[4];
    struct
    {
        T c0;
        T c1;
        T c2;
        T c3;
    };
};

Now c[0] == c0, c[1] == c1 etc.

john
 
Gianni Mariani

John said:
I wouldn't describe it as endian dependent, but it's right that the order of
c0 .. c3 is not defined. The OP would have more luck with

union
{
    T c[4];
    struct
    {
        T c0;
        T c1;
        T c2;
        T c3;
    };
};

Now c[0] == c0, c[1] == c1 etc.

john

You're right - my answer was way off. Monday morning .... what can I
say, I need my coffee.
 
Ares Lagae

John said:
I wouldn't describe it as endian dependent, but it's right that the order of
c0 .. c3 is not defined. The OP would have more luck with

union
{
    T c[4];
    struct
    {
        T c0;
        T c1;
        T c2;
        T c3;
    };
};

Now c[0] == c0, c[1] == c1 etc.

Are you saying that if I use

T c0;
T c1;
T c2;
T c3;

instead of

T c0, c1, c2, c3;

that c[i] == ci is guaranteed?

It seems to work on VC++, ICC and GCC.

best regards
Ares Lagae
 
John Harrison

Ares Lagae said:
John said:
I wouldn't describe it as endian dependent, but it's right that the order of
c0 .. c3 is not defined. The OP would have more luck with

union
{
    T c[4];
    struct
    {
        T c0;
        T c1;
        T c2;
        T c3;
    };
};

Now c[0] == c0, c[1] == c1 etc.

Are you saying that if I use

T c0;
T c1;
T c2;
T c3;

instead of

T c0, c1, c2, c3;

that c[i] == ci is guaranteed?

It seems to work on VC++, ICC and GCC.

best regards
Ares Lagae


I'm not sure, but I think the compiler can still add padding bytes between
c0, c1, c2 and c3. So I don't think that you can be sure that c[i] == ci. But
I am sure that the order is c0, c1, c2, c3. With the other way of doing it,
you couldn't even be sure that the order was right.

john
 
Gianni Mariani

John said:
I'm not sure, but I think the compiler can still add padding bytes between
c0, c1, c2 and c3. So I don't think that you can be sure that c[i] == ci. But
I am sure that the order is c0, c1, c2, c3. With the other way of doing it,
you couldn't even be sure that the order was right.


If it adds padding in the struct it will also add padding in the array.

Hence I think it is guaranteed (at least on every implementation I know
of) that ci == c[i].
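
A claim like this can at least be checked per compiler instead of assumed. The following is a minimal sketch, assuming a C++11 compiler for `constexpr` and `static_assert` (neither existed when this thread was written); `Quad` and `quad_matches_array` are hypothetical names. It verifies the layout only. It says nothing about whether reading the other union member is permitted.

```cpp
#include <cstddef>   // offsetof

// Hypothetical template mirroring the struct half of the union.
template <typename T>
struct Quad {
    T c0;
    T c1;
    T c2;
    T c3;
};

// True when Quad<T> is laid out exactly like T[4] on this compiler.
template <typename T>
constexpr bool quad_matches_array() {
    return sizeof(Quad<T>) == sizeof(T[4])
        && offsetof(Quad<T>, c0) == 0 * sizeof(T)
        && offsetof(Quad<T>, c1) == 1 * sizeof(T)
        && offsetof(Quad<T>, c2) == 2 * sizeof(T)
        && offsetof(Quad<T>, c3) == 3 * sizeof(T);
}

// Compilation stops here if the implementation pads Quad<float>.
static_assert(quad_matches_array<float>(),
              "Quad<float> is not laid out like float[4]");
```

With such a check in place, a compiler or option change that alters the layout shows up as a build error rather than as silently wrong data.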
 
Kevin Goodsell

Ares said:
Suppose I do something like this:

template <typename T>
class Foo
{

    union
    {
        T c[4];
        struct
        {
            T c0, c1, c2, c3;
        };
    };

};

with T a built-in type, e.g. float.

Does the standard guarantee that c[i] and ci (i in 0...3) occupy the
same bytes in memory?


As far as I know you have no such guarantee, and even if you did you
would be unable to exploit it in your code without invoking undefined
behavior. In other words, this is a bad idea.

If you explain what you really want to do, I bet we can offer a good
alternative.

-Kevin
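
One possible alternative, sketched under the assumption that the goal is simply to reach the same four values both by index and by name (the original post does not say). It keeps a single array as storage and exposes named accessors, so no union and no layout guarantee is needed. The accessor names are hypothetical.

```cpp
#include <cstddef>   // std::size_t

template <typename T>
class Foo {
public:
    Foo() : c_() {}   // value-initializes all four elements

    // Indexed access, as with the array member of the union.
    T&       operator[](std::size_t i)       { return c_[i]; }
    const T& operator[](std::size_t i) const { return c_[i]; }

    // Named access, as with c0..c3 in the struct member of the union.
    T& c0() { return c_[0]; }
    T& c1() { return c_[1]; }
    T& c2() { return c_[2]; }
    T& c3() { return c_[3]; }

private:
    T c_[4];   // single storage, so there is nothing to keep in sync
};
```

For example, after `Foo<float> f; f.c1() = 2.5f;` the expression `f[1]` refers to the very same object, by construction rather than by layout coincidence.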
 
Oliver S.

I wouldn't describe it as endian dependent, but it's right that the
order of c0 .. c3 is not defined.

The order is defined, as this struct as well as the union are PODs.
 
Kevin Goodsell

Gianni said:
Do you know of a compiler that does not work like this ? I don't.

I think relying on undefined behavior is a bad idea, regardless of what
the compilers I know about do.

-Kevin
 
Jack Klein

Kevin said:
Ares said:
Suppose I do something like this:

template <typename T>
class Foo
{

    union
    {
        T c[4];
        struct
        {
            T c0, c1, c2, c3;
        };
    };

};

with T a built-in type, e.g. float.

Does the standard guarantee that c[i] and ci (i in 0...3) occupy the
same bytes in memory?


As far as I know you have no such guarantee, and even if you did you
would be unable to exploit it in your code without invoking undefined
behavior. In other words, this is a bad idea.


Do you know of a compiler that does not work like this ? I don't.


And have you tried every C++ compiler in the world, with every
available compiler option for each?

Even if every single one of them produced the results you want, the
next compiler or next upgrade to a compiler that worked could deliver
one that does not work.

Exactly as Kevin said, attempting to make use of this is basically
undefined behavior, except in some special circumstances that make it
useless.

As far as we are concerned here, if it involves undefined behavior we
don't care what any compiler, or all compilers, do.

--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++ ftp://snurse-l.org/pub/acllc-c++/faq
 
Gianni Mariani

Jack said:
On 26 Aug 2003 00:45:25 GMT, Gianni Mariani
wrote in comp.lang.c++:
And have you tried every C++ compiler in the world, with every
available compiler option for each?

Even if every single one of them produced the results you want, the
next compiler or next upgrade to a compiler that worked could deliver
one that does not work.

Now you're paranoid.

In all practicality, they can't and you know better. It would break
ABIs and all hell breaks loose. Can't happen.

Jack said:
Exactly as Kevin said, attempting to make use of this is basically
undefined behavior, except in some special circumstances that make it
useless.

Maybe, maybe not. I think more research is needed.

Jack said:
As far as we are concerned here, if it involves undefined behavior we
don't care what any compiler, or all compilers, do.

Oh, so std::vector elements are not necessarily contiguous, then?
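
For comparison, std::vector is a case where the standard itself does give the guarantee: contiguous storage was made a requirement by the 2003 revision of the standard, so relying on it is not relying on undefined behavior. A minimal sketch, with `is_contiguous` a hypothetical helper and T assumed to be any type other than bool:

```cpp
#include <cstddef>   // std::size_t
#include <vector>

// Checks that element i lives at &v[0] + i for every i.
template <typename T>
bool is_contiguous(const std::vector<T>& v) {
    for (std::size_t i = 0; i < v.size(); ++i) {
        if (&v[i] != &v[0] + i) {
            return false;
        }
    }
    return true;
}
```

The difference from the union case is where the guarantee comes from: here it is written into the standard, there it rests on what compilers happen to do.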
 
Gianni Mariani

Kevin said:
I think relying on undefined behavior is a bad idea, regardless of what
the compilers I know about do.

-Kevin

Agreed.

Personally I would write new code differently. However, if I ran across
code like this and I had 500 unrelated bugs in the bug list, I wouldn't
be touching it.
 
Gianni Mariani

Noah said:
Since code like this invokes undefined behavior the only thing broken is
broken code. A compiler writer would not have to worry about making the
above code functional because there is nothing saying that it is.

BTW, the upgrade from gcc-2 to gcc-3 rendered all pre-compiled c++
libraries useless...it happens.

True - that did happen. But the C libraries still survived and this is
one of those things that needs to be consistent across C and C++.

Come to think of it, I have a vague recollection that C requires
struct members to be laid out like this.

There is OODLES of code written in C that depends on this behaviour.
Take the X protocol header files as one example, and this is meant to
work between DIFFERENT compilers!

If someone has a C standard handy, please look up the conventions on how
members must be laid out in a struct. If I remember correctly (and it's
been years and I don't remember the document - it could have been K&R
TCL), members in a struct must be laid out so that the members are
placed in order of declaration. (Padding between members to preserve
alignment is compiler specific.)

I think there are struct layout guarantees but I don't have a reference
handy to give a definitive answer.

.... yep, I need to be convinced. (No threat of me rushing off to write
code that depends on this, I'm just not sure all the facts are on the
table.)
 
Noah Roberts

Gianni said:
True - that did happen. But the C libraries still survived and this is
one of those things that needs to be consistent across C and C++.

Come to think of it, I have a vague recollection that C requires
struct members to be laid out like this.

There is OODLES of code written in C that depends on this behaviour.
Take the X protocol header files as one example, and this is meant to
work between DIFFERENT compilers!

If someone has a C standard handy, please look up the conventions on how
members must be laid out in a struct. If I remember correctly (and it's
been years and I don't remember the document - it could have been K&R
TCL), members in a struct must be laid out so that the members are
placed in order of declaration. (Padding between members to preserve
alignment is compiler specific.)

I think there are struct layout guarantees but I don't have a reference
handy to give a definitive answer.

Some sort of confusion is going on here, I think. It was my
understanding that we were talking about comparing an array, which is
guaranteed to be contiguous, to a structure that has no such guarantee.
It is true that the members are guaranteed to be in order, but there is
nothing that states how far apart they are.

X never, afaik, uses any translation between array and structure. There
are two pieces that I know of where they do something that could be
mistaken for similar, unions such as XEvent, and offsets in Xt. The
XEvent union looks something like this:

typedef union
{
    XMouseEvent mouseEvent;
    XKeyEvent keyEvent;
    ...
} XEvent;

Each of the events in that union is a structure that always starts
with "int type". By my understanding this is the only structure element
that is guaranteed to always look the same no matter how you access it,
though it has been a long time since I used Xlib. Based on this element you
typecast the XEvent to the particular type that it is and access its
elements through the new cast.
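
This leading-member pattern is one case the C++ standard does cover: when the structs in a union share a common initial sequence, that shared part may be read through any of them, whichever one was last written. A minimal sketch with hypothetical event types and tag values (these are not the real Xlib declarations):

```cpp
// Hypothetical event structs; these are not the real Xlib declarations.
struct KeyEvent   { int type; int keycode; };
struct MouseEvent { int type; int x; int y; };

enum { KeyPress = 1, ButtonPress = 2 };   // hypothetical type tags

union Event {
    KeyEvent   key;
    MouseEvent mouse;
};

// Builds an Event whose active member is mouse.
inline Event make_mouse_event(int x, int y) {
    MouseEvent m = { ButtonPress, x, y };
    Event e;
    e.mouse = m;
    return e;
}

// Reads the shared leading member. This is permitted whichever struct is
// active, because both structs begin with the same "int type" member.
inline int event_type(const Event& e) {
    return e.key.type;
}
```

Only the shared `type` member is covered this way. Reading `e.key.keycode` while `mouse` is active would fall outside the guarantee.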

The second is the XOffset (XtOffset?) macro in Xt which looks something
like:

#define XOffset(x, y) ((char*)&((x*)NULL)->y - (char*)NULL)

but there are other definitions because this is apparently not portable.

This is then used to create an OO hierarchy by incrementally adding
offsets as you dive into the object to get to the particular member in
question.

So, back to the original:

union {
    T c[4];
    struct { T c0, c1, c2, c3; };
};

There is nothing that guarantees that &c[i] == &ci even if it is
commonly the case. I am pretty sure that X would not depend on behavior
like this.

NR
 
Gianni Mariani

Noah said:
Gianni Mariani wrote: ....

So, back to the original:

union {
    T c[4];
    struct { T c0, c1, c2, c3; };
};

There is nothing that guarantees that &c[i] == &ci even if it is
commonly the case. I am pretty sure that X would not depend on behavior
like this.



Let's back-track the discussion.

a) I stated that the layout of data in a structure is necessarily stable
because of the requirement to deal with various revisions of compilers
and libraries and that an indication of this would be the X library.

b) I also stated that &c[i] == &ci holds true on all compilers I know.

c) If you combine a and b, it is highly unlikely that you will ever have
a problem with code that assumes &c[i] == &ci, even though there may be no
requirement in the standard for it to hold true.

Counter-arguments:

a) There is no requirement in the standard for &c[i] == &ci to hold true.

b) The layout of members in a struct is compiler specific and may change.

So what would I personally do? I'd keep away from this, but if I was
looking for something to fix, I would not be looking here to begin with.
 
Kevin Goodsell

Gianni said:
Let's back-track the discussion.

a) I stated that the layout of data in a structure is necessarily stable
because of the requirement to deal with various revisions of compilers
and libraries and that an indication of this would be the X library.

b) I also stated that &c[i] == &ci holds true on all compilers I know.

c) If you combine a and b, it is highly unlikely that you will ever have
a problem with code that assumes &c[i] == &ci, even though there may be no
requirement in the standard for it to hold true.

Counter-arguments:

a) There is no requirement in the standard for &c[i] == &ci to hold true.

b) The layout of members in a struct is compiler specific and may change.

So what would I personally do? I'd keep away from this, but if I was
looking for something to fix, I would not be looking here to begin with.


c) It doesn't matter at all whether &c[i] == &ci because any code
attempting to exploit it is undefined. You may not store something in
one union member and access it through a different union member (except
in very specific circumstances). That was what I was talking about all
along.

-Kevin
 
Shane Beasley

You may not store something in one union member and access it through a
different union member (except in very specific circumstances).

Not so fast. Everybody knows and agrees with that. The question is
whether this is one of those exceptional circumstances --
specifically, whether the two members in question happen to be
layout-compatible PODs.

My take: The Standard doesn't forbid an implementation from padding
member fields beyond alignment requirements, but common sense does.

A POD struct can *always* be implemented in terms of an array of its
largest member, and any further padding is just a waste of space.

That is, unless a type needs more padding to work efficiently than to
work correctly. But then, why extraneously pad a struct and not an
array? And how do you explain that to the guy who needs a massive
array of structs? If you're going to choose between time and space,
stick to your choice in all cases. Maybe give the user the right to
choose, but don't mix-and-match (unless you want to add that as an
optional extension, of course).

So, I wouldn't count on this behavior, but (if I cared enough) I'd
look into filing a DR. The only problem is that it's hard to argue
that "because I want to be able to do weird stuff with unions,
structs, and arrays" is sufficient reason to alter the core language.
:)

- Shane
 
Gianni Mariani

Kevin said:
Gianni Mariani wrote:
....


You may not store something in
one union member and access it through a different union member (except
in very specific circumstances).

....

There is a significant amount of code in existence that does otherwise
and seems to behave in a well-defined manner.

Generalizations are always wrong.

:^)
 
