union of structs: common variable stored in same address?

Rui Maciel

Consider the following code:

<code>
struct S1 {
size_t type;
// other declarations
};

struct S2 {
size_t type;
// other declarations
};

//snip. an unspecified number of structs S?

struct Sn {
size_t type;
// other declarations
};

typedef union {
struct S1 *s1;
struct S2 *s2;
// a pointer to each struct S? that has been declared
struct Sn *sn;
} cursor_t;

void foo(void)
{
cursor_t cursor;
struct S1 bar;
// init bar

cursor.s1 = &bar;
}
</code>


Does the C language guarantee that cursor.s2->type through cursor.sn->type will refer to bar.type? If so, will
that still be the case if the programmer assigns &bar to cursor.s2 instead of cursor.s1?


Thanks in advance,
Rui Maciel
 
Barry Schwarz

Rui Maciel said:
Consider the following code:

<code>
struct S1 {
size_t type;
// other declarations
};

struct S2 {
size_t type;
// other declarations
};

//snip. an unspecified number of structs S?

struct Sn {
size_t type;
// other declarations
};

typedef union {
struct S1 *s1;
struct S2 *s2;
// a pointer to each struct S? that has been declared
struct Sn *sn;
} cursor_t;

void foo(void)
{
cursor_t cursor;
struct S1 bar;
// init bar

cursor.s1 = &bar;
}
</code>


Does the C language guarantee that cursor.s2->type through cursor.sn->type will refer to bar.type? If so, will
that still be the case if the programmer assigns &bar to cursor.s2 instead of cursor.s1?

Assigning &bar to cursor.s2 without a cast is a constraint violation.
The language guarantees that all pointers to struct have the same size
and representation. It also guarantees that there is no padding
before the first member of a struct. But even with a cast, you still
have no guarantee.

It is entirely possible for the alignment of struct S3 to be more
restrictive than the alignment of struct S1. If, for example, struct
S1 consists entirely of int members while struct S3 has a double, then
on many systems a struct S1 object would need to be aligned on a
multiple of four while a struct S3 would need to be aligned on a
multiple of 8. If bar is aligned on an odd multiple of four, then
cursor.s3 contains an invalid value and attempting to dereference it
is probably undefined behavior (as a natural extrapolation of the fact
that evaluating (struct S3*)cursor.s1 is specified as undefined
behavior in the standard).
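
A minimal sketch of the alignment point, assuming a C11 compiler (for alignof) and an implementation that provides uintptr_t. The member lists of struct S1 and struct S3 and the helper is_aligned_for are made up for illustration. The actual alignments are implementation-defined, so treat this as a way to inspect your own platform rather than a statement about what the standard requires.

```c
#include <assert.h>
#include <stdalign.h>   /* C11: alignof */
#include <stddef.h>
#include <stdint.h>     /* uintptr_t (optional type, assumed available) */

/* Hypothetical stand-ins for two of the thread's structs. */
struct S1 { size_t type; int a; };
struct S3 { size_t type; long double ld; };

/* Returns 1 if p is a multiple of align when viewed as an integer.
   The pointer-to-integer mapping is implementation-defined, so this
   is a practical check, not a portable guarantee. */
static int is_aligned_for(const void *p, size_t align)
{
    return (uintptr_t)p % align == 0;
}
```

If is_aligned_for(&bar, alignof(struct S3)) returns 0 for some struct S1 bar, then converting &bar to struct S3 * is exactly the problematic case described above.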

Remember, if you lie to the compiler, it will get its revenge.
 
Rui Maciel

Barry said:
Assigning &bar to cursor.s2 without a cast is a constraint violation.
The language guarantees that all pointers to struct have the same size
and representation. It also guarantees that there is no padding
before the first member of a struct. But even with a cast, you still
have no guarantee.

It is entirely possible for the alignment of struct S3 to be more
restrictive than the alignment of struct S1.

In that case, what about the following code:

<code>
// each type is used to refer to each struct S? declaration
enum s_type {type_1, type_2, ... , type_n};

// snip struct declarations

void foo(void)
{
cursor_t cursor;
struct S1 bar;
// init bar

cursor.s2 = &bar;
switch(cursor.s2->type)
{
case type_1:
// use cursor.s1
break;

case type_2:
// use cursor.s2
break;
// snip
}
}
</code>

In this case, as there is no padding before the first element of the struct and every struct which will be
referenced by cursor has in common the type of its first element, then will it be possible to verify the
struct type with a union member that doesn't correspond to the passed struct and then, from the value
stored in type, use the corresponding union member?


Remember, if you lie to the compiler, it will get its revenge.

But this appears to be a very clever lie. The compiler should be amused by such a ruse :D


Rui Maciel
 
Barry Schwarz

Rui Maciel said:
In that case, what about the following code:

<code>
// each type is used to refer to each struct S? declaration
enum s_type {type_1, type_2, ... , type_n};

// snip struct declarations

void foo(void)
{
cursor_t cursor;
struct S1 bar;
// init bar

cursor.s2 = &bar;
</code>

This is still a constraint violation. Did you mean to use cursor.s1,
make bar a struct S2, or use a cast?

I will assume that somewhere you actually put values in bar.

<code>
switch(cursor.s2->type)
</code>

The problem has not gone away. In fact it is unchanged. If any of
the struct S2, S3, ..., Sn has a more demanding alignment than struct
S1 and if bar is not properly aligned for the most restrictive, then
you will have undefined behavior, possibly in the assignment and
switch statements above, and definitely in at least one of the case
groups.

<code>
{
case type_1:
// use cursor.s1
break;

case type_2:
// use cursor.s2
break;
// snip
}
}
</code>

In this case, as there is no padding before the first element of the struct and every struct which will be
referenced by cursor has in common the type of its first element, then will it be possible to verify the
struct type with a union member that doesn't correspond to the passed struct and then, from the value
stored in type, use the corresponding union member?

Not if the value being extracted from the union does not meet the
requirements imposed on its type.

If the alignment for struct S1 is 4 and for S2 is 8 and the
address of bar is 0x100004, then your assignment to cursor.s2 invokes
undefined behavior as does your attempt to evaluate cursor.s2->type.

If the alignment for struct S1 and S2 is 4 and for S3 is 8 and
the address of bar is 0x100004, then you will not experience undefined
behavior until you attempt to dereference cursor.s3 (in the implied
case type_3 group).

But this appears to be a very clever lie. The compiler should be amused by such a ruse :D

Right up until the time you try to test the code in front of your most
important customer. And appearances are often deceiving.

Instead of (possibly in addition to) imbedding your pointers in a
union, why not imbed the structures also in a (probably different)
union and avoid the alignment issues completely.
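
One way to read that suggestion, as a sketch only: put the structs themselves in a union, so that any object of the union type is sized and aligned for its most demanding member. The member lists, the name any_s, the helper store_s2 and the tag value 2 are assumptions for illustration; static_assert and alignof need C11.

```c
#include <assert.h>     /* static_assert (C11) */
#include <stdalign.h>   /* alignof (C11) */
#include <stddef.h>

/* Hypothetical member lists; only the leading size_t comes from the thread. */
struct S1 { size_t type; int i; };
struct S2 { size_t type; double d; };
struct S3 { size_t type; long double ld; };

/* Storage that can hold any one of the structs. */
union any_s {
    struct S1 s1;
    struct S2 s2;
    struct S3 s3;
};

/* The union is at least as strictly aligned and as large as each member. */
static_assert(alignof(union any_s) >= alignof(struct S1), "aligned for S1");
static_assert(alignof(union any_s) >= alignof(struct S2), "aligned for S2");
static_assert(alignof(union any_s) >= alignof(struct S3), "aligned for S3");
static_assert(sizeof(union any_s) >= sizeof(struct S3), "large enough for S3");

/* Makes the union hold a struct S2 and returns a pointer to that member. */
static struct S2 *store_s2(union any_s *u, double d)
{
    u->s2.type = 2;   /* hypothetical tag value for struct S2 */
    u->s2.d = d;
    return &u->s2;
}
```

A pointer obtained this way always refers to storage that is suitably aligned for every struct in the union, which is what the pointer-only cursor could not promise.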
 
Richard Bos

Barry Schwarz said:
Assigning &bar to cursor.s2 without a cast is a constraint violation.
The language guarantees that all pointers to struct have the same size
and representation. It also guarantees that there is no padding
before the first member of a struct. But even with a cast, you still
have no guarantee.

Note, however, that if you pull this trick, not with a union of struct
pointers, but with a union of the structs themselves, then, as long as
the initial members are of identical types, it is required to work
(6.5.2.3 #5).
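
A small sketch of what that guarantee allows, assuming two structs that share a leading size_t member. The payload members, the enum values, the name node and the helper node_value are invented for illustration and are not taken from the original code.

```c
#include <assert.h>
#include <stddef.h>

enum s_type { type_1, type_2 };   /* hypothetical tag values */

struct S1 { size_t type; int count; };
struct S2 { size_t type; double ratio; };

/* A union of the structs themselves, not of pointers to them. */
union node {
    struct S1 s1;
    struct S2 s2;
};

/* Inspects the shared leading member through s1 no matter which
   struct was stored, then reads the member that the tag names. */
static double node_value(const union node *n)
{
    switch (n->s1.type) {
    case type_1: return n->s1.count;
    case type_2: return n->s2.ratio;
    default:     return -1.0;      /* unknown tag */
    }
}
```

The tag is read through s1 even when s2 was the member last stored; the common initial sequence rule is what is being relied on there, and the union declaration must be visible at the point of access.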

Richard
 
Rui Maciel

Richard said:
Note, however, that if you pull this trick, not with a union of struct
pointers, but with a union of the structs themselves, then, as long as
the initial members are of identical types, it is required to work
(6.5.2.3 #5).

So, considering the following code:

<code>
typedef union {
size_t type; // new, extra-struct type
struct S1 s1;
struct S2 s2;
// one member for each struct S? that has been declared
struct Sn sn;
} cursor_t;


int main(void)
{
cursor_t *cursor;
struct S1 bar;

// init bar

cursor = (cursor_t*)&bar;

switch(cursor->type)
{
// perform stuff specific for each struct type
}

return 0;
}
</code>


Would this be valid?


Best regards,
Rui Maciel
 
Ben Bacarisse

Rui Maciel said:
So, considering the following code:

<code>
typedef union {
size_t type; // new, extra-struct type
struct S1 s1;
struct S2 s2;
// one member for each struct S? that has been declared
struct Sn sn;
} cursor_t;
</code>

This is not what Richard was suggesting. You now have a union that
contains /either/ a size_t /or/ one of the struct types. Keep the
type member in the structs and remove it from the union and you have
what Richard proposed.
 
Alan Curry

Ben Bacarisse said:
This is not what Richard was suggesting. You now have a union that
contains /either/ a size_t /or/ one of the struct types. Keep the
type member in the structs and remove it from the union and you have
what Richard proposed.

If all the structs start with a size_t type, there's no harm in adding it to
the union as a separate alternative. If they don't, you've got a disaster
regardless.

The definition above looks a lot like an XEvent from X11/Xlib.h, which
starts out:

typedef union _XEvent {
int type; /* must not be changed; first element */
XAnyEvent xany;
XKeyEvent xkey;
XButtonEvent xbutton;
....

The code using this union often looks like

/* get a pointer-to-union ev from somewhere */
switch(ev->type) {
case KeyPress: /* do something with ev->xkey */
case ButtonRelease: /* do something with ev->xbutton */
}

Without the standalone "int type" in the union, you'd have to do the initial
test with one of the structs:

switch(ev->xkey.type) {
case KeyPress: /* do something with ev->xkey */
case ButtonRelease: /* do something with ev->xbutton */
}

And that just looks weird, peeking into the one member of the union and then
deciding to read from a different one.
 
Ben Bacarisse

Alan Curry said:
If all the structs start with a size_t type, there's no harm in adding it to
the union as a separate alternative. If they don't, you've got a disaster
regardless.

I had not considered the possibility that the OP had duplicated rather
than moved the type member. However, I disagree that duplicating it
in the union is harmless. It's harmless if you never use it, yes, but
your example shows that was not what you meant.

The definition above looks a lot like an XEvent from X11/Xlib.h, which
starts out:

typedef union _XEvent {
int type; /* must not be changed; first element */
XAnyEvent xany;
XKeyEvent xkey;
XButtonEvent xbutton;
...

The code using this union often looks like

/* get a pointer-to-union ev from somewhere */
switch(ev->type) {
case KeyPress: /* do something with ev->xkey */
case ButtonRelease: /* do something with ev->xbutton */
}

This is likely to work, yes, but I don't think it is guaranteed to do
what the programmer expects. I think it is permissible for an
implementation to align structs and size_ts in such a way that the
type members don't coincide. That is a practical matter. More
formally, I can't see any text in the standard that assures me that
will work as expected.

Without the standalone "int type" in the union, you'd have to do the initial
test with one of the structs:

switch(ev->xkey.type) {
case KeyPress: /* do something with ev->xkey */
case ButtonRelease: /* do something with ev->xbutton */
}

And that just looks weird, peeking into the one member of the union and then
deciding to read from a different one.

Yes, it looks odd but I think it is needed. You can make it less odd
by having a struct that has only a type member whose name suggests
that its purpose is somehow universal: ev->all_events.type.
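
A sketch of that idea, with every name invented for the example (any_event, key_event, button_event, the EV_ constants and event_detail are not Xlib names). The tag is read through a struct that holds nothing but the shared leading member.

```c
#include <assert.h>

enum { EV_KEY = 1, EV_BUTTON = 2 };   /* hypothetical event codes */

struct any_event    { int type; };              /* only the shared tag */
struct key_event    { int type; int keycode; };
struct button_event { int type; int button; };

union event {
    struct any_event    all_events;
    struct key_event    key;
    struct button_event button;
};

/* Dispatches on the tag via all_events, then reads the matching member. */
static int event_detail(const union event *ev)
{
    switch (ev->all_events.type) {
    case EV_KEY:    return ev->key.keycode;
    case EV_BUTTON: return ev->button.button;
    default:        return -1;   /* unrecognised event type */
    }
}
```

Because all three structs begin with int type, the read through all_events falls under the common initial sequence rule, so no free-standing int member is needed in the union.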
 
Alan Curry

(e-mail address removed) (Alan Curry) said:
The definition above looks a lot like an XEvent from X11/Xlib.h, which
starts out:

typedef union _XEvent {
int type; /* must not be changed; first element */
XAnyEvent xany;
XKeyEvent xkey;
XButtonEvent xbutton;
[...]

This is likely to work, yes, but I don't think it is guaranteed to do
what the programmer expects. I think it is permissible for an
implementation to align structs and size_ts in such a way that the
type members don't coincide. That is a practical matter. More
formally, I can't see any text in the standard that assures me that
will work as expected.

xkey.type is the first member of a struct, so it has the same address as the
struct, which is also the address of the union, which is also the address of
the union's type member. The struct if allocated alone might have a different
alignment requirement than the integer if allocated alone, but the union
has to be aligned for all of its members. Where's the loophole?
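
The chain of equal addresses can be checked on a given implementation with something like the sketch below. The struct members, the name tagged_t and the helper tags_coincide are assumptions for illustration. A result of 1 only shows what one compiler does; it does not settle what the standard requires.

```c
#include <assert.h>
#include <stddef.h>

struct S1 { size_t type; int payload; };
struct S2 { size_t type; double payload; };

/* Union with a free-standing tag next to the structs. */
typedef union {
    size_t type;
    struct S1 s1;
    struct S2 s2;
} tagged_t;

/* Returns 1 when the union, its standalone tag, and the leading
   member of each struct all compare equal as addresses. */
static int tags_coincide(const tagged_t *u)
{
    const void *base = u;
    return base == (const void *)&u->type
        && base == (const void *)&u->s1
        && base == (const void *)&u->s1.type
        && base == (const void *)&u->s2.type;
}
```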
 
James Kuyper

Alan said:
The definition above looks a lot like an XEvent from X11/Xlib.h, which
starts out:

typedef union _XEvent {
int type; /* must not be changed; first element */
XAnyEvent xany;
XKeyEvent xkey;
XButtonEvent xbutton;
[...]
This is likely to work, yes, but I don't think it is guaranteed to do
what the programmer expects. I think it is permissible for an
implementation to align structs and size_ts in such a way that the
type members don't coincide. That is a practical matter. More
formally, I can't see any text in the standard that assures me that
will work as expected.

xkey.type is the first member of a struct, so it has the same address as the
struct, which is also the address of the union, which is also the address of
the union's type member.

6.7.2.1p13 says "A pointer to a structure object, suitably converted,
points to its initial member (or if that member is a bit-field, then to
the unit in which it resides), and vice versa."

However, there's no corresponding wording for unions.

... The struct if allocated alone might have a different
alignment requirement than the integer if allocated alone, but the union
has to be aligned for all of its members. Where's the loophole?

The standard permits padding before union members, though I know of no
reason why any implementation would take advantage of that option.
 
Eric Sosman

James said:
[...]
The standard permits padding before union members, though I know of no
reason why any implementation would take advantage of that option.

6.7.2.1p14: "[...] A pointer to a union object, suitably
converted, points to each of its members (or if a member is a
bitfield, then to the unit in which it resides), and vice versa."

Where does this leave room for (sorry) padding?

(To forestall a possible objection: I do not believe that
"suitably converted" can be used to smuggle in any sleight of
hand like invisibly adjusting the pointer to hide padding.)
 
jameskuyper

Eric said:
James said:
[...]
The standard permits padding before union members, though I know of no
reason why any implementation would take advantage of that option.

6.7.2.1p14: "[...] A pointer to a union object, suitably
converted, points to each of its members (or if a member is a
bitfield, then to the unit in which it resides), and vice versa."

OK - now I'm feeling pretty stupid. I looked for that text, failed to
find it despite the fact that it immediately follows 6.7.2.1p13,
which I've already cited. Can I claim that this was due to the
lingering aftermath of the flu I just recovered from? :-(.

Where does this leave room for (sorry) padding?

(To forestall a possible objection: I do not believe that
"suitably converted" can be used to smuggle in any sleight of
hand like invisibly adjusting the pointer to hide padding.)

Unfortunately, due to what I consider a defect in the standard, I do
believe that it can.

It is widely understood that a permissible conversion of a pointer to
a given object results in a new pointer value that points at an object
whose initial byte is the same as the initial byte of the object
pointed at by the original pointer - but the standard never actually
says so, except when the destination type is a pointer to a character
type (6.3.2.3p7).

Using Rui Maciel's cursor_t, given

cursor_t ct;

Then 6.7.2.1p13 says that (int*)&ct == &ct.type, while 6.3.2.3p7 says
that (char*)&ct.type must point at the first byte of ct.type, and that
(char*)&ct must point at the first byte of ct. It might seem that you
can combine these facts to prove that (char*)&ct and (char*)&ct.type
point at the same byte. However, in order to do that, you have to
assume that (char*)(int*)&ct == (char*)&ct; and the standard provides
no guarantees that this will be the case.

The only time that the standard says anything about the result of
chaining two pointer conversions together is when the second
conversion converts back to the type of the original pointer; in some
of those cases, it guarantees that the result will compare equal to
the original pointer value. None of those cases apply here.
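
For anyone who wants to see what their own implementation does with the chained conversion, here is a sketch. It uses size_t * rather than int * because the type member of cursor_t is a size_t; the struct members and the helper chained_conversion_agrees are made up for illustration. A result of 1 is an observation about one compiler, not evidence about what the standard guarantees.

```c
#include <assert.h>
#include <stddef.h>

struct S1 { size_t type; int payload; };
struct S2 { size_t type; double payload; };

typedef union {
    size_t type;
    struct S1 s1;
    struct S2 s2;
} cursor_t;

/* Compares the direct conversion to char * with the conversion that
   goes through size_t * first, and with the address of the member. */
static int chained_conversion_agrees(cursor_t *ct)
{
    char *direct  = (char *)ct;
    char *chained = (char *)(size_t *)ct;
    char *member  = (char *)&ct->type;
    return direct == chained && chained == member;
}
```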
 
Rui Maciel

Ben said:
This is not what Richard was suggesting. You now have a union that
contains /either/ a size_t /or/ one of the struct types. Keep the
type member in the structs and remove it from the union and you have
what Richard proposed.

Yes, I was aware that the presence of a size_t in the union was never suggested by Richard. Nonetheless, if a
pointer to a union object points to each of its members and a pointer to a structure object points to its
initial member then, from the code that I've provided and according to the C standard, shouldn't cursor->type
point to the exact same size_t member of any of the structs' objects being considered?


Rui Maciel
 
Tim Rentsch

Rui Maciel said:
Yes, I was aware that the presence of a size_t in the union was
never suggested by Richard. Nonetheless, if a pointer to a union
object points to each of its members and a pointer to a
structure object points to its initial member then, from the
code that I've provided and according to the C standard,
shouldn't cursor->type point to the exact same size_t member of
any of the structs' objects being considered?

Yes, the Standard does say just that.
 
Tim Rentsch

Ben Bacarisse said:
(e-mail address removed) (Alan Curry) writes:
[snip]
The definition above looks a lot like an XEvent from X11/Xlib.h, which
starts out:

typedef union _XEvent {
int type; /* must not be changed; first element */
XAnyEvent xany;
XKeyEvent xkey;
XButtonEvent xbutton;
...

The code using this union often looks like

/* get a pointer-to-union ev from somewhere */
switch(ev->type) {
case KeyPress: /* do something with ev->xkey */
case ButtonRelease: /* do something with ev->xbutton */
}

This is likely to work, yes, but I don't think it is guaranteed to do
what the programmer expects. I think it is permissible for an
implementation to align structs and size_ts in such a way that the
type members don't coincide. That is a practical matter. More
formally, I can't see any text in the standard that assures me that
will work as expected.

What Alan said -- all members of a union are aligned at the
beginning of the union, and the first member of a struct is
aligned at the beginning of the struct.

Yes, it looks odd but I think it is needed. You can make it less odd
by having a struct that has only a type member whose name suggests
that its purpose is somehow universal: ev->all_events.type.

Why do you think it's needed? Given what the Standard says
(6.7.2.1, among other places) about member alignment in the
two cases, aren't the two objects guaranteed to be in the
same place? Or do you think the access 'ev->type' is suspect
for some other reason?
 
Tim Rentsch

James Kuyper said:
Alan said:
(e-mail address removed) (Alan Curry) writes:
The definition above looks a lot like an XEvent from X11/Xlib.h, which
starts out:

typedef union _XEvent {
int type; /* must not be changed; first element */
XAnyEvent xany;
XKeyEvent xkey;
XButtonEvent xbutton; [...]
This is likely to work, yes, but I don't think it is guaranteed to do
what the programmer expects. I think it is permissible for an
implementation to align structs and size_ts in such a way that the
type members don't coincide. That is a practical matter. More
formally, I can't see any text in the standard that assures me that
will work as expected.

xkey.type is the first member of a struct, so it has the same address as the
struct, which is also the address of the union, which is also the address of
the union's type member.

6.7.2.1p13 says "A pointer to a structure object, suitably converted,
points to its initial member (or if that member is a bit-field, then
to the unit in which it resides), and vice versa."

However, there's no corresponding wording for unions.

The very next paragraph covers the union case.

... The struct if allocated alone might have a different
alignment requirement than the integer if allocated alone, but the union
has to be aligned for all of its members. Where's the loophole?

The standard permits padding before union members, [snip]

Essentially no one other than you believes this. Doesn't
this suggest that how you read the Standard is in need of
reconsideration?
 
Tim Rentsch

jameskuyper said:
Eric said:
James said:
[...]
The standard permits padding before union members, though I know of no
reason why any implementation would take advantage of that option.

6.7.2.1p14: "[...] A pointer to a union object, suitably
converted, points to each of its members (or if a member is a
bitfield, then to the unit in which it resides), and vice versa."

OK - now I'm feeling pretty stupid. I looked for that text, failed to
find it despite the fact that it immediately follows 6.7.2.1p13,
which I've already cited. Can I claim that this was due to the
lingering aftermath of the flu I just recovered from? :-(.

Where does this leave room for (sorry) padding?

(To forestall a possible objection: I do not believe that
"suitably converted" can be used to smuggle in any sleight of
hand like invisibly adjusting the pointer to hide padding.)

Unfortunately, due to what I consider a defect in the standard, I do
believe that it can.

It is widely understood that a permissible conversion of a pointer to
a given object results in a new pointer value that points at an object
whose initial byte is the same as the initial byte of the object
pointed at by the original pointer - but the standard never actually
says so, [except for character types].

What you mean is the Standard never says this directly.
Essentially everyone other than you believes it's implied by
other statements in the Standard. What makes you think your
interpretation is right and all those other people are wrong?

Using Rui Maciel's cursor_t, given

cursor_t ct;

Then 6.7.2.1p13 says that (int*)&ct == &ct.type, while 6.3.2.3p7 says
that (char*)&ct.type must point at the first byte of ct.type, and that
(char*)&ct must point at the first byte of ct. It might seem that you
can combine these facts to prove that (char*)&ct and (char*)&ct.type
point at the same byte. However, in order to do that, you have to
assume that (char*)(int*)&ct == (char*)&ct; and the standard provides
no guarantees that this will be the case.

The only time that the standard says anything about the result of
chaining two pointer conversions together is when the second
conversion converts back to the type of the original pointer; in some
of those cases, it guarantees that the result will compare equal to
the original pointer value. None of those cases apply here.

Again, the only time it says anything directly. The
Standard doesn't always express itself in direct language.
Isn't it more likely that you've misunderstood some
other part of the Standard than that this point has been
missed by everyone else who's looked at it?
 
jameskuyper

Tim said:
It is widely understood that a permissible conversion of a pointer to
a given object results in a new pointer value that points at an object
whose initial byte is the same as the initial byte of the object
pointed at by the original pointer - but the standard never actually
says so, [except for character types].

What you mean is the Standard never says this directly.
Essentially everyone other than you believes it's implied by
other statements in the Standard. What makes you think your
interpretation is right and all those other people are wrong?

I have been unable to identify a valid argument derived from the
actual requirements of the standard to demonstrate that this
conclusion is implied by those requirements. I've discussed that
opinion not just once, but many times, and no one who has disagreed
has ever presented such an argument, either (though not for want of
trying). Do I need anything more than that to justify my conclusion
that no such argument is possible?

What does the number of people who disagree with me have to do with
that?

....
Again, the only time it says anything directly. The
Standard doesn't always express itself in direct language.
Isn't it more likely that you've misunderstood some
other part of the Standard than that this point has been
missed by everyone else who's looked at it?

If I had never discussed this issue publicly before, and never seen
the opposing "arguments" before, I might be more willing to consider
that possibility. However, when people with a great deal of knowledge
of the standard, who believe strongly that I'm wrong about this, are
unable to articulate valid arguments based upon correct premises to
support their belief, I think I'm entitled to a little more confidence
in my understanding of this issue than you think I should have.

If you consider that arrogance, so be it.
 
Ben Bacarisse

Tim Rentsch said:
Ben Bacarisse said:
(e-mail address removed) (Alan Curry) writes:
[snip]
The definition above looks a lot like an XEvent from X11/Xlib.h, which
starts out:

typedef union _XEvent {
int type; /* must not be changed; first element */
XAnyEvent xany;
XKeyEvent xkey;
XButtonEvent xbutton;
...

The code using this union often looks like

/* get a pointer-to-union ev from somewhere */
switch(ev->type) {
case KeyPress: /* do something with ev->xkey */
case ButtonRelease: /* do something with ev->xbutton */
}

This is likely to work, yes, but I don't think it is guaranteed to do
what the programmer expects. I think it is permissible for an
implementation to align structs and size_ts in such a way that the
type members don't coincide. That is a practical matter. More
formally, I can't see any text in the standard that assures me that
will work as expected.

What Alan said -- all members of a union are aligned at the
beginning of the union, and the first member of a struct is
aligned at the beginning of the struct.

Honestly I am not yet persuaded. I've been busy so I have not had
time to think this through but the trouble is I am not 100% sure that
the wording in the standard is watertight.

I don't think it makes the assurance you state about alignment, at
least not directly. What it does is say that a pointer, suitably
converted, points to all union members.

I can't quite shake the fear that some peculiar addressing system
allows size_t and structs (even ones that start with a size_t member)
to be differently aligned whilst permitting the pointers to work as
required due to the conversion. However, off and on over the last few
days I've tried to come up with a set of mappings between the various
address types that give the effect I am thinking of and I can't!
Every time, the mappings fall foul of some requirement or other or
they simply put the size_t members in the same place (as one would
expect). It is only a nagging doubt that prevents me from saying,
"no, I fold".

<snip>
 
