accessing members of union within a struct

Mark · Jul 7, 2012

Hello,

In the snippet below, is there a real benefit to not addressing the
'prefix4'/'prefix6' members of the union directly:

#define IPV4_MAX_BYTELEN 4
#define IPV6_MAX_BYTELEN 16

struct prefix
{
  u_int8_t family;
  u_int8_t prefixlen;
  union
  {
    u_int8_t prefix;
    struct in_addr prefix4;
    struct in6_addr prefix6;
    u_int8_t val[9];
  } u;
};
....
unsigned char key [40];

memcpy (key, obj->u.val, IPV4_MAX_BYTELEN);
memcpy (key, obj->u.val, IPV6_MAX_BYTELEN);

rather than in a more readable way:

memcpy (key, &obj->u.prefix4, IPV4_MAX_BYTELEN);
memcpy (key, &obj->u.prefix6, IPV6_MAX_BYTELEN);

Could you suggest or speculate what the author of this code was trying to
achieve? (I just can't see an obvious explanation for this.)
Thanks in advance.

Mark

Barry Schwarz · Jul 7, 2012

[skip]

#define IPV4_MAX_BYTELEN 4
#define IPV6_MAX_BYTELEN 16

struct prefix
{
u_int8_t family;
u_int8_t prefixlen;
union
{
u_int8_t prefix;
struct in_addr prefix4;
struct in6_addr prefix6;
u_int8_t val[9];

For those compilers that go the extra mile and attempt bounds
checking, shouldn't it be val[16]?

} u;
};
...
unsigned char key [40];

memcpy (key, obj->u.val, IPV4_MAX_BYTELEN);
memcpy (key, obj->u.val, IPV6_MAX_BYTELEN);

rather than in a more readable way:

memcpy (key, &obj->u.prefix4, IPV4_MAX_BYTELEN);
memcpy (key, &obj->u.prefix6, IPV6_MAX_BYTELEN);

Could you suggest or speculate what the author of this code was trying to
achieve? (I just can't see an obvious explanation for this.)

There is obviously some code between the two statements of each set.
Possibly the author was thinking that the compiler would save the
address of val for the second call in the first set. In the second
set, the compiler would have to recognize that prefix4 and prefix6
start at the same address and not recompute it. Perhaps the author's
compiler did not recognize this and he decided to help it along.

On the other hand, he could have just thought he was being clever, in
the pejorative sense of the word.
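
The array-size concern raised above can be checked at compile time. The following is a minimal sketch, not the original author's code: `struct addr4` and `struct addr6` are hypothetical stand-ins for `struct in_addr` and `struct in6_addr` so that it compiles without network headers, and `val` is sized to 16 as suggested rather than 9.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>
#include <stdint.h>

#define IPV4_MAX_BYTELEN 4
#define IPV6_MAX_BYTELEN 16

/* Hypothetical stand-ins for struct in_addr / struct in6_addr. */
struct addr4 { uint8_t bytes[IPV4_MAX_BYTELEN]; };
struct addr6 { uint8_t bytes[IPV6_MAX_BYTELEN]; };

struct prefix {
    uint8_t family;
    uint8_t prefixlen;
    union {
        uint8_t      prefix;
        struct addr4 prefix4;
        struct addr6 prefix6;
        uint8_t      val[IPV6_MAX_BYTELEN];  /* val[9] in the thread */
    } u;
};

/* The byte array must cover the largest address copied through it. */
static_assert(sizeof(((struct prefix *)0)->u.val) >= IPV6_MAX_BYTELEN,
              "val is too small for an IPv6 address");

/* Size of the val member, for inspection at run time. */
size_t prefix_val_size(void)
{
    return sizeof(((struct prefix *)0)->u.val);
}
```

With `val[9]` the assertion would fail to compile, which is the mismatch a bounds-checking compiler might also flag for the 16-byte memcpy.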

Tim Rentsch · Jul 7, 2012

Mark said:
[skip]

unsigned char key [40];

memcpy (key, obj->u.val, IPV4_MAX_BYTELEN);
memcpy (key, obj->u.val, IPV6_MAX_BYTELEN);

rather than in a more readable way:

memcpy (key, &obj->u.prefix4, IPV4_MAX_BYTELEN);
memcpy (key, &obj->u.prefix6, IPV6_MAX_BYTELEN);

Could you suggest or speculate what the author of this code was trying to
achieve? (I just can't see an obvious explanation for this.)

Possibly the author thought accessing the relevant bytes
through an array of character type is safer, in a union
context, than accessing them through a member which does not
have character type, and might not be the same as the member
that was last stored into. (And ignoring that the access
is actually done inside memcpy(), which is another wrinkle.)

In fact, it shouldn't be any safer to use the character
array instead of, e.g., &obj->u.prefix4, but a lot of people
think it is. Also, there are cases (this isn't one of them)
where access using a character type actually is safer than
just using the appropriate member directly, and using a
character array for access could be just a reflexive habit
to protect against that. So those sorts of behaviors might
explain why the author made the choice he (or she) did.
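
To illustrate why the two forms copy the same bytes, here is a minimal sketch. The union and its stand-in member types (`addr_u`, `addr4`, `addr6`) are hypothetical names, not taken from the original code. It relies on the C rule that a pointer to a union, suitably converted, points to each of its members.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

struct addr4 { uint8_t bytes[4]; };   /* hypothetical stand-in for struct in_addr  */
struct addr6 { uint8_t bytes[16]; };  /* hypothetical stand-in for struct in6_addr */

union addr_u {
    struct addr4 prefix4;
    struct addr6 prefix6;
    uint8_t      val[16];
};

/* Returns 1 if copying via val and via prefix4 yields identical bytes. */
int same_bytes_either_way(const union addr_u *u)
{
    unsigned char via_val[4], via_member[4];

    /* Every member of a union starts at the address of the union itself. */
    assert((const void *)u->val == (const void *)&u->prefix4);
    assert((const void *)u->val == (const void *)u);

    memcpy(via_val, u->val, sizeof via_val);
    memcpy(via_member, &u->prefix4, sizeof via_member);
    return memcmp(via_val, via_member, sizeof via_val) == 0;
}
```

Since memcpy() only sees a `const void *` and a length, the member named at the call site does not change which bytes are read.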

Mark · Jul 7, 2012

[skip]

unsigned char key [40];

memcpy (key, obj->u.val, IPV4_MAX_BYTELEN);
memcpy (key, obj->u.val, IPV6_MAX_BYTELEN);

rather than in a more readable way:

memcpy (key, &obj->u.prefix4, IPV4_MAX_BYTELEN);
memcpy (key, &obj->u.prefix6, IPV6_MAX_BYTELEN);

Could you suggest or speculate what the author of this code was trying to
achieve? (I just can't see an obvious explanation for this.)

There is obviously some code between the two statements of each set.
Possibly the author was thinking that the compiler would save the
address of val for the second call in the first set. In the second
set, the compiler would have to recognize that prefix4 and prefix6
start at the same address and not recompute it. Perhaps the author's
compiler did not recognize this and he decided to help it along.

Sorry, the more accurate code snippet is:

if (obj->family == AF_INET)
  memcpy (key, &obj->u.prefix4, IPV4_MAX_BYTELEN);
else
  memcpy (key, &obj->u.prefix6, IPV6_MAX_BYTELEN);

So there is probably no way for the compiler to save the address of val when
using 'memcpy (key, obj->u.val, IPV4_MAX_BYTELEN)', for example.

Mark
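
For the if/else form, here is a minimal sketch of how the branch can select only the length while the source address stays the same. It assumes the address family is stored in the `family` member. `FAMILY_V4`, `FAMILY_V6`, `KEY_LEN` and `prefix_to_key` are hypothetical names introduced for illustration, and the union is reduced to the byte array alone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define IPV4_MAX_BYTELEN 4
#define IPV6_MAX_BYTELEN 16
#define KEY_LEN 40

/* Hypothetical stand-ins for AF_INET / AF_INET6 so the sketch needs no
   socket headers; the real values are platform-specific. */
enum { FAMILY_V4 = 1, FAMILY_V6 = 2 };

struct prefix {
    uint8_t family;
    uint8_t prefixlen;
    union {
        uint8_t val[IPV6_MAX_BYTELEN];  /* sized for the largest address */
    } u;
};

/* Copies the address bytes into key and returns how many were copied. */
size_t prefix_to_key(const struct prefix *p, unsigned char key[KEY_LEN])
{
    assert(p != NULL);
    size_t len = (p->family == FAMILY_V4) ? IPV4_MAX_BYTELEN
                                          : IPV6_MAX_BYTELEN;
    memcpy(key, p->u.val, len);  /* same source address in both cases */
    return len;
}
```

Written this way there is a single memcpy() call, so the question of whether the compiler reuses the address between two calls does not arise.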

Mark · Jul 7, 2012

[skip]

Possibly the author thought accessing the relevant bytes
through an array of character type is safer, in a union
context, than accessing them through a member which does not
have character type

Why would a character type be safer, and in what situations? Could you
please elaborate on this?
Thanks.

Mark

Tim Rentsch · Jul 8, 2012

Mark said:
[skip]

Possibly the author thought accessing the relevant bytes
through an array of character type is safer, in a union
context, than accessing them through a member which does not
have character type

Why would a character type be safer, and in what situations? Could you
please elaborate on this?

Character types differ from other types (or most other types) in
two important ways:

1. The particular type 'unsigned char' never has trap representations.
Most other types (in particular, all other scalar types, not counting
access to unsigned bitfields which really is a different category)
may have trap representations. Reading an object that has a trap
representation (i.e., as seen by the type of the access) is undefined
behavior, even if the access is otherwise legal and well-defined.

2. Access done using any character type is never, in and of itself,
a violation of effective type rules. Access done through a pointer
to any non-character type may result (i.e., in the absence of other
information to the contrary) in a violation of effective type
rules, which therefore would be undefined behavior.

I realize these descriptions have ventured into the more technical
realms of the C standard. Let me say it non-technically: it's
always safe to read any part of any validly accessible memory (in
the C standard sense) using an 'unsigned char' type to do the
access (and also to write, as long as the object in question doesn't
fall under the protection of a 'const' qualifier). For any other
type, an access might be safe or might not, depending on other
things in the program, and on the implementation.
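
As a minimal sketch of that non-technical rule, the helper below reads the bytes of any object through an `unsigned char` pointer. The function name `dump_bytes` is hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copies the object representation of *obj into out, one byte at a time,
   reading only through an unsigned char pointer. */
void dump_bytes(const void *obj, size_t size, unsigned char *out)
{
    assert(obj != NULL && out != NULL);
    const unsigned char *p = obj;   /* character-type view of the object */
    for (size_t i = 0; i < size; i++)
        out[i] = p[i];
}
```

The result is the same as what memcpy() would produce for the same object and size, whatever the object's declared type.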

While I'm thinking of it, let me mention another possible concern,
having to do with pointers. Any pointer value may always be
converted, safely, to a pointer-to-character type. For other
pointer types, converting to them may carry some risk, namely
the risk that the value being converted is not suitably aligned for
the type being converted, which results in -- you guessed it --
undefined behavior. Converting to a pointer-to-character type
doesn't have this risk, because all pointer values are suitably
aligned for being a pointer-to-character type.
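
The alignment concern can be sketched as follows. Rather than converting a byte pointer at an arbitrary offset to `uint32_t *`, which may be misaligned, the bytes are copied into a properly aligned local object. The function name `read_u32_at` is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Reads a uint32_t stored at an arbitrary (possibly odd) offset in buf.
   Casting buf + offset to uint32_t * could yield a misaligned pointer;
   copying into an aligned local avoids that conversion entirely. */
uint32_t read_u32_at(const unsigned char *buf, size_t offset)
{
    assert(buf != NULL);
    uint32_t value;
    memcpy(&value, buf + offset, sizeof value);
    return value;
}
```

The value comes back in the host's byte order, so this sketch says nothing about network versus host ordering.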

In practice, these considerations make very little difference.
But for unions, many people are unsure enough of what the
Standard does guarantee that they fall back to the safest
form of access they know, which is using an unsigned char type
(or sometimes one of the other character types).

88888 Dihedral · Jul 9, 2012

Mark wrote on Sunday, July 8, 2012 at 12:28:54 AM UTC+8:

[skip]

#define IPV4_MAX_BYTELEN 4

This is trivial: a 32-bit integer can hold the content.
Please add the port number in another integer.

[skip]

OK, IPv6 has been around for so many years.

Does every consumer electronics toy, such as mobile phones, iPads, etc.,
really need a unique IPv6 address to be trackable by the government?
