offsetOf

J

Jack Boot

Hello:

This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

Cheers,
JB
 
I

Ian Collins

Hello:

This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

Where's the chunder bucket?

Why not

int* ip = foo->a;
I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

It's a horrible mess no matter how you look at it. What is ip used for?
 
B

Ben Bacarisse

Jack Boot said:
This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

Nothing but obfuscation. offsetOf(struct foo, a) is 0 (it can't be
otherwise) so this just converts to a char * first. This won't hurt but
it won't help either. This:

int *ip = f->a;

is, to my mind, the clearest way to write it. I can't see why it would
be done with any casts at all.
 
E

Eric Sosman

Hello:

This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;

I guess you mean `f' here.
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

I don't think the reviewer is concerned about portability, because
the code as it stands is entirely portable: A pointer to a struct can
be converted to a pointer to its first element, and vice versa, and
nothing will go wrong.

BUT what happens if somebody adds a new element `x' to the struct
and puts it before `a', so `a' is no longer the first element? The
code becomes "portably incorrect," and the compiler is unlikely to warn
you about the error.

If what you need is a pointer to the start of the `a' array, use

int *ip = f->a;

This construct will find `a' no matter where it sits in the struct --
and has the further virtue that the the compiler will squawk if somebody
changes `a' to an array of `long' instead of `int'.

If you also require (for some reason not apparent in the code as
shown) that `a' be first, you can

assert(offsetof(struct foo, a) == 0);

.... or use a "compile-time assertion" (GIYF) to make the same test.
 
J

J. J. Farrell

Jack said:
Hello:

This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

What is "offserOf"? Did they mean "offsetof"? I'll assume so.

Frankly I'd find a reviewer who has a clue and lose this one. Your
original is valid and portable, though it could arguably be improved in
style and readability. This proposed improvement is obfuscatory nonsense
which does nothing.

offsetof(struct foo, a) is always 0, by definition, so we can remove one
level of obfuscation. This gets us to:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + 0 ) );

which simplifies after removal of pointless bits and obfuscatory
redundant brackets to

int * ip = (int *)(unsigned char *)foo;

which is a silly way of writing

int * ip = (int *)foo;

which means it is no different from your version, apart from taking a
lot more typing and being a lot more prone to misunderstanding.

My comment on your code would have been along the lines of:

Why the cast? It's best if casts are only used where absolutely
necessary, usually indicating something which is non-portable or
otherwise dodgy. What you're doing here is entirely portable and valid,
so best to do it cleanly without a cast. You want ip to point to the int
which you know is at the start of array a which is at the beginning of
struct foo, so why not just

int *ip = foo->a;
 
J

Jack Boot

Ben said:
Jack Boot said:
This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

Nothing but obfuscation. offsetOf(struct foo, a) is 0 (it can't be
otherwise) so this just converts to a char * first. This won't hurt but
it won't help either. This:

int *ip = f->a;

is, to my mind, the clearest way to write it. I can't see why it would
be done with any casts at all.

Thanks for all the replies.

I wrote the code with the cast because I used struct foo as an Opaque Type
in that file (only using pointers, the header with the struct definition
not #include'd) to emulate C++-style encapsulation. Therefore I could not
dereference f to get f->a.

The reviewer was concerned that there might be padding before a. I'm not
sure any compiler would need to insert padding around int types, though I
guess it's possible on a 64-bit arch with 32-bit ints. Anyways, to use his
offsetOf solution you need to include the struct def, so in that case I
agree with all you guys that f->a would be preferable.

Cheers,
JB
 
I

Ian Collins

Ben said:
Jack Boot said:
This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

Nothing but obfuscation. offsetOf(struct foo, a) is 0 (it can't be
otherwise) so this just converts to a char * first. This won't hurt but
it won't help either. This:

int *ip = f->a;

is, to my mind, the clearest way to write it. I can't see why it would
be done with any casts at all.

Thanks for all the replies.

I wrote the code with the cast because I used struct foo as an Opaque Type
in that file (only using pointers, the header with the struct definition
not #include'd) to emulate C++-style encapsulation. Therefore I could not
dereference f to get f->a.

The reviewer was concerned that there might be padding before a.

There won't be. As Eric said: A pointer to a struct can be converted to
a pointer to its first element, and vice versa.
 
P

Phil Carmody

Ben Bacarisse said:
Jack Boot said:
This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

Nothing but obfuscation.

That's kind to it.
offsetOf(struct foo, a) is 0 (it can't be
otherwise) so this just converts to a char * first. This won't hurt but
it won't help either. This:

int *ip = f->a;

I have this fetish for wanting addresses to explicitly be addresses
(the result of applying the 'address of' operator), so my default
would be that which explicitly says 'the address of the first element':

int *ip = &f->a[0];

I'm consistent with this, I also like to pass functions around with
an explicit '&' before their name.

The weird thing is that I'm usually one who favours brevity, but for
addresses, I like a clear flag that reminds me what I'm dealing with.
is, to my mind, the clearest way to write it. I can't see why it would
be done with any casts at all.

100% agree. Casts are usually a warning sign.

Phil
 
B

Ben Bacarisse

Phil Carmody said:
Ben Bacarisse said:
Jack Boot <[email protected]> writes:
struct foo {
int a[10];
long b;
...
};
void
bar ( struct foo * f )
{
int *ip = f->a;

I have this fetish for wanting addresses to explicitly be addresses
(the result of applying the 'address of' operator), so my default
would be that which explicitly says 'the address of the first element':

int *ip = &f->a[0];

I'm consistent with this,

How many of these are there in your code:

const char *digits = &"0123456789"[0];
...
strcmp(&"string"[0], str) == 0

?

<snip>
 
J

James Kuyper

On 07/24/2011 06:30 AM, Jack Boot wrote:
....
I wrote the code with the cast because I used struct foo as an Opaque Type
in that file (only using pointers, the header with the struct definition
not #include'd) to emulate C++-style encapsulation. Therefore I could not
dereference f to get f->a.

When you use opaque types, translation units which work with that type
should fall into two categories:

1. Code which handles the type as opaque. Such code only needs a
declaration of the type of the form:
struct my_opaque_type;
It's the use of such a declaration that makes this an opaque type.

2. Code which looks inside the type. Such code should have a full
definition of the data type. This definition should be #included from a
single common header file.

The code you're writing is solidly in the second category, but for some
reason you're trying to write it without the full definition of the
type. That's the fundamental mistake; the dubious code you're writing is
a normal consequence of that bad decision, and is in fact the reason why
that decision was a bad one.
The reviewer was concerned that there might be padding before a.

There cannot be - but that's not the relevant issue. In the future, the
definition of the type might be changed so that something does come
before 'a', and that is a serious problem for your code, but the
something that comes before 'a' has to include at least one real field;
it can't all be padding.

If your code needs to access a, it should have available to it a full
definition for the type that actually includes a. If, for any reason, it
should NOT have that full type definition, it should also not be trying
to access a.
 
J

Jens Thoms Toerring

Jack Boot said:
Ben said:
Jack Boot said:
This came up in a code review recently, I've never heard of it before,
opinions welcome.

There is a struct that looks a bit like this

struct foo {
int a[10];
long b;
...
};

Now I have a function that starts thusly:

void
bar ( struct foo * f )
{
int * ip = (int *) foo;
...
}

The reviewer recommended, that for maximal portability I should replace
this by:

int * ip = ( (int *) ( ( (unsigned char *) foo ) + offsetOf(struct foo,
a) ) );

I've never seen this construction before, also I think type punning is
inherently non-portable so I'm not sure how much this buys you.

Nothing but obfuscation. offsetOf(struct foo, a) is 0 (it can't be
otherwise) so this just converts to a char * first. This won't hurt but
it won't help either. This:

int *ip = f->a;

is, to my mind, the clearest way to write it. I can't see why it would
be done with any casts at all.
Thanks for all the replies.
I wrote the code with the cast because I used struct foo as an Opaque Type
in that file (only using pointers, the header with the struct definition
not #include'd)

In that case the code using the offsetof() macro can't compile,
since offsetof() needs to know how the structure looks like.
And it doesn't sound like a very useful kind of "encapsulation"
if you then retrieve something from the innards of the structure.

Regards, Jens
 
P

Phil Carmody

Ben Bacarisse said:
Phil Carmody said:
Ben Bacarisse said:
struct foo {
int a[10];
long b;
...
};
void
bar ( struct foo * f )
{
int *ip = f->a;

I have this fetish for wanting addresses to explicitly be addresses
(the result of applying the 'address of' operator), so my default
would be that which explicitly says 'the address of the first element':

int *ip = &f->a[0];

I'm consistent with this,

How many of these are there in your code:

const char *digits = &"0123456789"[0];
...
strcmp(&"string"[0], str) == 0

A pair of '"'s is another clear indicator, so precisely zero.

Phil
 
J

Joe Pfeiffer

Jack Boot said:
I wrote the code with the cast because I used struct foo as an Opaque Type
in that file (only using pointers, the header with the struct definition
not #include'd) to emulate C++-style encapsulation. Therefore I could not
dereference f to get f->a.

You're making the struct an opaque type, but then taking advantage of
your knowledge of its internal representation. You can't really have it
both ways; you should either expose the internal representation using
f->a or else write an accessor function that returns a (or even an
accessor for the elements of a, or a function that's part of your foo
implementation that does whatever manipulation is going on in bar()).
 
N

Nobody

I wrote the code with the cast because I used struct foo as an Opaque Type
in that file

No you didn't, you relied upon the fact that the first element of the
structure was an array of integers. That *isn't* using it as an opaque
type.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,764
Messages
2,569,564
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top