offsetof() macro

Simon Morgan

Hi,

Can somebody please help me grok the offsetof() macro?

I've found an explanation on
http://www.embedded.com/shared/printableArticle.jhtml?articleID=18312031
but I'm afraid it still doesn't make sense to me.

The sticking point seems to be:

((s *)0) takes the integer zero and casts it as a pointer to s.

To my untrained eye that would basically result in a null pointer. Does
this expression result in some special behaviour or am I missing
something?

Thanks.

Simon
 
Kenneth Brody

Simon said:
Hi,

Can somebody please help me grok the offsetof() macro?

I've found an explanation on
http://www.embedded.com/shared/printableArticle.jhtml?articleID=18312031
but I'm afraid it still doesn't make sense to me.

The sticking point seems to be:

((s *)0) takes the integer zero and casts it as a pointer to s.

To my untrained eye that would basically result in a null pointer. Does
this expression result in some special behaviour or am I missing
something?

The offset of s_ptr->item could be taken as

((char *)&s_ptr->item) - ((char *)s_ptr)

Rather than having to create an actual "s_ptr" to use such a macro, it
can be shortcutted by simply using "((s *)constant)". By making the
"constant" zero, it also eliminates the need for the right-half of the
above equation.

There is nothing wrong with a NULL pointer, as long as you don't
dereference it. Taking its address is fine.
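
A minimal sketch of the subtraction form above, applied to a real object so that no null pointer is involved. The structure `record`, its members, and the helper names are made up for illustration; the result is checked against the standard offsetof() from <stddef.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure, used only for this illustration. */
struct record {
    char   tag;
    long   count;
    double item;
};

/* The expression above: address of the member minus address of the object. */
size_t item_offset_by_subtraction(const struct record *s_ptr)
{
    return (size_t)((const char *)&s_ptr->item - (const char *)s_ptr);
}

/* Same calculation on a real object, compared with the standard macro. */
size_t item_offset_checked(void)
{
    struct record r = { 'x', 1L, 2.0 };
    size_t off = item_offset_by_subtraction(&r);

    assert(off == offsetof(struct record, item));
    return off;
}
```

Replacing the real object's address with address zero is what makes the right-hand operand of the subtraction disappear in the macro form.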


Side question: what happens on platforms where NULL is not all-bits-
zero?

--
+-------------------------+--------------------+-----------------------------+
| Kenneth J. Brody | www.hvcomputer.com | |
| kenbrody/at\spamcop.net | www.fptech.com | #include <std_disclaimer.h> |
+-------------------------+--------------------+-----------------------------+
Don't e-mail me at: <mailto:[email protected]>
 
Irrwahn Grausewitz

Simon Morgan said:
Hi,

Can somebody please help me grok the offsetof() macro?

I'll try.

I've found an explanation on
http://www.embedded.com/shared/printableArticle.jhtml?articleID=18312031
but I'm afraid it still doesn't make sense to me.

The sticking point seems to be:

((s *)0) takes the integer zero and casts it as a pointer to s.

To my untrained eye that would basically result in a null pointer. Does
this expression result in some special behaviour or am I missing
something?

Preamble: note that the definition of the offsetof macro is highly
implementation specific. The one I'll use below is only one possible
solution, which happens to work on a variety of systems, while
miserably failing on others. An implementation may even call an
inbuilt function to calculate the offset of struct members by "black
magic" (read: compiler symbol table).

Ok, let's have a look at it:

#define offsetof(s,m) (size_t)&(((s *)0)->m)

with s being the name of a structure type and m the name of a member
within that structure.

Now, if we want to get a grip on a complex C expression, we simply
read it inside out:

0
    We take the integer constant 0,

(s *)0
    cast to a pointer-to-object-of-type-s, thus pointing
    at address 0; we now _pretend_ that at address 0 an
    actual object of type s resides.

((s *)0)->m
    Now look at the member m of that object, which we do
    not access, but instead

&(((s *)0)->m)
    take the address of. Since the address of the
    structure object is 0, the address of m happens to
    equal its offset in bytes.

(size_t)&(((s *)0)->m)
    Finally, we cast the result to a suitable
    integral data type.

Note, however, that this specific implementation relies upon the fact
that the underlying architecture allows for _meaningful_ conversion
of an integral value into a memory address and vice versa.

HTH
Best regards
 
Eric Laberge

Kenneth said:
The offset of s_ptr->item could be taken as

((char *)&s_ptr->item) - ((char *)s_ptr)

Rather than having to create an actual "s_ptr" to use such a macro, it
can be shortcutted by simply using "((s *)constant)". By making the
"constant" zero, it also eliminates the need for the right-half of the
above equation.

There is nothing wrong with a NULL pointer, as long as you don't
dereference it. Taking its address is fine.


Side question: what happens on platforms where NULL is not all-bits-
zero?

Both "NULL" and "offsetof" are defined in <stddef.h> (7.17 Common definitions
<stddef.h>).

NULL is implementation-defined, but (6.3.2.3 Pointers) an integer constant
with value 0 is the same as NULL.

I suppose if, eg. the implementation decided that the value of NULL is
0x1234, then constant 0, when used as a pointer, would be equal to 0x1234.

If NULL is defined as something other than 0 (incl. non-integer values),
such a construct as:
#define offsetof(TYPE, MEMBER) ((size_t) &((TYPE *)0)->MEMBER)
would not yield the correct value, hence why it is defined in <stddef.h>
along with NULL.

Note that I'm not an expert on the C standard unlike many people here, and I
could be giving totally wrong information.

BTW, I had to retrieve the "offsetof" of an array element, so here's a
complementary macro, only valid with a NULL value of 0, if what I said
above is correct.

#define array_offsetof(BASETYPE,ARRAY,INDEX) ((size_t) &((BASETYPE(*)ARRAY) 0)[0]INDEX)

ex.:
int a[5][10][15];
/* offset of a[1][3] */
int offset = array_offsetof(int, [5][10][15], [1][3]);
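
For comparison, a sketch that gets the same number for the example above without going through a null pointer: once from a real array object and once from the dimensions alone. The function names are hypothetical, and the array is the `int a[5][10][15]` from the example.

```c
#include <assert.h>
#include <stddef.h>

/* Offset of a[1][3] measured on a real array object. */
size_t sub_array_offset_via_object(void)
{
    static int a[5][10][15];

    return (size_t)((char *)&a[1][3] - (char *)a);
}

/* The same offset from the dimensions: skip 1 block of 10*15 ints,
   then 3 rows of 15 ints. */
size_t sub_array_offset_via_arithmetic(void)
{
    return (1 * 10 * 15 + 3 * 15) * sizeof(int);
}

/* Both routes must agree, since arrays contain no padding between elements. */
void check_sub_array_offset(void)
{
    assert(sub_array_offset_via_object() == sub_array_offset_via_arithmetic());
}
```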
 
DevarajA

Eric Laberge wrote:
Kenneth Brody wrote:




Both "NULL" and "offsetof" are defined in <stddef.h> (7.17 Common definitions
<stddef.h>).

NULL is implementation-defined, but (6.3.2.3 Pointers) an integer constant
with value 0 is the same as NULL.

I suppose if, eg. the implementation decided that the value of NULL is
0x1234, then constant 0, when used as a pointer, would be equal to 0x1234.

Are you sure? I've just compiled (gcc) and executed the code below, and
it prints '0x1234 (nil)'

#include <stdio.h>
#include <stdlib.h>
#undef NULL
#define NULL (void*)0x1234

int main()
{
int *pt1=NULL;
int *pt2=(int*)0;

printf("%p %p\n",pt1,pt2);

return 0;
}
 
Irrwahn Grausewitz

DevarajA said:
Eric Laberge wrote:

Are you sure? I've just compiled (gcc) and executed the code below, and
it prints '0x1234 (nil)'

#include <stdio.h>
#include <stdlib.h>
#undef NULL
#define NULL (void*)0x1234

Undefined behaviour, anything can happen.

Changing the definition of the NULL macro to one that yields an
arbitrarily chosen value does not change the general behaviour
of the implementation itself.
However, nothing and nobody stops you from writing an implementation
which uses 0x1234 as the numerical value for each and any null
pointer. You just have to make sure it's different from any valid
object pointer (unlike above).

int main()

Better: int main(void)
{
int *pt1=NULL;
int *pt2=(int*)0;

printf("%p %p\n",pt1,pt2);

The %p format specifier expects an argument of type pointer-to-void.

return 0;
}

Regards.
 
Flash Gordon

DevarajA said:
Eric Laberge wrote:

No, a constant expression in the *source* code used in a pointer context
is a null pointer constant. However, that has no bearing on how a null
pointer is represented in the program when it is run.

No, NULL has to be an integer constant expression evaluating to 0 or
such an expression cast to void*. However, if a null pointer is not all
bits 0 then the compiler will have to change it to whatever
representation is used for a null pointer when it compiles the code.

Equally, a float or double value of 0 does not have to be represented by
all bits 0.

Are you sure? I've just compiled (gcc) and executed the code below, and
it prints '0x1234 (nil)'

#include <stdio.h>
#include <stdlib.h>
#undef NULL
#define NULL (void*)0x1234

You are not the implementation, so you have not changed the
representation of a null pointer constant.

int main()
{
int *pt1=NULL;
int *pt2=(int*)0;

printf("%p %p\n",pt1,pt2);

%p is for void* pointers not other pointer types. They can be
represented differently.

return 0;
}

The reason the offsetof macro can make use of a null pointer in strange
ways is that it is part of the *implementation* and the implementer is
allowed to use whatever magic s/he wants as long as it produces the
correct result. User code can't portably use such tricks.
 
DevarajA

Flash Gordon wrote:
No, a constant expression in the *source* code used in a pointer context
is a null pointer constant. However, that has no bearing on how a null
pointer is represented in the program when it is run.



No, NULL has to be an integer constant expression evaluating to 0 or
such an expression cast to void*. However, if a null pointer is not all
bits 0 then the compiler will have to change it to whatever
representation is used for a null pointer when it compiles the code.

So, tell me if I have understood... null pointer constant needs not to
have an all-0 internal representation (it is implementation-defined),
but in the source code it must be represented as 0. And NULL macro just
expands into the null pointer constant, but I may also redefine it. Am I
right?
 
Jack Klein

Hi,

Can somebody please help me grok the offsetof() macro?

I've found an explanation on
http://www.embedded.com/shared/printableArticle.jhtml?articleID=18312031
but I'm afraid it still doesn't make sense to me.

The sticking point seems to be:

((s *)0) takes the integer zero and casts it as a pointer to s.

To my untrained eye that would basically result in a null pointer. Does
this expression result in some special behaviour or am I missing
something?

Thanks.

Simon

The offsetof() macro is "special", as are many other things in the
standard C library and implementation. Just for example, the fopen()
function cannot itself be written in standard, portable C. It must
either interface with a lower-level API function provided by the
platform's operating system, or it must contain non-standard, hardware
specific code of its own to access a device that stores files.

Let's start with an example:

struct xyz { int x; int y; int z; };


The point is that once the compiler has processed the definition of a
structure type, it knows the offset of each of the members of the
structure. Once a human being (describes most programmers) has read
the structure definition, they know the offset of the first member,
'x', because it is at offset 0. They can guess or assume the offsets
of 'y' and 'z', but they can't really know.
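
A small sketch of why the later offsets can only be guessed: with members of different types, the compiler may insert padding, so only a few relationships are guaranteed. The structure `mixed` and the function names are invented for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical structure whose layout depends on the implementation. */
struct mixed { char c; double d; int i; };

size_t offset_of_c(void) { return offsetof(struct mixed, c); }
size_t offset_of_d(void) { return offsetof(struct mixed, d); }
size_t offset_of_i(void) { return offsetof(struct mixed, i); }

/* What every conforming compiler must satisfy; the exact numbers vary. */
void check_mixed_layout(void)
{
    assert(offset_of_c() == 0);                              /* first member */
    assert(offset_of_d() >= sizeof(char));                   /* padding allowed */
    assert(offset_of_i() >= offset_of_d() + sizeof(double)); /* declared order */
    assert(sizeof(struct mixed) >= offset_of_i() + sizeof(int));
}
```

On many desktop compilers offset_of_d() comes out larger than 1 because of alignment padding, but that is a property of the implementation, not of the language.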

There are times when it is quite useful to know the offset of a member
of a structure from the beginning of the structure. And the compiler
already has this information. So what we need is a way to write
source code that allows the compiler to provide this information to a
program when the program needs it.

Now, given our example definition, above, let's see how we can get the
offset of 'z' in a C program without using the offsetof() macro:

#include <stdio.h>
#include <stddef.h>

struct xyz { int x; int y; int z; };

int main(void)
{
    struct xyz x_y_z = { 0 };
    struct xyz *xyz_ptr = &x_y_z;
    int *ip = &x_y_z.z;
    ptrdiff_t diff = (char *)ip - (char *)xyz_ptr;
    printf("%d\n", (int)diff);
    return 0;
}

This is all perfectly valid, legal C code. You can cast any valid
pointer to any type of object to a pointer to char, and it will point
to the same address in memory as the lowest addressed byte of the
object. You can subtract two pointers of the same type, as long as
they both point within the same object. The result of the valid
subtraction between two pointers of the same type is the signed
integer type ptrdiff_t, defined in <stddef.h>. Note that the
offsetof() macro is defined as yielding a type size_t, not ptrdiff_t,
but that's a simple cast.

But to do this, we used an actual struct xyz object. What if we want
to do this without having one handy?

So let's try another version of the program which works with a pointer
to a struct xyz:

#include <stdio.h>
#include <stddef.h>

struct xyz { int x; int y; int z; };

size_t my_offsetof(struct xyz *xyz_ptr)
{
    size_t result =
        (size_t)((char *)&xyz_ptr->z - (char *)xyz_ptr);
    return result;
}

int main(void)
{
    struct xyz x_y_z = { 0 };
    printf("%d\n", (int)my_offsetof(&x_y_z));
    return 0;
}

This produces the same result, which happens to be 8 on the particular
compiler that I am using. But note that even though the function
my_offsetof() does not define and create a struct xyz object, the
caller must have one handy to provide a pointer to the function.

If we want to do this without having a struct xyz object anywhere, we
could be tempted to change this line in main() from:

printf("%d\n", (int)my_offsetof(&x_y_z));

...to:

printf("%d\n", (int)my_offsetof(0));

...and on many compilers this will work just fine, and perhaps on some
few it will crash the program or produce unpredictable results. That
is because passing 0 or the macro NULL causes my_offsetof() to be
called with a null pointer. And the highlighted subexpression:

(size_t)((char *)&xyz_ptr->z - (char *)xyz_ptr);
                  ^^^^^^^^^^

...dereferences that null pointer, although in actuality the compiler
does not need to dereference the pointer, since the value of
xyz_ptr->z is not used.

So on implementations where the compiler writer knows that his
compiler will not generate code to dereference the pointer, he can
define offsetof(s,m) like this:

#define offsetof(s, m) ((size_t)&((s *)0)->m)

...or something similar.

This code is not legal for you to write, but the implementer is not
constrained by the rules that apply to legal programs. The
implementer is allowed to bend or break the rules in standard library
functions and macros, so long as they deliver the proper results.

On the other hand, there are some compilers that do it differently. I
have seen definitions that look like this:

#define offsetof(s, m) __builtin_offset__(s, m)

...that cause the compiler to look up the results in its symbol table
directly without going through the clumsy and technically undefined
operation on a null pointer.

The point is that there is a well-defined and perfectly legal sequence
that works properly on all compilers to get this information if there
is an object of the structure type available, but there is no legal,
defined method to get it without such an object. If there were, there
would have been no need for the language standard to require that the
implementation provides the macro.

So the point is, use the macro instead of trying to do the
calculations yourself. Even if you have an object of the type around,
the expression:

offsetof(struct xyz, z)

...is much more readable in your source code than:

(char *)&x_y_z.z - (char *)&x_y_z

...and the name of the macro makes the reason for its use
self-documenting.
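
As an illustration of one of those times when the offset is useful, here is a sketch that reads a member through nothing but a generic pointer and an offset supplied by offsetof(). The helper names `read_int_at` and `demo_read_z` and the sample values are made up; `struct xyz` is the one from the example above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct xyz { int x; int y; int z; };

/* Copy an int out of any object, given the member's byte offset. */
int read_int_at(const void *object, size_t offset)
{
    int value;

    memcpy(&value, (const char *)object + offset, sizeof value);
    return value;
}

/* Fetch member z of a sample object using only its offset. */
int demo_read_z(void)
{
    struct xyz v = { 10, 20, 30 };

    assert(offsetof(struct xyz, x) == 0);  /* first member is at offset 0 */
    return read_int_at(&v, offsetof(struct xyz, z));
}
```

Code like read_int_at() never needs to know the structure type, which is why tables of offsetof() values show up in generic serialisation or configuration code.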
 
Walter Roberson

So, tell me if I have understood... null pointer constant needs not to
have an all-0 internal representation (it is implementation-defined),

Right.

but in the source code it must be represented as 0.

The standard does not preclude other source-code representations, but 0
is the only portable one.

And NULL macro just
expands into the null pointer constant, but I may also redefine it. Am I
right?

It falls into the class of being reserved if any of the standard headers
are included.
 
Keith Thompson

DevarajA said:
Flash Gordon ha scritto:

So, tell me if I have understood... null pointer constant needs not to
have an all-0 internal representation (it is implementation-defined),
but in the source code it must be represented as 0. And NULL macro
just expands into the null pointer constant, but I may also redefine
it. Am I right?

Close.

A "null pointer" and a "null pointer constant" are two different
things. A null pointer constant is a construct that can appear in C
source code. A null pointer is a pointer value that can occur during
execution time.

An integer constant 0 is a valid null pointer constant. If you assign
a null pointer constant to a pointer object, that object's value at
execution time will be a null pointer. The execution-time
representation of a null pointer may or may not be all-bits-zero. (If
a null pointer has a representation other than all-bits-zero, it's up
to the compiler to do whatever conversion might be necessary.)

The macro NULL expands to a null pointer constant. You cannot legally
redefine NULL yourself. Your compiler might let you get away with it,
but there's absolutely no reason to do so.
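
A short sketch of the distinction, with hypothetical function names. The first function checks the part the language guarantees (every spelling of the null pointer constant yields a null pointer); the second merely reports how this particular implementation represents a null pointer, and either answer is conforming.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 1 if 0, NULL and (void *)0 all produce pointers that compare equal. */
int null_spellings_agree(void)
{
    int *a = 0;
    int *b = NULL;
    int *c = (void *)0;

    return a == b && b == c && !a;
}

/* 1 if this implementation happens to store a null int * as
   all-bits-zero; 0 otherwise. Neither result is an error. */
int null_is_all_bits_zero(void)
{
    int *p = 0;
    unsigned char zeros[sizeof p] = { 0 };

    return memcmp(&p, zeros, sizeof p) == 0;
}

/* Only the first property is guaranteed by the language. */
void check_null_pointers(void)
{
    assert(null_spellings_agree());
}
```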
 
Keith Thompson

The standard does not preclude other source-code representations, but 0
is the only portable one.

(See my other followup for a discussion of the distinction between
"null pointer constants" and "null pointers".)

The standard defines a "null pointer constant" as

An integer constant expression with the value 0, or such an
expression cast to type void *.

So the following are all valid null pointer constants under any
possible conforming implementation:

0
(void*)0
(1-1)
(void*)('-'-'-')
(void*)('/'/'/'-'/'/'/')

All but the first two are considered silly, but they're no less valid
for that.
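
The list can be tried out with a sketch like the following, which uses each of the five expressions as an initializer and counts how many of the resulting pointers are null. The function name is made up, and some compilers may warn about the sillier spellings even though they are valid.

```c
#include <assert.h>
#include <stddef.h>

/* Counts how many of the five spellings above yield a null pointer. */
int count_null_results(void)
{
    int *p[] = {
        0,
        (void*)0,
        (1-1),
        (void*)('-'-'-'),
        (void*)('/'/'/'-'/'/'/')
    };
    int count = 0;
    size_t i;

    for (i = 0; i < sizeof p / sizeof p[0]; i++) {
        if (p[i] == NULL)
            count++;
    }
    assert(count == 5);  /* every entry must be a null pointer */
    return count;
}
```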
 
DevarajA

Keith Thompson wrote:
Close.

A "null pointer" and a "null pointer constant" are two different
things. A null pointer constant is a construct that can appear in C
source code. A null pointer is a pointer value that can occur during
execution time.

An integer constant 0 is a valid null pointer constant. If you assign
a null pointer constant to a pointer object, that object's value at
execution time will be a null pointer. The execution-time
representation of a null pointer may or may not be all-bits-zero. (If
a null pointer has a representation other than all-bits-zero, it's up
to the compiler to do whatever conversion might be necessary.)

The macro NULL expands to a null pointer constant. You cannot legally
redefine NULL yourself. Your compiler might let you get away with it,
but there's absolutely no reason to do so.

Then using NULL is the same as using 0? So why in C many people use
NULL? Is that to avoid portability problems (in the case of a weird
non-standard implementation that has the null ptr const !=0)?
 
Keith Thompson

DevarajA said:
Keith Thompson wrote: [...]
A "null pointer" and a "null pointer constant" are two different
things. A null pointer constant is a construct that can appear in
C source code. A null pointer is a pointer value that can occur
during execution time. An integer constant 0 is a valid null
pointer constant. If you assign a null pointer constant to a
pointer object, that object's value at execution time will be a
null pointer. The execution-time representation of a null pointer
may or may not be all-bits-zero. (If a null pointer has a
representation other than all-bits-zero, it's up to the compiler to
do whatever conversion might be necessary.) The macro NULL expands
to a null pointer constant. You cannot legally redefine NULL
yourself. Your compiler might let you get away with it, but
there's absolutely no reason to do so.

Then using NULL is the same as using 0? So why in C many people use
NULL? Is that to avoid portability problems (in the case of a weird
non-standard implementation that has the null ptr const !=0)?

The advantage of using NULL is documentation. It makes it obvious
that what's intended is a pointer value, not an integer value.

For example:

a = 0;
b = 0.0;
c = '\0';
d = NULL;

You can tell even without looking at the declarations (assuming a sane
programmer) that a is an integer, b is a floating-point variable, c is
a character, and d is a pointer.

There are those who advocate using 0 rather than the NULL macro as a
null pointer constant. They are, of course, wrong, since they
disagree with me.

(If I were designing the language from scratch, there would be a
keyword, either "nil" or "null", that would be the *only* null pointer
constant. Using 0 in a pointer context would be a constraint error.
The current confusing situation is, I think, a remnant of the early
days of C when it was assumed that a null pointer *was* all-bits-zero,
and even non-zero integer constants were commonly used as pointers.)
 
DevarajA

There are those who advocate using 0 rather than the NULL macro as a
null pointer constant. They are, of course, wrong, since they
disagree with me.

(If I were designing the language from scratch, there would be a
keyword, either "nil" or "null", that would be the *only* null pointer
constant. Using 0 in a pointer context would be a constraint error.

What do you mean by "using 0 in a pointer context"? In another post you
have said "An integer constant 0 is a valid null pointer constant. If
you assign a null pointer constant to a pointer object, that object's
value at execution time will be a null pointer."+"The macro NULL expands
to a null pointer constant." And now you say that using 0 is a
constraint error. I think I haven't understood yet :-(

And on every compiler I've tried, int *pt=0; works without errors or
warnings. And also the standard says that a constant expression
evaluating to 0 or such an expression cast to void*, assigned to a
pointer of whatever type makes it a null pointer.

Sorry if I'm wasting your time, but I'm a bit confused...
 
Irrwahn Grausewitz

[DevarajA: please preserve attribution lines, to make clear who wrote
what. Attribution restored.]

:)
What do you mean by "using 0 in a pointer context"? In another post you
have said "An integer constant 0 is a valid null pointer constant. If
you assign a null pointer constant to a pointer object, that object's
value at execution time will be a null pointer."+"The macro NULL expands
to a null pointer constant." And now you say that using 0 is a
constraint error. I think I haven't understood yet :-(

Please read again: Keith _would_make_ using 0 in a pointer context a
constraint violation, _if_he_were_re-designing_the_language_C_.

C-as-we-know-it: int *p = 0; /* correct, but bad style */
Keith's revamped C: int *p = 0; /* constraint violation */

(FWIW, I'd probably do the same.)

<snip>

Best regards
 
DevarajA

Irrwahn Grausewitz wrote:
[DevarajA: please preserve attribution lines, to make clear who wrote
what. Attribution restored.]

Keith Thompson wrote:


:)



What do you mean by "using 0 in a pointer context"? In another post you
have said "An integer constant 0 is a valid null pointer constant. If
you assign a null pointer constant to a pointer object, that object's
value at execution time will be a null pointer."+"The macro NULL expands
to a null pointer constant." And now you say that using 0 is a
constraint error. I think I haven't understood yet :-(


Please read again: Keith _would_make_ using 0 in a pointer context a
constraint violation, _if_he_were_re-designing_the_language_C_.

Ok, got it :) thank you all.

C-as-we-know-it: int *p = 0; /* correct, but bad style */
Keith's revamped C: int *p = 0; /* constraint violation */

Another thing.. looking into stddef.h I've seen that for C, NULL is
defined as ((void*)0). The standard specifies that plain 0 is already a
null pointer constant. So why that cast? Maybe this is the last stupid
question for today :)
 
Keith Thompson

DevarajA said:
What do you mean by "using 0 in a pointer context"? In another post
you have said "An integer constant 0 is a valid null pointer
constant. If you assign a null pointer constant to a pointer object,
that object's value at execution time will be a null pointer."+"The
macro NULL expands to a null pointer constant." And now you say that
using 0 is a constraint error. I think I haven't understood yet :-(

Look again. I said that if I were designing the language from
scratch, using 0 in a pointer context would be a constraint error (I
meant "constraint violation"). In C as it actually exists, 0 is a
valid null pointer constant, so it *can* be used in a pointer context.

For example:

int *p;
p = 0;

This is perfectly legal C. As a matter of style and clarity, I
prefer:

int *p;
p = NULL;

(In my hypothetical, non-existent C-like language, there would be a
keyword "nil" that would be the only legal null pointer constant.
"p = 0;" would be illegal, and there would be no need for the NULL
macro. I occasionally rant about how I'd redesign C if I had the
chance; don't take it too seriously.)
 
Irrwahn Grausewitz

DevarajA said:
Another thing.. looking into stddef.h I've seen that for C, NULL is
defined as ((void*)0).

Depends on the implementation; yours happens to use the above
definition, another might define NULL as (0) or even
((void *)('9'-'4'-5)).

The standard specifies that plain 0 is already a
null pointer constant. So why that cast?

I'm just guessing now; maybe the implementer thought: while we all
know that it's perfectly legitimate to implicitly convert a constant
integer expression with value 0 to any type of pointer, yielding a
null pointer of that type, I'll take a step further and _explicitly_
spell it out as ((void *)0) in the definition of the NULL macro.

Like saying: "Yes, I _really_ _really_ mean null pointer constant."

For a definite answer you'd have to ask the guy himself. ;o)

Maybe this is the last stupid
question for today :)

There are no stupid questions, only stupid answers. :)

Best regards
 
