is this portable ?

junky_fellow

Hi,

Can I portably find the offset of a member of a structure in the
following way:

#define offsetof(type,member) (char *)(&((struct type*)0)->member) - (char *)0

Is the NULL pointer dereferenced in the above macro?
 
Guest

junky_fellow said:
Hi,

Can I portably find the offset of a member of a structure in the
following way:

#define offsetof(type,member) (char *)(&((struct type*)0)->member) - (char *)0

Is the NULL pointer dereferenced in the above macro?

http://c-faq.com/struct/offsetof.html

In practice, compilers typically generate no access through the NULL
pointer, but as far as the standard is concerned it is dereferenced,
so the macro is not valid.
 
junky_fellow

Harald said:
http://c-faq.com/struct/offsetof.html

The NULL pointer is typically not dereferenced in practice, but since
it is in theory, it's not valid.

Thanks for the pointer. Still, I don't understand: if the NULL pointer
is not dereferenced, why is it not valid?

Secondly, in the macro (from the FAQ list)
#define offsetof(type, f) ((size_t) \
((char *)&((type *)0)->f - (char *)(type *)0))
                                   ^^^^^^^^
why do we first cast the NULL pointer to (type *) and then to (char *)?
Can't we simply do
#define offsetof(type, f) ((size_t) \
((char *)&((type *)0)->f - (char *)0))

Thirdly, on which implementations would the compiler refuse to accept
it?
 
Friedrich Dominicus

junky_fellow said:
Hi,

Can I portably find the offset of a member of a structure in the
following way:

#define offsetof(type,member) (char *)(&((struct type*)0)->member) - (char *)0

How about including <stddef.h> and using the offsetof defined there?

Regards
Friedrich
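
A minimal sketch of that suggestion, assuming a hosted C implementation. The struct record type and the function names are hypothetical and exist only for illustration; the only facts relied on are that the first member sits at offset 0 and that later members have increasing offsets.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical example type, not taken from the thread. */
struct record {
    char   tag;
    int    count;
    double value;
};

/* Byte offset of `count` within struct record. */
size_t record_count_offset(void)
{
    return offsetof(struct record, count);
}

/* Byte offset of `value` within struct record. */
size_t record_value_offset(void)
{
    return offsetof(struct record, value);
}

/* Checks only what the language guarantees about layout. */
void check_record_layout(void)
{
    assert(offsetof(struct record, tag) == 0);
    assert(record_count_offset() >= sizeof(char));
    assert(record_value_offset() >= record_count_offset() + sizeof(int));
    assert(record_value_offset() + sizeof(double) <= sizeof(struct record));
}
```

The exact offsets depend on the implementation's padding, so the checks compare against lower bounds rather than fixed numbers.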
 
Guest

junky_fellow said:
Thanks for the pointer. Still, I don't understand: if the NULL pointer
is not dereferenced, why is it not valid?

Because as I said, it /is/ dereferenced as far as the standard is
concerned. Most implementations don't actually emit code for it, but
that only means it doesn't cause a lot of problems in practice.

junky_fellow said:
Secondly, in the macro (from the FAQ list)
#define offsetof(type, f) ((size_t) \
((char *)&((type *)0)->f - (char *)(type *)0))
                                   ^^^^^^^^
why do we first cast the NULL pointer to (type *) and then to (char *)?
Can't we simply do
#define offsetof(type, f) ((size_t) \
((char *)&((type *)0)->f - (char *)0))

I suspect it is to avoid problems where an implementation has multiple
null pointer representations. For example, if a pointer consists of a
segment and an offset, and the implementation treats any pointer with a
segment of 0 as a null pointer, the offset used for (char *)(type *)0
and (char *)0 could be different, and so, without a double cast, the
subtraction could give a different result. This is merely speculation,
though.

junky_fellow said:
Thirdly, on which implementations would the compiler refuse to accept
it?

The TenDRA compiler does not treat the given definition of offsetof as
an integer constant expression, and as a result does not allow it in
(for example) array declarations. This is a different possible problem
than mentioned in the FAQ, and I do not know of a system where the
other problem exists.
 
christian.bau

junky_fellow said:
Hi,

Can I portably find the offset of a member of a structure in the
following way:

#define offsetof(type,member) (char *)(&((struct type*)0)->member) - (char *)0

Is the NULL pointer dereferenced in the above macro?

Of course not.

For starters, trying to redefine the standard offsetof macro will
usually produce an error.

Second, it is incompatible with the standard offsetof macro: it will
not work with typedefs, because for some reason you added "struct"
where you shouldn't have.

Third, you messed up your parentheses. Try

struct soandso { int x; int y; };
char* p = "Hello, world plus some more characters to be safe";

p + offsetof (soandso, y);

This will not compile because you forgot parentheses around the
offsetof expression.

Now if you got all of these things right, it still wouldn't work
because it invokes undefined behavior. Just trying to calculate the
address of ((struct soandso *) NULL) -> y is undefined behavior.
Calculating a pointer difference with one pointer being NULL is
undefined behavior. Since you expect the compiler to turn all of this
into a compile-time constant, and at the same time you confront it with
undefined behavior, what do you expect?
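
For contrast, here is a sketch of the same "pointer plus offset" expression written with the standard macro, which is required to expand to a single expression that composes safely with +. The helper names are hypothetical. Note that in C the type must be spelled struct soandso unless a typedef exists.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof */
#include <string.h>  /* memcpy */

struct soandso { int x; int y; };

/* Reads member y through a byte pointer plus the standard offsetof.
   memcpy is used so the read does not depend on pointer-cast rules. */
int read_y_via_offset(const struct soandso *obj)
{
    const char *base = (const char *)obj;
    int y;
    memcpy(&y, base + offsetof(struct soandso, y), sizeof y);
    return y;
}

/* Small self-check: the value read via the offset matches the member. */
void check_read_y(void)
{
    struct soandso s = { 1, 42 };
    assert(read_y_via_offset(&s) == s.y);
}
```

Here base points into a real object, so adding the offset stays inside that object and the arithmetic is well defined.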
 
junky_fellow

Harald said:
Because as I said, it /is/ dereferenced as far as the standard is
concerned. Most implementations don't actually emit code for it, but
that only means it doesn't cause a lot of problems in practice.

But when we are not reading from or writing to that address, why do we
need to dereference it at all? I don't know much about what the C
standard says about that. I have tried reading it many times, but the
language used by the standard is too difficult for a beginner to
understand (although I have been learning C for the past 3 years, I am
still no better than a newbie). Coming back to the question, it's very
hard for me to find any reason why the C standard imposes this
restriction that the pointer may be dereferenced while finding the
address of some member. I am not fetching or storing anything at that
address. Shouldn't this be simple pointer arithmetic?

Also, I believe that only the compiler can tell the offset of a member
in a structure. So why not add one more operator, "offsetof", just as
we have the "sizeof" operator?
 
Guest

junky_fellow said:
But that's a macro provided by the library. I was talking about using
an operator for it, one that would give the offset at compile time
rather than calculating it at run time.

The result of the standard offsetof() macro is an integer constant
expression and can be used anywhere any other integer constant
expression can be used. For this to be possible, a compiler must be
able to evaluate offsetof() at compile time. The same cannot be said
for your custom definition of offsetof(): I already gave an example of
a compiler that does /not/ treat it as such.
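
A short sketch of what "integer constant expression" buys you, using the standard macro in two places where a run-time value would be rejected: a file-scope array size and an enumeration constant. The struct header type and all names here are hypothetical.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical example type, not taken from the thread. */
struct header {
    char  kind;
    short length;
    long  payload;
};

/* A file-scope array size must be an integer constant expression. */
static unsigned char prefix_bytes[offsetof(struct header, payload)];

/* So must the value of an enumeration constant. */
enum { PAYLOAD_OFFSET = offsetof(struct header, payload) };

/* Number of bytes that precede `payload` in struct header. */
size_t prefix_size(void)
{
    return sizeof prefix_bytes;
}

/* Both uses were evaluated at compile time and agree with each other. */
void check_prefix(void)
{
    assert(prefix_size() == offsetof(struct header, payload));
    assert((size_t)PAYLOAD_OFFSET == prefix_size());
}
```

A hand-written null-pointer macro is not guaranteed to be accepted in either position, which is the TenDRA problem described earlier in the thread.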
 
Keith Thompson

junky_fellow said:
But when we are not reading from or writing to that address, why do we
need to dereference it at all? I don't know much about what the C
standard says about that. I have tried reading it many times, but the
language used by the standard is too difficult for a beginner to
understand (although I have been learning C for the past 3 years, I am
still no better than a newbie). Coming back to the question, it's very
hard for me to find any reason why the C standard imposes this
restriction that the pointer may be dereferenced while finding the
address of some member. I am not fetching or storing anything at that
address. Shouldn't this be simple pointer arithmetic?

Let's look at a typical (I think) definition of offsetof:

#define offsetof(s, m) (size_t)(&(((s *)0)->m))

and consider the sequence of operations that this specifies.

0                          Zero
(s *)0                     A null pointer of type pointer-to-s
((s *)0)->m                The value of the member "m" of the object
                           of type "s" obtained by dereferencing the
                           null pointer
&(((s *)0)->m)             No, we just need the address, not the value.
(size_t)(&(((s *)0)->m))   Convert the address to size_t

Now any reasonable compiler is very likely to optimize out the
evaluation of the value of the (nonexistent) structure member m. The
point is that the standard doesn't bother to *mandate* this particular
optimization. Similarly, it doesn't require a compiler to optimize
away the evaluation of x in (x * 0).
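
As an aside, the same address arithmetic is well defined when a real object exists, because both pointers then point into that object. A minimal sketch, with a hypothetical struct pair and function names; the result is computed at run time and is not a constant expression, so it is not a replacement for offsetof.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* Hypothetical example type, not taken from the thread. */
struct pair { char c; double d; };

/* Offset of d computed from an actual object. Only addresses are taken;
   the (uninitialized) member values are never read. */
size_t offset_of_d_from_object(void)
{
    struct pair obj;
    return (size_t)((char *)&obj.d - (char *)&obj);
}

/* The run-time computation agrees with the standard macro. */
void check_offsets_agree(void)
{
    assert(offset_of_d_from_object() == offsetof(struct pair, d));
}
```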

And the fact that the standard doesn't mandate this particular
optimization (that it allows the value of the member to be evaluated
and discarded) doesn't cause many problems in practice. The only good
reason I can think of to write code like the above is to implement
something like offsetof -- but you don't need to, because it's already
implemented for you. (As I'm sure you know, the author of the
<stddef.h> header is allowed to make whatever implementation-specific
assumptions he likes, as long as everything works as specified for
that implementation.)

junky_fellow said:
Also, I believe that only the compiler can tell the offset of a member
in a structure. So why not add one more operator, "offsetof", just as
we have the "sizeof" operator?

It's probably just for historical reasons. My guess is that, many
years ago, somebody decided that it would be useful to determine the
offset of a structure member, and came up with something like the
above macro. This must have been before C was standardized, and the
author freely made assumptions about the behavior of the compiler. It
wasn't necessary to add a new keyword to the language. When the
language was standardized, it was felt that the simplest thing to do
was to bless the already existing offsetof() macro -- but since it
can't be implemented portably (given the language rules of the new
standard), the committee only required that it be possible to
implement it in *some* way.

If the usual definition of offsetof() doesn't work for a given
compiler, for whatever reason, the implementers can always define and
use some compiler magic:

#define offsetof(s, m) __magic_offsetof(s, m)

offsetof() works correctly in all C implementations, and there's no
real need for any programmer to worry about just *how* it works.
There isn't really a problem to be fixed.
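
One real instance of such compiler magic is __builtin_offsetof, which GCC and Clang provide. The sketch below wraps it in a hypothetical MY_OFFSETOF macro purely to show the shape of such a definition; ordinary code should keep using offsetof from <stddef.h>, and the fallback branch does exactly that.

```c
#include <assert.h>
#include <stddef.h>  /* offsetof, size_t */

/* MY_OFFSETOF is a made-up name for illustration only. */
#if defined(__GNUC__) || defined(__clang__)
#  define MY_OFFSETOF(type, member) __builtin_offsetof(type, member)
#else
#  define MY_OFFSETOF(type, member) offsetof(type, member)
#endif

/* Hypothetical example type, not taken from the thread. */
struct point3 { float x, y, z; };

/* Byte offset of z, obtained through the wrapper macro. */
size_t z_offset(void)
{
    return MY_OFFSETOF(struct point3, z);
}

/* The wrapper and the standard macro give the same answer. */
void check_builtin_matches(void)
{
    assert(z_offset() == offsetof(struct point3, z));
}
```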
 
Guest

junky_fellow said:
But when we are not reading from or writing to that address, why do we
need to dereference it at all?

Dereferencing a pointer means the unary * operator, or the -> operator,
is evaluated with that pointer as its operand. You applied the ->
operator to a null pointer. The expression, including the -> operator,
is evaluated. That means you dereferenced a null pointer. There could
be a special exception that says the -> operator is not evaluated in
this case, but there simply isn't. As for why you need it, you don't,
if you simply use the standard offsetof() macro instead of writing your
own.
 
Ben Pfaff

Keith Thompson said:
My guess is that, many years ago, somebody decided that it
would be useful to determine the offset of a structure member,
and came up with something like the above macro. This must
have been before C was standardized, and the author freely made
assumptions about the behavior of the compiler.

The text of the C99 rationale implies that offsetof was an
invention of the standards committee.
 
Yevgen Muntyan

Keith said:
Let's look at a typical (I think) definition of offsetof:

#define offsetof(s, m) (size_t)(&(((s *)0)->m))

and consider the sequence of operations that this specifies.

0                Zero
(s *)0           A null pointer of type pointer-to-s
((s *)0)->m      The value of the member "m" of the object
                 of type "s" obtained by dereferencing the
                 null pointer
&(((s *)0)->m)   No, we just need the address, not the value.

Is this quite correct? Consider the following:

char **ptr;
char *p, *s;
ptr = &s; /* fine */
p = s; /* undefined behaviour */

I.e. in general "&s" is not a shortcut for "get the value; forget it,
just get me the address". Maybe parentheses add to it, so "&(s)" is not
the same as "&s"?

Best regards,
Yevgen
 
Guest

Yevgen said:
Is this quite correct? Consider the following:

char **ptr;
char *p, *s;
ptr = &s; /* fine */
p = s; /* undefined behaviour */

I.e. in general "&s" is not a shortcut for "get the value; forget it,
just get me the address". Maybe parentheses add to it, so "&(s)" is not
the same as "&s"?

In your example, the behaviour for the fourth line is only undefined
because s may hold a trap representation, not because s is stored at an
invalid address. And with parentheses added, &(s) is still allowed.
 
Simon Biber

Harald said:
Dereferencing a pointer means the unary * operator, or the -> operator,
is evaluated with that pointer as its operand. You applied the ->
operator to a null pointer. The expression, including the -> operator,
is evaluated. That means you dereferenced a null pointer. There could
be a special exception that says the -> operator is not evaluated in
this case, but there simply isn't. As for why you need it, you don't,
if you simply use the standard offsetof() macro instead of writing your
own.

By the same token you could say

char arr[12];
&arr[12];

is also doing an illegal dereference of a pointer.

(arr + 12)     valid, points one past the end
*(arr + 12)    invalid, dereferences one past the end
&*(arr + 12)   valid?, calculates address of one past the end

In C99 6.5.3.2 the standard says that & and * cancel each other out, and
& and [] cancel into a simple +, and no dereference is actually
evaluated. They could easily have said the same for the & and ->
operator sequence, but for some reason left it out.
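
A minimal sketch of the array case under the C99 rule: &arr[12] is treated as arr + 12, a valid one-past-the-end pointer that may be compared against but never dereferenced. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>  /* size_t */

/* Walks the array using a one-past-the-end pointer as the stop marker. */
size_t count_elements(void)
{
    char arr[12] = {0};
    char *end = &arr[12];  /* C99: same as arr + 12, nothing is dereferenced */
    size_t n = 0;

    for (char *p = arr; p != end; ++p)
        ++n;               /* p is only ever dereferenceable while p != end */
    return n;
}

/* The loop visits every element exactly once. */
void check_count(void)
{
    assert(count_elements() == 12);
}
```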
 
Keith Thompson

Yevgen Muntyan said:
Keith Thompson wrote: [...]
Let's look at a typical (I think) definition of offsetof:
#define offsetof(s, m) (size_t)(&(((s *)0)->m))
and consider the sequence of operations that this specifies.
0                Zero
(s *)0           A null pointer of type pointer-to-s
((s *)0)->m      The value of the member "m" of the object
                 of type "s" obtained by dereferencing the
                 null pointer
&(((s *)0)->m)   No, we just need the address, not the value.

Is this quite correct? Consider the following:

char **ptr;
char *p, *s;
ptr = &s; /* fine */
p = s; /* undefined behaviour */

I.e. in general "&s" is not a shortcut for "get the value; forget it,
just get me the address".

You're right; good catch.

The unary "&" operator causes its operand to be evaluated *as an
lvalue*. The value of m is not evaluated, but its address must be
determined. In this case, just determining the address invokes
undefined behavior, because there isn't really an object at that
address.

Yevgen Muntyan said:
Maybe parentheses add to it, so "&(s)" is not the same as "&s"?

No, a parenthesized lvalue is still an lvalue.
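
A small sketch of the lvalue point for an object that does exist: taking the address of an uninitialized variable is fine because its value is never read, and the variable can then be given a value through that address. The names are hypothetical; the example uses const char * so that it can point at a string literal.

```c
#include <assert.h>
#include <string.h>  /* strcmp */

/* Takes &s before s has a value, then stores through the address. */
const char *assign_through_address(void)
{
    const char *s;           /* uninitialized: its value must not be read yet */
    const char **ptr = &(s); /* fine: a parenthesized lvalue is still an lvalue */

    *ptr = "hello";          /* gives s a value via its address */
    return s;                /* reading s is now well defined */
}

/* The store through the pointer is visible in s. */
void check_assign(void)
{
    assert(strcmp(assign_through_address(), "hello") == 0);
}
```

The difference from the offsetof case is that here an object really lives at the address being taken.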
 
junky_fellow

Yevgen Muntyan said:
Keith Thompson wrote: [...]
Let's look at a typical (I think) definition of offsetof:
#define offsetof(s, m) (size_t)(&(((s *)0)->m))
and consider the sequence of operations that this specifies.
0                Zero
(s *)0           A null pointer of type pointer-to-s
((s *)0)->m      The value of the member "m" of the object
                 of type "s" obtained by dereferencing the
                 null pointer
&(((s *)0)->m)   No, we just need the address, not the value.
Is this quite correct? Consider the following:
char **ptr;
char *p, *s;
ptr = &s; /* fine */
p = s; /* undefined behaviour */
I.e. in general "&s" is not a shortcut for "get the value; forget it,
just get me the address".

Keith Thompson said:
You're right; good catch.

The unary "&" operator causes its operand to be evaluated *as an
lvalue*. The value of m is not evaluated, but its address must be
determined. In this case, just determining the address invokes
undefined behavior, because there isn't really an object at that
address.

Does that mean that the null pointer is not dereferenced?
 
Guest

Simon said:
Harald said:
Dereferencing a pointer means the unary * operator, or the -> operator,
is evaluated with that pointer as its operand. You applied the ->
operator to a null pointer. The expression, including the -> operator,
is evaluated. That means you dereferenced a null pointer. There could
be a special exception that says the -> operator is not evaluated in
this case, but there simply isn't. As for why you need it, you don't,
if you simply use the standard offsetof() macro instead of writing your
own.

By the same token you could say

char arr[12];
&arr[12];

is also doing an illegal dereference of a pointer.

(arr + 12)     valid, points one past the end
*(arr + 12)    invalid, dereferences one past the end
&*(arr + 12)   valid?, calculates address of one past the end

Because of the text you noted below, neither the & operator nor the *
operator is evaluated, so you're not dereferencing anything in C99. And
in C90, the behaviour /is/ undefined exactly for that reason, even if
no sane implementation would reject it.

Simon said:
In C99 6.5.3.2 the standard says that & and * cancel each other out, and
& and [] cancel into a simple +, and no dereference is actually
evaluated. They could easily have said the same for the & and ->
operator sequence, but for some reason left it out.

Yes, there could have been a special exception for the -> operator, but
there simply isn't. That's exactly what I said, isn't it?
 
