Why does this not fail?

sgadag

Even if "a" is NULL in the assignment below, this assignment does not
cause any AV:

SOME_PTR * someVar = (SOME_PTR *) a->b;


But something like this will cause an AV because "someVar" is NULL:

if (someVar->someType == 1)
{

}


Why does the first assignment not cause any access violation?
 
Walter Roberson

Even if "a" is NULL in the assignment below, this assignment does not
cause any AV:
SOME_PTR * someVar = (SOME_PTR *) a->b;
But something like this will cause an AV because "someVar" is NULL:
if (someVar->someType == 1)
{
}

Why does the first assignment not cause any access violation?

Chance.

Dereferencing a NULL pointer only results in an access violation
when you are lucky. The rest of the time, it does something or other
that is usually much harder to detect.
 
Ben Pfaff

Even if "a" is NULL in the assignment below, this assignment does not
cause any AV:

SOME_PTR * someVar = (SOME_PTR *) a->b;


But something like this will cause an AV because "someVar" is NULL:

if (someVar->someType == 1)
{

}


Why does the first assignment not cause any access violation?

Either way, the behavior is undefined, so anything is actually
allowed to happen. But I suppose your real question is why the
undefined behavior manifests this way. My first thought is that
the former code doesn't actually do anything with the value that
it obtains, so the compiler is probably optimizing it out
entirely, not dereferencing the pointer at all.
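
A minimal sketch of that difference, using only valid pointers so that the sketch itself stays well defined. The names struct node, load_and_discard() and load_and_use() are hypothetical stand-ins, not the original poster's code, and whether a given compiler really drops the unused load depends on the compiler and its optimization settings.

```c
#include <stddef.h>

/* Hypothetical stand-in for the poster's types. */
struct node {
    void *b;
    int   someType;
};

/* The value read from a->b is never used, so an optimizing compiler
   is free to drop the load and never touch the memory behind a. */
static int load_and_discard(const struct node *a)
{
    void *someVar = a->b;
    (void)someVar;
    return 0;
}

/* Here the loaded pointer is itself dereferenced, so that access
   cannot simply be discarded. The NULL check keeps the sketch defined. */
static int load_and_use(const struct node *a)
{
    const struct node *someVar = a->b;
    return someVar != NULL && someVar->someType == 1;
}
```

Comparing the generated assembly for the two functions at different optimization levels is one way to see whether the first load survives on a particular compiler.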
 
sgadag

I think since we are not accessing NULL memory, we will get the address
of "b", even if "a" is NULL.

What about this:

&( ((type *)0) -> field)

There is no problem here either. I have yet to get a satisfactory answer.
 
Ancient_Hacker

Even if "a" is NULL in the assignment below, this assignment does not
cause any AV:

SOME_PTR * someVar = (SOME_PTR *) a->b;

What is the struct declaration like? In your case it's likely field
"b" is many kilobytes from the start of the struct. Most OSes map the
lower few K of memory to "invalid", so that catches NULL references,
and a lot of NULL->field references. But if a field is far enough into
the structure, it may map into valid memory addresses. And then a->b
might be a valid read reference.

But something like this will cause an AV because "someVar" is NULL:
if (someVar->someType == 1)


Yep, if someType is in the first few K of the struct, it is likely to
get caught as a bad address.
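
A rough sketch of that reasoning, under loudly stated assumptions: GUARD_BYTES (4096) is an assumed size for the unmapped low region, struct big is a hypothetical layout, and lands_in_guard() is a hypothetical helper. It only compares offsets and never dereferences a null pointer. Whether anything is actually mapped beyond the guard region is system-specific.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumed size of the unmapped region at the bottom of memory. */
#define GUARD_BYTES 4096u

/* Hypothetical layout: someType sits at the start, b sits far in. */
struct big {
    int  someType;
    char payload[64 * 1024];
    int  b;
};

/* True if an access at (0 + member_offset) would fall inside the
   assumed guard region and so would likely be trapped. */
static bool lands_in_guard(size_t member_offset)
{
    return member_offset < GUARD_BYTES;
}
```

With this layout, lands_in_guard(offsetof(struct big, someType)) is true, while lands_in_guard(offsetof(struct big, b)) is false because b lies more than 64 KB into the structure.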
 
sgadag

I have seen in quite a few places that offsetof() is coded something
like

#define offsetof(type, mem) ((size_t)((char *)&((type *)0)->mem - (char *)(type *)0))

Now, not getting into other issues with the code (portability etc.),
we have a null pointer dereference here. How is this allowed?
 
Walter Roberson

I think since we are not accessing NULL memory, we will get the address
of "b", even if "a" is NULL.

Please quote enough context so that people know what you are
referring to.

Your reply is with respect to a->b where a is NULL.

a->b is the same as (*a).b by definition. b must therefore be
a field name within the structure type associated with *a.
As b is a field name and not a variable, b has no address of its
own, so your analysis cannot be correct.

In considering (*a).b with a being NULL, you should understand
that the C standards say that doing this is not allowed and that
the results are undefined. The standards do not say that the program
must crash: crashing is one of the allowed options, as is doing
something else completely like accessing an I/O register or loading
a random number. Crashing is relatively easy to track down; the
other possibilities might lurk undetected for decades.

One of the allowed behaviours for (*a).b with a being NULL is to
calculate the distance of the field b relative to the beginning
of the structure, and then attempt to access a memory location that
much further along from whatever bit pattern NULL happens to be,
which often -happens- to be the all-zero bit pattern. For example,
if the field b happens to start 84 bytes from the beginning of the
structure then the code might try accessing location 0+84 . And
that just might happen to work, because there just might happen to
be valid and accessible memory at that location. Or it might happen
to crash if the system knows there is no memory there. Or it might
happen to return 0's, if the memory system knows there is no memory
there and automatically substitutes 0's. I've seen all of these
behaviours on real systems.
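
The arithmetic described above can be sketched without dereferencing anything. This is only an illustration: struct record, its 84-byte header member and address_touched() are hypothetical, and it assumes a flat address space in which a null pointer has the numeric value 0, which the standard does not promise.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 84 bytes of other members precede b. */
struct record {
    char header[84];
    int  b;
};

/* Numeric address a typical flat-memory implementation would touch
   for p->b, given the numeric value of p. Nothing is dereferenced. */
static uintptr_t address_touched(uintptr_t base)
{
    return base + offsetof(struct record, b);
}
```

On such a system address_touched(0) is just the offset of b, which is 84 when int needs no extra padding after header. What happens when the hardware is asked to read that location is exactly the part the standard leaves undefined.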

What about this:
&( ((type *)0) -> field)
There is no problem here either. I have yet to get a satisfactory answer.

This is slightly different in that the address of (*0).field is
being taken without the content of (*0).field being needed.
This does not need to go to the memory hardware for lookup, so
*some* systems would treat the above as calculating the offset of
the field relative to the beginning of the structure. It doesn't
really calculate that, though, as it is the wrong type (address
instead of offset).

According to the C standards, the -> operator is only valid when
its left side is a pointer to an object, and 0 (or NULL) are
defined as pointing to NO object. Therefore the code
does not have a defined result according to the C standards.
It isn't uncommon to see the code in the implementation of
offset(), but that's because the implementation is allowed to take
advantage of internal knowledge of the operating system, and so
is allowed to do things that C programmers cannot safely do in
user programs. The code is *not* portable. (But as I discussed
above, systems are not -required- to give an error when they
encounter it.)
 
Ben Pfaff

Ancient_Hacker said:
What is the struct declaration like? In your case it's likely field
"b" is many kilobytes from the start of the struct. Most OSes map the
lower few K of memory to "invalid", so that catches NULL references,
and a lot of NULL->field references. But if a field is far enough into
the structure, it may map into valid memory addresses. And then a->b
might be a valid read reference.

Really? On what OSes is the second page of virtual address space
commonly mapped?
 
Eric Sosman

I have seen in quite a few places that offsetof() is coded something
like

#define offsetof(type, mem) ((size_t)((char *)&((type *)0)->mem - (char *)(type *)0))

Now, not getting into other issues with the code (portability etc.),
we have a null pointer dereference here. How is this allowed?

The answer is inseparably bound with the "other issues"
you don't want to get into.

Briefly, the implementation can use all the non-portable
tricks and gimmicks it feels like, so long as they produce
the effect the Standard requires. The implementation does
not need to be portable to other implementations. The Frobozz
Magic C compiler is not required to work as advertised if you
try to run it on the DeathStation 9000. The implementation
doesn't even need to be written in C at all.

... and that's why dodgy implementations of offsetof() are
allowed: because they're part of the implementation, not
part of the user code.
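
In user code, the portable route is the offsetof macro that the implementation itself supplies in <stddef.h>. A small sketch follows; struct packet and the two helper functions are hypothetical names chosen for illustration.

```c
#include <stddef.h>

/* Hypothetical structure used only to demonstrate the macro. */
struct packet {
    char   tag;
    double value;
    int    count;
};

/* Offsets come from the implementation's own offsetof, so user code
   never has to write a null pointer expression itself. */
static size_t value_offset(void)
{
    return offsetof(struct packet, value);
}

static size_t count_offset(void)
{
    return offsetof(struct packet, count);
}
```

The exact values depend on the platform's alignment rules, so the sketch deliberately does not state them. The only guarantees relied on are that the first member is at offset 0 and that later members come after earlier ones.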
 
Walter Roberson

Really? On what OSes is the second page of virtual address space
commonly mapped?

Ancient_Hacker made no reference to a "page" of virtual memory.
His reference was to "the lower few K", which is sufficiently
imprecise to cover paged and non-paged memory models and to cover
protected memory that might be 1 page long, 16 pages long, 42 pages
long...


But to answer your question very specifically:

Silicon Graphics IRIX, from some release in the 4.x series
through version 6.5.22.

If memory serves me, it was IRIX 6.4 that introduced the models for
which the second page of virtual address space was NOT commonly mapped.
It wasn't a matter that the addresses were no longer used: what
happened is that the page size got larger for newer hardware models,
requiring that the mapped memory be accessed via the first page (which
was now big enough to cover that address space). IRIX 6.4 -only-
supported models that referenced the memory via the first virtual page;
IRIX 6.5 was a general purpose OS that supported both models that used
the second virtual page for the needed addresses and models that used
the first {larger} virtual page for the same addresses. However, after
6.5.22, support was dropped for all the hardware that used the smaller
page size.

In IRIX 4 through 6.5.22 on models that supported the smaller page
size, the first virtual page of memory was flagged as allowing
no access (no read, no write, no execute), but the second virtual
page of memory was readable and writable because it was used for SGI's GL
graphics subsystem. In IRIX 6.4 and in IRIX 6.5 on the models with
the larger virtual page, the GL addresses were part of the {larger} first
page; as read and write access were required for GL graphics, this had
the side effect of unprotecting memory address 0. If I recall
correctly, the locations near there are initialized to 0... and yes, they
are writable :(
 
Keith Thompson

Even if "a" is NULL in the assignment below, this assignment does not
cause any AV:

SOME_PTR * someVar = (SOME_PTR *) a->b;

What does "AV" mean? I'm guessing it means something like "access
violation", but don't assume that we know that.

It's difficult to tell without seeing the actual code. If you had
posted a complete self-contained program that exhibits the problem, we
might have a chance of helping, but we have no way of knowing what
SOME_PTR, a, and b are.

Most likely you're doing something that invokes undefined behavior,
which can do anything, including quietly giving you some
reasonable-looking result.
 
Michael Mair

Walter said:
Please quote enough context so that people know what you are
referring to.

Your reply is with respect to a->b where a is NULL.

a->b is the same as (*a).b by definition. b must therefore be
a field name within the structure type associated with *a.
As b is a field name and not a variable, b has no address of its
own, so your analysis cannot be correct.

In considering (*a).b with a being NULL, you should understand
that the C standards say that doing this is not allowed and that
the results are undefined. The standards do not say that the program
must crash: crashing is one of the allowed options, as is doing
something else completely like accessing an I/O register or loading
a random number. Crashing is relatively easy to track down; the
other possibilities might lurk undetected for decades.

One of the allowed behaviours for (*a).b with a being NULL is to
calculate the distance of the field b relative to the beginning
of the structure, and then attempt to access a memory location that
much further along from whatever bit pattern NULL happens to be,
which often -happens- to be the all-zero bit pattern. For example,
if the field b happens to start 84 bytes from the beginning of the
structure then the code might try accessing location 0+84 . And
that just might happen to work, because there just might happen to
be valid and accessible memory at that location. Or it might happen
to crash if the system knows there is no memory there. Or it might
happen to return 0's, if the memory system knows there is no memory
there and automatically substitutes 0's. I've seen all of these
behaviours on real systems.




This is slightly different in that the address of (*0).field is
being taken without the content of (*0).field being needed.
This does not need to go to the memory hardware for lookup, so
*some* systems would treat the above as calculating the offset of
the field relative to the beginning of the structure. It doesn't
really calculate that, though, as it is the wrong type (address
instead of offset).

According to the C standards, the -> operator is only valid when
its left side is a pointer to an object, and 0 (or NULL) are
defined as pointing to NO object. Therefore the code
does not have a defined result according to the C standards.
It isn't uncommon to see the code in the implementation of
offset(), but that's because the implementation is allowed to take

Nit: offsetof
@OP: Walter is talking about the offsetof macro from <stddef.h>.

advantage of internal knowledge of the operating system, and so
is allowed to do things that C programmers cannot safely do in
user programs. The code is *not* portable. (But as I discussed
above, systems are not -required- to give an error when they
encounter it.)

Cheers
Michael
 
