Why does this not fail?

Discussion in 'C Programming' started by sgadag@gmail.com, Aug 23, 2006.

  1. Guest

    Even if "a" is NULL in the assignment below, this assignment does not
    cause any AV:

    SOME_PTR * someVar = (SOME_PTR *) a->b;


    But something like this will cause an AV because "someVar" is NULL:

    if (someVar->someType == 1)
    {

    }


    Why does the first assignment not cause any access violation?
     
    , Aug 23, 2006
    #1

  2. In article <>,
    <> wrote:
    >Even if "a" is NULL in the assignment below, this assignment does not
    >cause any AV:


    >SOME_PTR * someVar = (SOME_PTR *) a->b;


    >But something like this will cause an AV because "someVar" is NULL:


    >if (someVar->someType == 1)
    >{
    >}



    >Why does the first assignment not cause any access violation?


    Chance.

    Dereferencing a NULL pointer only results in an access violation
    when you are lucky. The rest of the time, it does something or other
    that is usually much harder to detect.
    --
    "It is important to remember that when it comes to law, computers
    never make copies, only human beings make copies. Computers are given
    commands, not permission. Only people can be given permission."
    -- Brad Templeton
     
    Walter Roberson, Aug 23, 2006
    #2

  3. Ben Pfaff Guest

    writes:

    > Even if "a" is NULL in the assignment below, this assignment does not
    > cause any AV:
    >
    > SOME_PTR * someVar = (SOME_PTR *) a->b;
    >
    >
    > But something like this will cause an AV because "someVar" is NULL:
    >
    > if (someVar->someType == 1)
    > {
    >
    > }
    >
    >
    > Why does the first assignment not cause any access violation?


    Either way, the behavior is undefined, so anything is actually
    allowed to happen. But I suppose your real question is why the
    undefined behavior manifests this way. My first thought is that
    the former code doesn't actually do anything with the value that
    it obtains, so the compiler is probably optimizing it out
    entirely, not dereferencing the pointer at all.
    --
    "The way I see it, an intelligent person who disagrees with me is
    probably the most important person I'll interact with on any given
    day."
    --Billy Chambless
     
    Ben Pfaff, Aug 23, 2006
    #3
  4. Guest

    I think since we are not accessing NULL memory, we will get the address
    of "b", even if "a" is NULL.

    What about this:

    &( ((type *)0) -> field)

    There is no problem here too. I am yet to get a satisfactory answer.
     
    , Aug 23, 2006
    #4
  5. wrote:
    > Even if "a" is NULL in the assignment below, this assignment does not
    > cause any AV:
    >
    > SOME_PTR * someVar = (SOME_PTR *) a->b;


    What is the struct declaration like? In your case it's likely field
    "b" is many kilobytes from the start of the struct. Most OS's map the
    lower few K of memory to "invalid", so that catches NULL references,
    and a lot of NULL->field references. But if a field is far enough into
    the structure, it may map into valid memory addresses. And then a->b
    might be a valid read reference.

    > But something like this will cause an AV because "someVar" is NULL:
    >
    > if (someVar->someType == 1)



    Yep, if someType is in the first few K of the struct, it is likely to
    get caught as a bad address.
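A minimal sketch of the layout being described, assuming a hypothetical `struct big` and an assumed 4096-byte inaccessible region at the bottom of the address space (both are illustrations, not anything from the original poster's code). It uses the standard `offsetof` from <stddef.h> to show how far into a struct a field can sit, without dereferencing any null pointer:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical layout: a large buffer sits in front of the field b. */
struct big {
    char buffer[8192];
    int  b;
};

/* Assumed size of the inaccessible region starting at address 0.
   Real systems differ; 4096 is only an illustrative value. */
#define ASSUMED_GUARD_BYTES ((size_t)4096)

/* Distance of b from the start of struct big, in bytes. */
size_t offset_of_b(void)
{
    size_t off = offsetof(struct big, b);
    assert(off < sizeof(struct big));  /* a member always lies inside its struct */
    return off;
}

/* Returns 1 if an access at (address 0 + field_offset) would fall outside
   the assumed guard region, so it might not be trapped; 0 otherwise. */
int beyond_assumed_guard(size_t field_offset)
{
    return field_offset >= ASSUMED_GUARD_BYTES;
}
```

With this layout `beyond_assumed_guard(offset_of_b())` is 1, while a field at offset 0 gives 0. Whether such an access actually succeeds is still undefined behaviour and depends entirely on the platform.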
     
    Ancient_Hacker, Aug 23, 2006
    #5
  6. Guest

    I have seen at quite a few places that offsetof() is coded something
    like

    #define offsetof(type, mem) ((size_t)((char *)&((type *)0)->mem - (char *)(type *)0))

    Now, not getting into other issues with the code (portability etc), if
    we see, we have null pointer dereferencing here. How is this allowed?
     
    , Aug 23, 2006
    #6
  7. In article <>,
    <> wrote:
    >I think since we are not accessing NULL memory, we will get the address
    >of "b", even if "a" is NULL.


    Please quote enough context so that people know what you are
    referring to.

    Your reply is with respect to a->b where a is NULL.

    a->b is the same as (*a).b by definition. b must therefore be
    a field name within the structure type associated with *a.
    As b is a field name and not a variable, b has no address of its
    own, so your analysis cannot be correct.

    In considering (*a).b with a being NULL, you should understand
    that the C standards say that doing this is not allowed and that
    the results are undefined. The standards do not say that the program
    must crash: crashing is one of the allowed options, as is doing
    something else completely like accessing an I/O register or loading
    a random number. Crashing is relatively easy to track down; the
    other possibilities might lurk undetected for decades.

    One of the allowed behaviours for (*a).b with a being NULL, is to
    calculate the distance of the field b relative to the beginning
    of the structure, and then attempt to access a memory location that
    much further along from whatever bit pattern NULL happens to be,
    which often -happens- to be the all-zero bit pattern. For example,
    if the field b happens to start 84 bytes from the beginning of the
    structure then the code might try accessing location 0+84 . And
    that just might happen to work, because there just might happen to
    be valid and accessible memory at that location. Or it might happen
    to crash if the system knows there is no memory there. Or it might
    happen to return 0's, if the memory system knows there is no memory
    there and automatically substitutes 0's. I've seen all of these
    behaviours on real systems.
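The base-plus-offset arithmetic described above can be sketched on a valid object, where it is well defined. This assumes a hypothetical `struct rec` whose field `b` follows 84 bytes of padding; the helper name `field_b_by_offset` is made up for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct: b starts (at least) 84 bytes from the beginning. */
struct rec {
    char pad[84];
    int  b;
};

/* Computes the address of b by hand: base address plus the field's offset.
   This mirrors what the compiler typically generates for base->b.
   The base must point to a real object; a null base is rejected here. */
int *field_b_by_offset(struct rec *base)
{
    assert(base != NULL);
    return (int *)((char *)base + offsetof(struct rec, b));
}
```

For any valid `struct rec r`, `field_b_by_offset(&r)` equals `&r.b`. With a null base, the same addition is what an implementation might perform, but the C standard gives it no defined result.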


    >What about this:


    >&( ((type *)0) -> field)


    >There is no problem here too. I am yet to get a satisfactory answer.


    This is slightly different in that the address of (*0).field is
    being taken without the content of (*0).field being needed.
    This does not need to go to the memory hardware for lookup, so
    *some* systems would treat the above as calculating the offset of
    the field relative to the beginning of the structure. It doesn't
    really calculate that, though, as it is the wrong type (address
    instead of offset).

    According to the C standards, the -> operator is only valid when
    its left side is a pointer to an object, and 0 (or NULL) are
    defined as pointing to NO object. Therefore the code
    does not have a defined result according to the C standards.
    It isn't uncommon to see the code in the implementation of
    offset(), but that's because the implementation is allowed to take
    advantage of internal knowledge of the operating system, and so
    is allowed to do things that C programmers cannot safely do in
    user programs. The code is *not* portable. (But as I discussed
    above, systems are not -required- to give an error when they
    encounter it.)
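For user code, the portable route is the `offsetof` macro that <stddef.h> already provides, rather than a hand-written null-pointer expression. A small sketch, using a hypothetical `struct packet` and helper names invented for illustration:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical struct, used only to have some fields to measure. */
struct packet {
    char   tag;
    long   length;
    double payload;
};

size_t length_offset(void)  { return offsetof(struct packet, length); }
size_t payload_offset(void) { return offsetof(struct packet, payload); }

/* Checks properties the standard guarantees: the first member is at
   offset 0 and later members are at increasing, non-overlapping offsets. */
void check_layout(void)
{
    assert(offsetof(struct packet, tag) == 0);
    assert(length_offset() >= sizeof(char));
    assert(payload_offset() >= length_offset() + sizeof(long));
}
```

The exact offsets depend on the implementation's padding, so the sketch relies only on their ordering.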
    --
    "It is important to remember that when it comes to law, computers
    never make copies, only human beings make copies. Computers are given
    commands, not permission. Only people can be given permission."
    -- Brad Templeton
     
    Walter Roberson, Aug 23, 2006
    #7
  8. Ben Pfaff Guest

    "Ancient_Hacker" <> writes:

    > wrote:
    >> Even if "a" is NULL in the assignment below, this assignment does not
    >> cause any AV:
    >>
    >> SOME_PTR * someVar = (SOME_PTR *) a->b;

    >
    > What is the struct declaration like? In your case it's likely field
    > "b" is many kilobytes from the start of the struct. Most OS's map the
    > lower few K of memory to "invalid", so that catches NULL references,
    > and a lot of NULL->field references. But if a field is far enough into
    > the structure, it may map into valid memory addresses. And then a->b
    > might be a valid read reference.


    Really? On what OSes is the second page of virtual address space
    commonly mapped?
    --
    int main(void){char p[]="ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz.\
    \n",*q="kl BIcNBFr.NKEzjwCIxNJC";int i=sizeof p/2;char *strchr();int putchar(\
    );while(*q){i+=strchr(p,*q++)-p;if(i>=(int)sizeof p)i-=sizeof p-1;putchar(p[i]\
    );}return 0;}
     
    Ben Pfaff, Aug 23, 2006
    #8
  9. Eric Sosman Guest

    wrote On 08/23/06 15:07,:
    > I have seen at quite a few places that offsetof() is coded something
    > like
    >
    > #define offsetof(type, mem) ((size_t)((char *)&((type *)0)->mem - (char *)(type *)0))
    >
    > Now, not getting into other issues with the code (portability etc), if
    > we see, we have null pointer dereferencing here. How is this allowed?


    The answer is inseparably bound with the "other issues"
    you don't want to get into.

    Briefly, the implementation can use all the non-portable
    tricks and gimmicks it feels like, so long as they produce
    the effect the Standard requires. The implementation does
    not need to be portable to other implementations. The Frobozz
    Magic C compiler is not required to work as advertised if you
    try to run it on the DeathStation 9000. The implementation
    doesn't even need to be written in C at all.

    ... and that's why dodgy implementations of offsetof() are
    allowed: because they're part of the implementation, not
    part of the user code.

    --
     
    Eric Sosman, Aug 23, 2006
    #9
  10. In article <>,
    Ben Pfaff <> wrote:
    >"Ancient_Hacker" <> writes:


    >> Most OS's map the
    >> lower few K of memory to "invalid", so that catches NULL references,
    >> and a lot of NULL->field references. But if a field is far enough into
    >> the structure, it may map into valid memory addresses.


    >Really? On what OSes is the second page of virtual address space
    >commonly mapped?


    Ancient_Hacker made no reference to a "page" of virtual memory.
    His reference was to "the lower few K", which is sufficiently
    imprecise to cover paged and non-paged memory models and to cover
    protected memory that might be 1 page long, 16 pages long, 42 pages
    long...


    But to answer your question very specifically:

    Silicon Graphics IRIX, starting from some version in 4.x,
    through to version 6.5.22.

    If memory serves me, it was IRIX 6.4 that introduced the models for
    which the second page of virtual address space was NOT commonly mapped.
    It wasn't a matter that the addresses were no longer used: what
    happened is that the page size got larger for newer hardware models,
    requiring that the mapped memory be accessed via the first page (which
    was now big enough to cover that address space). IRIX 6.4 -only-
    supported models that referenced the memory via the first virtual page;
    IRIX 6.5 was a general purpose OS that supported both models that used
    the second virtual page for the needed addresses and models that used
    the first {larger} virtual page for the same addresses. However, after
    6.5.22, support was dropped for all the hardware that used the smaller
    page size.

    In IRIX 4 through 6.5.22 on models that supported the smaller page
    size, the first virtual page of memory is flagged as allowing
    no access (no read, no write, no execute), but the second virtual
    page of memory was read and write because it was used for SGI's GL
    graphics subsystem. In IRIX 6.4 and in IRIX 6.5 on the models with
    the larger virtual page, the GL addresses are part of the {larger} first
    page; as read and write were required for GL graphics, this had
    the side effect of unprotecting memory address 0. If I recall
    correctly, the locations near there are initialized to 0... and Yes, they
    are writable :(
    --
    "No one has the right to destroy another person's belief by
    demanding empirical evidence." -- Ann Landers
     
    Walter Roberson, Aug 23, 2006
    #10
  11. writes:
    > Even if "a" is NULL in the assignment below, this assignment does not
    > cause any AV:
    >
    > SOME_PTR * someVar = (SOME_PTR *) a->b;


    What does "AV" mean? I'm guessing it means something like "access
    violation", but don't assume that we know that.

    It's difficult to tell without seeing the actual code. If you had
    posted a complete self-contained program that exhibits the problem, we
    might have a chance of helping, but we have no way of knowing what
    SOME_PTR, a, and b are.

    Most likely you're doing something that invokes undefined behavior,
    which can do anything, including quietly giving you some
    reasonable-looking result.
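A self-contained sketch of how the snippet could be guarded so that no null pointer is ever dereferenced. The definitions of `SOME_PTR` and `struct holder` (the type of `a`) are assumptions made up here, since the original post does not show them:

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical definitions; the original post does not show the real ones. */
typedef struct {
    int someType;
} SOME_PTR;

struct holder {
    void *b;   /* assumed to point to a SOME_PTR, or to be NULL */
};

/* Returns 1 only if a and a->b are both non-null and someType is 1.
   Every pointer is checked before it is dereferenced. */
int some_type_is_one(const struct holder *a)
{
    if (a == NULL)
        return 0;

    const SOME_PTR *someVar = (const SOME_PTR *)a->b;
    if (someVar == NULL)
        return 0;

    return someVar->someType == 1;
}
```

Calling `some_type_is_one(NULL)` returns 0 instead of relying on whatever the platform happens to do with a null dereference.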

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    San Diego Supercomputer Center <*> <http://users.sdsc.edu/~kst>
    We must do something. This is something. Therefore, we must do this.
     
    Keith Thompson, Aug 23, 2006
    #11
  12. Michael Mair Guest

    Walter Roberson wrote:
    > In article <>,
    > <> wrote:
    >>I think since we are not accessing NULL memory, we will get the address
    >>of "b", even if "a" is NULL.

    >
    > Please quote enough context so that people know what you are
    > referring to.
    >
    > Your reply is with respect to a->b where a is NULL.
    >
    > a->b is the same as (*a).b by definition. b must therefore be
    > a field name within the structure type associated with *a.
    > As b is a field name and not a variable, b has no address of its
    > own, so your analysis cannot be correct.
    >
    > In considering (*a).b with a being NULL, you should understand
    > that the C standards say that doing this is not allowed and that
    > the results are undefined. The standards do not say that the program
    > must crash: crashing is one of the allowed options, as is doing
    > something else completely like accessing an I/O register or loading
    > a random number. Crashing is relatively easy to track down; the
    > other possibilities might lurk undetected for decades.
    >
    > One of the allowed behaviours for (*a).b with a being NULL, is to
    > calculate the distance of the field b relative to the beginning
    > of the structure, and then attempt to access a memory location that
    > much further along from whatever bit pattern NULL happens to be,
    > which often -happens- to be the all-zero bit pattern. For example,
    > if the field b happens to start 84 bytes from the beginning of the
    > structure then the code might try accessing location 0+84 . And
    > that just might happen to work, because there just might happen to
    > be valid and accessible memory at that location. Or it might happen
    > to crash if the system knows there is no memory there. Or it might
    > happen to return 0's, if the memory system knows there is no memory
    > there and automatically substitutes 0's. I've seen all of these
    > behaviours on real systems.
    >
    >>What about this:

    >
    >>&( ((type *)0) -> field)

    >
    >>There is no problem here too. I am yet to get a satisfactory answer.

    >
    > This is slightly different in that the address of (*0).field is
    > being taken without the content of (*0).field being needed.
    > This does not need to go to the memory hardware for lookup, so
    > *some* systems would treat the above as calculating the offset of
    > the field relative to the beginning of the structure. It doesn't
    > really calculate that, though, as it is the wrong type (address
    > instead of offset).
    >
    > According to the C standards, the -> operator is only valid when
    > its left side is a pointer to an object, and 0 (or NULL) are
    > defined as pointing to NO object. Therefore the code
    > does not have a defined result according to the C standards.
    > It isn't uncommon to see the code in the implementation of
    > offset(), but that's because the implementation is allowed to take


    Nit: offsetof
    @OP: Walter is talking about the offsetof macro from <stddef.h>

    > advantage of internal knowledge of the operating system, and so
    > is allowed to do things that C programmers cannot safely do in
    > user programs. The code is *not* portable. (But as I discussed
    > above, systems are not -required- to give an error when they
    > encounter it.)


    Cheers
    Michael
    --
    E-Mail: Mine is an /at/ gmx /dot/ de address.
     
    Michael Mair, Aug 23, 2006
    #12
