Different ways of finding the offset of a member in a structure

Discussion in 'C Programming' started by abhimanyu.v@gmail.com, Nov 5, 2007.

  1. Guest

    Hi Guys,

    I have a question. The test program below uses two ways of finding
    the offset of a member in a structure. I ran the program and both
    gave the same result.

    My question is: what is the difference between

    1) (unsigned long) &((struct foobar *)0)->foo
    and
    2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)

    And why is the second option not used for the offsetof macro?

    What is the obvious advantage of the first syntax? Is anything wrong
    with the second syntax?

    Thanks
    Abhimanyu

    =================================

    #include <stdio.h>
    #include <stdlib.h>

    struct foobar {
        unsigned int foo;
        char bar;
        char boo;
    };

    int main(void)
    {
        struct foobar tmp;

        /* %p expects a void *, so every address is cast explicitly. */
        printf("address of &tmp is= %p\n\n", (void *)&tmp);

        /* Form 1: member address taken through a null struct pointer. */
        printf("address of tmp->foo= %p \t offset of tmp->foo= %lu\n",
               (void *)&tmp.foo, (unsigned long) &((struct foobar *)0)->foo);
        printf("address of tmp->bar= %p \t offset of tmp->bar= %lu\n",
               (void *)&tmp.bar, (unsigned long) &((struct foobar *)0)->bar);
        printf("address of tmp->boo= %p \t offset of tmp->boo= %lu\n\n",
               (void *)&tmp.boo, (unsigned long) &((struct foobar *)0)->boo);

        /* Form 2: member address minus the address of the object itself. */
        printf("address of tmp->foo= %p \t offset of tmp->foo= %lu\n",
               (void *)&tmp.foo, (unsigned long)((char*)&tmp.foo - (char*)&tmp));
        printf("address of tmp->bar= %p \t offset of tmp->bar= %lu\n",
               (void *)&tmp.bar, (unsigned long)((char*)&tmp.bar - (char*)&tmp));
        printf("address of tmp->boo= %p \t offset of tmp->boo= %lu\n\n",
               (void *)&tmp.boo, (unsigned long)((char*)&tmp.boo - (char*)&tmp));

        printf("Hello world!\n");
        return 0;
    }


    Result
    ==================
    address of &tmp is= 0022FF70

    address of tmp->foo= 0022FF70 offset of tmp->foo= 0
    address of tmp->bar= 0022FF74 offset of tmp->bar= 4
    address of tmp->boo= 0022FF75 offset of tmp->boo= 5

    address of tmp->foo= 0022FF70 offset of tmp->foo= 0
    address of tmp->bar= 0022FF74 offset of tmp->bar= 4
    address of tmp->boo= 0022FF75 offset of tmp->boo= 5

    Hello world!

    Press ENTER to continue.
     
    , Nov 5, 2007
    #1

  2. On Nov 5, 12:56 pm, "" <>
    wrote:
    > Hi Guys,
    >
    > I have one doubt. The test program is given below. It uses two way of
    > finding out the offset of a variable in structure. I executed the
    > program and found the same result.
    >
    > My question is what is difference between
    >
    > 1) (unsigned long) &((struct foobar *)0)->foo
    > and
    > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)
    >
    > And why the second option is not used for offsetof macro.
    >
    > What is obvious advantage of the first syntax? Anything wrong with the
    > second syntax?


    Good question.

    But I think that (unsigned long) &((struct foobar *)0)->bar is
    internally implemented as
    (unsigned long)((char*)&tmp.boo - (char*)&tmp).

    I think both mean the same, but I am not sure.

    Karthik Balaguru
     
    karthikbalaguru, Nov 5, 2007
    #2

  3. Guest

    On Nov 5, 1:17 pm, karthikbalaguru <>
    wrote:
    > On Nov 5, 12:56 pm, "" <>
    > wrote:
    >
    >
    >
    > > Hi Guys,

    >
    > > I have one doubt. The test program is given below. It uses two way of
    > > finding out the offset of a variable in structure. I executed the
    > > program and found the same result.

    >
    > > My question is what is difference between

    >
    > > 1) (unsigned long) &((struct foobar *)0)->foo
    > > and
    > > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)

    >
    > > And why the second option is not used for offsetof macro.

    >
    > > What is obvious advantage of the first syntax? Anything wrong with the
    > > second syntax?

    >
    > Good Question.
    >
    > But, i think that (unsigned long) &((struct foobar *)0)->bar is
    > internally implemented as
    > (unsigned long)((char*)&tmp.boo - (char*)&tmp).
    >
    > I think, both mean the same(I am not sure). !!
    >
    > Karthik Balaguru


    No, (unsigned long) &((struct foobar *)0)->bar is not the same as
    (unsigned long)((char*)&tmp.boo - (char*)&tmp).

    The expression (unsigned long) &((struct foobar *)0)->bar basically
    does the following:

    1) It casts 0 to a pointer to the structure, as if an object of that
    type were located at address zero.
    2) Assuming that pointer really does refer to address 0, taking the
    address of the member then yields the member's offset.

    But what if that pointer is not represented internally as address 0?
    Then this construct will fail!

    Regards,
    Abhimanyu
     
    , Nov 5, 2007
    #3
  4. "" <> writes:
    > I have one doubt. The test program is given below. It uses two way of
    > finding out the offset of a variable in structure. I executed the
    > program and found the same result.
    >
    > My question is what is difference between
    >
    > 1) (unsigned long) &((struct foobar *)0)->foo
    > and
    > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)
    >
    > And why the second option is not used for offsetof macro.
    >
    > What is obvious advantage of the first syntax? Anything wrong with the
    > second syntax?

    [...]

    The first form invokes undefined behavior. Note that this doesn't
    mean that it doesn't work, or that it blows up; the behavior just
    isn't defined by the standard. Implementations can use something
    similar to your first example to implement offsetof, taking advantage
    of the behavior of the particular compiler. (You can't reliably do
    that in portable code, which is why offsetof is part of the
    implementation.)

    The second form doesn't invoke undefined behavior as far as I can
    tell, but it can't be used to implement offsetof; the first argument
    to offsetof is a struct type, not a struct object.
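
    For comparison, here is a minimal sketch of the portable route: the
    standard offsetof macro from <stddef.h> takes a struct type and a
    member name, so it needs neither an object nor arithmetic on a null
    pointer. The struct is the one from the original post; the enumerator
    BOO_OFFSET and the helper foobar_offset_of_bar are names made up for
    this illustration.

```c
#include <stddef.h>   /* offsetof, size_t */

/* Same layout as the struct in the original post. */
struct foobar {
    unsigned int foo;
    char bar;
    char boo;
};

/* offsetof expands to an integer constant expression, so it can
   initialize an enumerator: no object and no run-time arithmetic. */
enum { BOO_OFFSET = offsetof(struct foobar, boo) };

/* Hypothetical helper: offset of bar, as reported by the implementation. */
size_t foobar_offset_of_bar(void)
{
    return offsetof(struct foobar, bar);
}
```

    On the original poster's platform, where bar and boo were printed at
    offsets 4 and 5, these should presumably yield the same values.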

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Looking for software development work in the San Diego area.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Nov 5, 2007
    #4
  5. MisterE Guest

    <> wrote in message
    news:...
    > Hi Guys,
    >
    > I have one doubt. The test program is given below. It uses two way of
    > finding out the offset of a variable in structure. I executed the
    > program and found the same result.
    >
    > My question is what is difference between
    >
    > 1) (unsigned long) &((struct foobar *)0)->foo
    > and
    > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)


    I assume you know the difference? The first one just assigns the pointer
    value 0 (address 0), and the compiler adds the member's offset within the struct.

    The second one requires a subtraction.

    > And why the second option is not used for offsetof macro.
    >
    > What is obvious advantage of the first syntax? Anything wrong with the
    > second syntax?


    The first one can load the value 0 into a register as an immediate value.
    The second one cannot load its values directly because they are variables.
    The second one also uses a subtraction operation.
    The difference is that the first one is going to require fewer machine
    instructions and will execute faster.
     
    MisterE, Nov 5, 2007
    #5
  6. Guest

    On Nov 5, 1:59 pm, "MisterE" <> wrote:
    > <> wrote in message
    >
    > news:...
    >
    > > Hi Guys,

    >
    > > I have one doubt. The test program is given below. It uses two way of
    > > finding out the offset of a variable in structure. I executed the
    > > program and found the same result.

    >
    > > My question is what is difference between

    >
    > > 1) (unsigned long) &((struct foobar *)0)->foo
    > > and
    > > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)

    >
    > ? I assume you know the difference. The 0 one is just assigning the pointer
    > value 0 (Address 0) and the compiler does the offset from the struct.
    >
    > The second one require a subtraction.
    >
    > > And why the second option is not used for offsetof macro.

    >
    > > What is obvious advantage of the first syntax? Anything wrong with the
    > > second syntax?

    >
    > The first one can load the value 0 to a reigster as a direct value. The 2nd
    > one cannot load its values directly because they are variable.
    > The second one also uses a subtraction operation.
    > The difference is that the first one is going to require less machine
    > instructions and will execute faster.


    Thanks a lot everyone!!

    It indeed helped me to understand the difference.

    Regards,
    Abhimanyu
     
    , Nov 5, 2007
    #6
  7. Mark Bluemel Guest

    wrote:
    > Hi Guys,
    >
    > I have one doubt. The test program is given below. It uses two way of
    > finding out the offset of a variable in structure. I executed the
    > program and found the same result.


    Which proves that for your particular compiler/platform combination the
    two are equivalent. This is not guaranteed.

    > My question is what is difference between
    >
    > 1) (unsigned long) &((struct foobar *)0)->foo


    This assumes that an address can meaningfully be cast to an integer
    value. This is not always true.

    It does not require an instance of the structure to be created...

    > and
    > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)


    > And why the second option is not used for offsetof macro.


    This requires an instance of the structure...

    See q 2.14 of the FAQ at http://www.c-faq.com which combines the two
    techniques...
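
    To make the "requires an instance" point concrete, here is a rough
    sketch that wraps the second form in a macro and checks it against
    the standard offsetof. MEMBER_OFFSET and offsets_agree are
    hypothetical names, not anything from the FAQ or the standard
    library.

```c
#include <stddef.h>   /* offsetof, size_t */

struct foobar {
    unsigned int foo;
    char bar;
    char boo;
};

/* Hypothetical macro: byte distance from the start of obj to obj.member.
   Both pointers point into the same object, so the subtraction is defined. */
#define MEMBER_OFFSET(obj, member) \
    ((size_t)((char *)&(obj).member - (char *)&(obj)))

/* Returns 1 if the object-based form agrees with offsetof for every member. */
int offsets_agree(void)
{
    struct foobar tmp;   /* an object is required; its contents are never read */

    return MEMBER_OFFSET(tmp, foo) == offsetof(struct foobar, foo)
        && MEMBER_OFFSET(tmp, bar) == offsetof(struct foobar, bar)
        && MEMBER_OFFSET(tmp, boo) == offsetof(struct foobar, boo);
}
```

    The macro cannot be called with a type name alone, which is why it
    cannot stand in for offsetof(type, member).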
     
    Mark Bluemel, Nov 5, 2007
    #7
  8. Jack Klein Guest

    On Mon, 5 Nov 2007 18:59:41 +1000, "MisterE" <> wrote
    in comp.lang.c:

    >
    > <> wrote in message
    > news:...
    > > Hi Guys,
    > >
    > > I have one doubt. The test program is given below. It uses two way of
    > > finding out the offset of a variable in structure. I executed the
    > > program and found the same result.
    > >
    > > My question is what is difference between
    > >
    > > 1) (unsigned long) &((struct foobar *)0)->foo
    > > and
    > > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)

    >
    > ? I assume you know the difference. The 0 one is just assigning the pointer
    > value 0 (Address 0) and the compiler does the offset from the struct.


    I have to assume that you don't know much about C. Assigning 0 to a
    pointer creates a null pointer, which does not point to address 0, and
    may not be all bits 0 in its representation.

    > The first one can load the value 0 to a reigster as a direct value. The 2nd
    > one cannot load its values directly because they are variable.
    > The second one also uses a subtraction operation.
    > The difference is that the first one is going to require less machine
    > instructions and will execute faster.


    ...the real difference is that the first one produces undefined
    behavior and is completely non-portable.

    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://c-faq.com/
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Nov 6, 2007
    #8
  9. Jack Klein Guest

    On Mon, 05 Nov 2007 10:03:18 +0000, Mark Bluemel
    <> wrote in comp.lang.c:

    > wrote:
    > > Hi Guys,
    > >
    > > I have one doubt. The test program is given below. It uses two way of
    > > finding out the offset of a variable in structure. I executed the
    > > program and found the same result.

    >
    > Which proves that for your particular compiler/platform combination the
    > two are equivalent. This is not guaranteed.


    Absolutely nothing about the first one is guaranteed, since the
    behavior is undefined. Not because the pointer is dereferenced,
    because it is not, but because evaluating the expression performs
    addition to a null pointer, which is undefined.

    > > My question is what is difference between
    > >
    > > 1) (unsigned long) &((struct foobar *)0)->foo

    >
    > This assumes that an address can meaningfully be cast to an integer
    > value. This is not always true.


    It also assumes that you can add an offset to a null pointer, which is
    not defined.

    > It does not require an instance of the structure to be created...
    >
    > > and
    > > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)

    >
    > > And why the second option is not used for offsetof macro.

    >
    > This requires an instance of the structure...
    >
    > See q 2.14 of the FAQ at http://www.c-faq.com which combines the two
    > techniques...


    --
    Jack Klein
    Home: http://JK-Technology.Com
    FAQs for
    comp.lang.c http://c-faq.com/
    comp.lang.c++ http://www.parashift.com/c++-faq-lite/
    alt.comp.lang.learn.c-c++
    http://www.club.cc.cmu.edu/~ajo/docs/FAQ-acllc.html
     
    Jack Klein, Nov 6, 2007
    #9
  10. James Kuyper Guest

    Jack Klein wrote:
    ....
    > I have to assume that you don't know much about C. Assigning 0 to a
    > pointer creates a null pointer, which does not point to address 0, and


    More accurately, it need not point to address 0; however, it's also
    allowed to point to address 0, but only if no C object is also allocated at
    that same address. I've used systems where null pointers did indeed
    point to address 0; when code had undefined behavior due to
    dereferencing null pointers, the actual behavior involved actually
    reading or writing starting at address 0. Depending upon the system,
    this could be catastrophic for your program, or (for instance, under
    DOS) catastrophic for the entire operating system.
     
    James Kuyper, Nov 6, 2007
    #10
  11. Richard Guest

    Jack Klein <> writes:

    > On Mon, 5 Nov 2007 18:59:41 +1000, "MisterE" <> wrote
    > in comp.lang.c:
    >
    >>
    >> <> wrote in message
    >> news:...
    >> > Hi Guys,
    >> >
    >> > I have one doubt. The test program is given below. It uses two way of
    >> > finding out the offset of a variable in structure. I executed the
    >> > program and found the same result.
    >> >
    >> > My question is what is difference between
    >> >
    >> > 1) (unsigned long) &((struct foobar *)0)->foo
    >> > and
    >> > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)

    >>
    >> ? I assume you know the difference. The 0 one is just assigning the pointer
    >> value 0 (Address 0) and the compiler does the offset from the struct.

    >
    > I have to assume that you don't know much about C. Assigning 0 to a
    > pointer creates a null pointer, which does not point to address 0, and
    > may not be all bits 0 in its representation.


    But more often than not it appears to do just that, and is indeed all 0
    bits. It might not work, but it mostly tries.

    e.g in gdb

    ,---- code sample ---
    | char *p="hello";
    `--------------------

    set variable p=0
    p *p

    "Cannot access memory at address 0x0"

    but in this case it's a restricted memory architecture.

    Can one legally access a char at memory address 0 (assuming it is not protected) thus?

    *(char*)0; ?


    >
    >> The first one can load the value 0 to a reigster as a direct value. The 2nd
    >> one cannot load its values directly because they are variable.
    >> The second one also uses a subtraction operation.
    >> The difference is that the first one is going to require less machine
    >> instructions and will execute faster.

    >
    > ...the real difference is that the first one produces undefined
    > behavior and is completely non-portable.


    Well, not portable to a tiny minority of systems and certainly not the
    right way to do it.
     
    Richard, Nov 6, 2007
    #11
  12. santosh Guest

    On Tuesday 06 Nov 2007 7:22 pm Richard <> wrote in
    article <>:

    > Jack Klein <> writes:
    >
    >> On Mon, 5 Nov 2007 18:59:41 +1000, "MisterE" <>
    >> wrote in comp.lang.c:
    >>
    >>>
    >>> <> wrote in message
    >>> news:...
    >>> > Hi Guys,
    >>> >
    >>> > I have one doubt. The test program is given below. It uses two way
    >>> > of finding out the offset of a variable in structure. I executed
    >>> > the program and found the same result.
    >>> >
    >>> > My question is what is difference between
    >>> >
    >>> > 1) (unsigned long) &((struct foobar *)0)->foo
    >>> > and
    >>> > 2) (unsigned long)((char*)&tmp.boo - (char*)&tmp)
    >>>
    >>> ? I assume you know the difference. The 0 one is just assigning the
    >>> pointer value 0 (Address 0) and the compiler does the offset from
    >>> the struct.

    >>
    >> I have to assume that you don't know much about C. Assigning 0 to a
    >> pointer creates a null pointer, which does not point to address 0,
    >> and may not be all bits 0 in its representation.

    >
    > But more often than not appears to try just that and is indeed all 0
    > bits. It might not work. But it mostly tries.
    >
    > e.g in gdb
    >
    > ,---- code sample ---
    > | char *p="hello";
    > `--------------------
    >
    > set variable p=0
    > p *p
    >
    > "Cannot access memory at address 0x0"
    >
    > but in this case its a restricted memory architecture.
    >
    > Can one legally access a char at memory access 0 (assuming not
    > protected) thus?
    >
    > *(char*)0; ?


    No. I believe in Standard C you cannot dereference address zero.

    <OT>

    Outside Standard C, this depends on the architecture. For the Intel x86
    architecture you can do so only from the ring 0 protection level.

    On the same architecture, in segmented addressing mode, a pointer
    pointing to address zero may not actually point to the start of
    the system's memory, but merely to the start of a segment anywhere in
    memory.

    </OT>
     
    santosh, Nov 6, 2007
    #12
  13. Mark Bluemel Guest

    santosh wrote:
    > On Tuesday 06 Nov 2007 7:22 pm Richard <> wrote in
    > article <>:
    >>
    >> Can one legally access a char at memory access 0 (assuming not
    >> protected) thus?
    >>
    >> *(char*)0; ?

    >
    > No. I believe in Standard C you cannot deference address zero.


    Bzzt! Watch the terminology here. I suspect Richard has lured you into
    the "addresses are integers" trap.

    I'm not sure the standard forbids you dereferencing a null pointer. The
    paragraph (6.3.2.3) I just reviewed doesn't have such an injunction and
    Q 5.19 of the FAQ suggests that it can be a valid (in some sense) action.
     
    Mark Bluemel, Nov 6, 2007
    #13
  14. Richard Guest

    santosh <> writes:

    > On Tuesday 06 Nov 2007 7:22 pm Richard <> wrote in
    >
    > No. I believe in Standard C you cannot deference address zero.
    >
    > <OT>


    This is perfectly on topic, since it involves issues with "standard C"
    in the real world.

    >
    > Outside Standard C, this depends on the architecture. For the Intel x86
    > architecture you can do so only from ring 0 protection level.
    >
    > Under the same architecture under segmented addressing mode a pointer
    > pointing to address zero may not actually point to the start of
    > system's memory, but merely to the start of a segment anywhere in
    > memory.


    This is still address 0. No difference IMO. A 0 pointer (pointer=0) is a "null"
    pointer whether segmented or not.

    >
    > </OT>
     
    Richard, Nov 6, 2007
    #14
  15. Richard Guest

    Mark Bluemel <> writes:

    > santosh wrote:
    >> On Tuesday 06 Nov 2007 7:22 pm Richard <> wrote in
    >> article <>:
    >>>
    >>> Can one legally access a char at memory access 0 (assuming not
    >>> protected) thus?
    >>>
    >>> *(char*)0; ?

    >>
    >> No. I believe in Standard C you cannot deference address zero.

    >
    > Bzzt! Watch the terminology here. I suspect Richard has lured you into
    > the "addresses are integers" trap.
    >
    > I'm not sure the standard forbids you dereferencing a null
    > pointer. The paragraph (6.3.2.3) I just reviewed doesn't have such an


    I would be surprised if the standard didn't forbid just that. But a 0
    pointer?


    > injunction and Q 5.19 of the FAQ suggests that it can be a valid (in
    > some sense) action.
     
    Richard, Nov 6, 2007
    #15
  16. James Kuyper Guest

    santosh wrote:
    > On Tuesday 06 Nov 2007 7:22 pm Richard <> wrote in
    > article <>:

    ....
    >> Can one legally access a char at memory access 0 (assuming not
    >> protected) thus?
    >>
    >> *(char*)0; ?

    >
    > No. I believe in Standard C you cannot deference address zero.


    A pointer which refers to address 0 is not necessarily a null pointer.

    In standard C, dereferencing a null pointer has undefined behavior,
    which makes it technically meaningless to talk about the location it
    points at. However, if the undefined behavior for a particular platform
    takes the form of accessing a particular piece of memory, that piece of
    memory might or might not start at address 0. Just because you created
    the pointer by using (char*)0 doesn't guarantee anything.
     
    James Kuyper, Nov 7, 2007
    #16
  17. Mark Bluemel <> writes:
    [...]
    > I'm not sure the standard forbids you dereferencing a null
    > pointer. The paragraph (6.3.2.3) I just reviewed doesn't have such an
    > injunction and Q 5.19 of the FAQ suggests that it can be a valid (in
    > some sense) action.


    It doesn't forbid it, but the behavior is undefined.

    C99 6.3.2.3p3:

        If a null pointer constant is converted to a pointer type, the
        resulting pointer, called a _null pointer_, is guaranteed to
        compare unequal to a pointer to any object or function.

    C99 6.5.3.2p4:

        The unary * operator denotes indirection. If the operand points to
        a function, the result is a function designator; if it points to
        an object, the result is an lvalue designating the object.

    Since a null pointer doesn't point to an object, the standard doesn't
    define the behavior of an attempt to dereference it.

    Question 5.19 of the FAQ is:

        How can I access an interrupt vector located at the machine's
        location 0? If I set a pointer to 0, the compiler might translate
        it to some nonzero internal null pointer value.

    This is a very machine-specific thing. The standard does not define
    the behavior of any of the proposed solutions.
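
    What the standard does define is comparison against a null pointer,
    so portable code tests first and dereferences only afterwards. A
    small sketch of that pattern follows; the function names are invented
    for this example.

```c
#include <stddef.h>   /* NULL */

/* Hypothetical helper: reads *p only when p is not a null pointer;
   otherwise returns the caller-supplied fallback value. */
int first_char_or(const char *p, int fallback)
{
    return (p != NULL) ? (unsigned char)*p : fallback;
}

/* A null pointer compares unequal to a pointer to any object
   (C99 6.3.2.3p3), so this returns 1 on every conforming implementation. */
int object_is_not_null(void)
{
    char c = 'x';
    char *null_ptr = 0;   /* null pointer constant converted to char * */

    return &c != null_ptr;
}
```

    The comparison itself never touches the pointed-to memory, which is
    why it stays well-defined even when the pointer is null.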

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Looking for software development work in the San Diego area.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Nov 10, 2007
    #17
  18. James Kuyper <> writes:
    > santosh wrote:
    >> On Tuesday 06 Nov 2007 7:22 pm Richard <> wrote in
    >> article <>:

    > ...
    >>> Can one legally access a char at memory access 0 (assuming not
    >>> protected) thus?
    >>>
    >>> *(char*)0; ?

    >> No. I believe in Standard C you cannot deference address zero.

    >
    > A pointer which refers to address 0 is not necessarily a null pointer.
    >
    > In standard C, dereferencing a null pointer has undefined behavior,
    > which makes it technically meaningless to talk about the location it
    > points at. However, if the undefined behavior for a particular
    > platform takes the form of accessing a particular piece of memory,
    > that piece of memory might or might not start at address 0. Just
    > because you created the pointer by using (char*)0 doesn't guarantee
    > anything.


    It guarantees that it's a null pointer.

    The term "address 0" isn't necessarily meaningful. As far as C is
    concerned, addresses are not numbers. An address must have a
    numerical component, perhaps indirectly, in order for pointer
    arithmetic to work, but the address as a whole is just an address.

    On almost all modern implementations (that I know of):

        Addresses can sensibly be represented as numbers.

        All object pointers have the same size.

        A null pointer is represented as all-bits-zero.

        If you attempt to dereference a null pointer, you're attempting to
        access memory at address 0. The results of this attempt are
        machine-specific; most likely it will either fail horribly or
        actually access whatever happens to be stored at address 0 (the
        latter might cause further bad things to happen). A C
        implementation must avoid storing any C-visible object at address
        0.

        Conversion between a pointer type and an integer type of the same
        size, or between two pointer types, just reinterprets the bits;
        there's no change of representation.

    *None* of this is guaranteed, and there are (or have been, or perhaps
    will be) real-world implementations that violate one or more of these
    assumptions.

    Conversion of an integer constant expression with the value 0 to a
    pointer type is guaranteed to yield a null pointer value (which may or
    may not be all-bits-zero). Conversion of a non-constant integer
    expression with the value 0 yields some implementation-defined pointer
    value, possibly a trap representation; this may or may not be
    all-bits-zero (i.e., the conversion might be non-trivial), and it
    might or might not be a null pointer value. In other words, the
    following:

        A null pointer value;

        The result of converting a non-constant value 0 to a pointer type; and

        A pointer whose representation is all-bits-zero

    could possibly be three distinct pointer values.

    (It's been argued that converting a non-constant value 0 to a pointer
    type must yield the same result as converting a constant value 0 to a
    pointer type, i.e., a null pointer. If that's the case, the three
    cases above can only yield at most two distinct pointer values. I
    disagree, but once you're writing code that cares one way or the
    other, you're well beyond what's guaranteed by the standard anyway.)
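
    The three cases can be probed on a particular implementation with
    something like the sketch below. All names here are made up for
    illustration, and intptr_t is an optional type that is assumed to
    exist. Only the first bit of the result is guaranteed by the
    standard; the other two merely report what this implementation
    happens to do, and reading the zero-filled pointer assumes that
    all-bits-zero is not a trap representation.

```c
#include <stddef.h>   /* NULL */
#include <stdint.h>   /* intptr_t (optional in C99, but widely available) */
#include <string.h>   /* memset */

/* Hypothetical flags: which of the three pointers compared equal to NULL. */
enum {
    FROM_CONSTANT_IS_NULL  = 1,   /* guaranteed by the standard */
    FROM_VARIABLE_IS_NULL  = 2,   /* implementation-defined */
    FROM_ZERO_BITS_IS_NULL = 4    /* not guaranteed */
};

unsigned null_pointer_survey(void)
{
    volatile int zero = 0;                          /* non-constant value 0 */
    char *from_constant = (char *)0;                /* null pointer constant */
    char *from_variable = (char *)(intptr_t)zero;   /* run-time conversion */
    char *from_zero_bits;
    unsigned result = 0;

    /* Force an all-bits-zero representation, bypassing any conversion. */
    memset(&from_zero_bits, 0, sizeof from_zero_bits);

    if (from_constant == NULL)  result |= FROM_CONSTANT_IS_NULL;
    if (from_variable == NULL)  result |= FROM_VARIABLE_IS_NULL;
    if (from_zero_bits == NULL) result |= FROM_ZERO_BITS_IS_NULL;
    return result;
}
```

    On the common implementations described earlier, all three flags
    would be expected to be set, but portable code can rely only on
    FROM_CONSTANT_IS_NULL.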

    If you really need to access memory at "address 0", assuming that's
    meaningful, you need to do something very low-level and
    system-specific. Question 5.19 in the FAQ provides several plausible
    (but blatantly non-portable) suggestions for how to do this.

    --
    Keith Thompson (The_Other_Keith) <http://www.ghoti.net/~kst>
    Looking for software development work in the San Diego area.
    "We must do something. This is something. Therefore, we must do this."
    -- Antony Jay and Jonathan Lynn, "Yes Minister"
     
    Keith Thompson, Nov 10, 2007
    #18
  19. James Kuyper Guest

    Keith Thompson wrote:
    > James Kuyper <> writes:

    ....
    >> that piece of memory might or might not start at address 0. Just
    >> because you created the pointer by using (char*)0 doesn't guarantee
    >> anything.

    >
    > It guarantees that it's a null pointer.


    True - I should have said "doesn't guarantee anything about the address".
     
    James Kuyper, Nov 10, 2007
    #19