Byte Address Arithmetic Debate

Discussion in 'C++' started by Frederick Gotham, Nov 19, 2006.

  1. There is a thread currently active on this newsgroup entitled:

    "how to calculate the difference between 2 addresses ?"

    The thread deals with calculating the distance, in bytes, between two
    memory addresses. Obviously, this can only be done if the addresses refer
    to elements or members of the same object (or base objects, etc.).

    John Carson and I proposed two separate methods.

    I disagree with John's solution, and John disagrees with mine. Therefore,
    I'd like to present them both here and see what the audience thinks.

    Firstly, we shall start off with a simple POD type:

    struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
    };

    Given an object of this type, we shall calculate the distance, in bytes,
    between the "b" member and the "e" member.

    My own method is as follows:

    reinterpret_cast<char const volatile*>(&obj.e)
    - reinterpret_cast<char const volatile*>(&obj.b)

    John's method is as follows:

    reinterpret_cast<long unsigned>(&obj.e)
    - reinterpret_cast<long unsigned>(&obj.b);

    In defence of my own method:

    (1) Any byte address can be accurately stored in a char*.

    In attack of John's method:

    (1) The Standard doesn't necessitate the existence of an integer type
    large enough to accommodate a memory address.
    (2) Even if such a type exists, the subtraction need not yield the
    correct answer (e.g. if each integer 1 represents half a byte, or a quarter
    of a byte).
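
For comparison, the two approaches can be put side by side. This is only a minimal sketch, assuming a C++11 compiler; the helper names `distance_via_char` and `distance_via_integer` are hypothetical, and `std::uintptr_t` (an optional type) stands in for the integer type in John's version:

```cpp
#include <cstddef>
#include <cstdint>

struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
};

// Frederick's approach: view both addresses as pointers to char and subtract.
inline std::ptrdiff_t distance_via_char(const MyPOD &obj)
{
    return reinterpret_cast<const volatile char *>(&obj.e)
         - reinterpret_cast<const volatile char *>(&obj.b);
}

// John's approach: convert both addresses to integers and subtract.
// The pointer-to-integer mapping is implementation-defined, and
// std::uintptr_t is not guaranteed to exist on every implementation.
inline std::uintptr_t distance_via_integer(const MyPOD &obj)
{
    return reinterpret_cast<std::uintptr_t>(&obj.e)
         - reinterpret_cast<std::uintptr_t>(&obj.b);
}
```

On a flat, byte-addressed machine the two functions would be expected to return the same number; the debate in this thread is about what the Standard promises beyond such machines.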

    Of course, seeing as how _I_ started this thread, it may be a little biased
    toward my own ends, but I hope we get to the bottom of this objectively.

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 19, 2006
    #1

  2. Frederick Gotham

    David Harmon Guest

    On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham
    <> wrote,
    >Given an object of this type, we shall calculate the distance, in bytes,
    >between the "b" member and the "e" member.


    #include <cstddef>
    offsetof(MyPOD, e) - offsetof(MyPOD, b)
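
Spelled out as a complete fragment, the suggestion might look like the sketch below. The constant name `b_to_e` is hypothetical. Note that `offsetof` yields `std::size_t`, so the subtraction would wrap around if the second member were declared before the first:

```cpp
#include <cstddef>

struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
};

// Byte distance from member b to member e. No MyPOD object is needed;
// both offsets are compile-time constants of type std::size_t.
const std::size_t b_to_e = offsetof(MyPOD, e) - offsetof(MyPOD, b);
```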
     
    David Harmon, Nov 19, 2006
    #2

  3. David Harmon:

    > On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham
    ><> wrote,
    >>Given an object of this type, we shall calculate the distance, in bytes,
    >>between the "b" member and the "e" member.

    >
    > #include <cstddef>
    > offsetof(MyPOD, e) - offsetof(MyPOD, b)



    I'll rephrase the question:

    Given two memory addresses in the form of pointers -- pointer types which
    may be different -- calculate the distance in bytes between them. The
    pointers refer to parts of the same object.

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 19, 2006
    #3
  4. Frederick Gotham

    Salt_Peter Guest

    Frederick Gotham wrote:
    > David Harmon:
    >
    > > On Sun, 19 Nov 2006 20:05:11 GMT in comp.lang.c++, Frederick Gotham
    > ><> wrote,
    > >>Given an object of this type, we shall calculate the distance, in bytes,
    > >>between the "b" member and the "e" member.

    > >
    > > #include <cstddef>
    > > offsetof(MyPOD, e) - offsetof(MyPOD, b)

    >
    >
    > I'll rephrase the question:
    >
    > Given two memory addresses in the form of pointers -- pointer types which
    > may be different -- calculate the distance in bytes between them. The
    > pointers refer to parts of the same object.
    >
    > --
    >
    > Frederick Gotham


    Not that I'm trying deliberately to be a pain in the attic, but what do
    you mean by between them?
    That's not the same as offset.

    struct test
    {
    int n;
    int i;
    };

    The distance in bytes between a test instance.n and instance.i would be
    zero assuming no padding is involved. Remember: To assume == makes an
    ASS out of U and ME.
     
    Salt_Peter, Nov 19, 2006
    #4
  5. Salt_Peter:

    > Not that I'm trying deliberately to be a pain in the attic, but what do
    > you mean by between them?



    Let's say that a certain object is located at memory address 14.

    Let's say that another object is located at memory address 18.

    The distance between them is 4.


    > That's not the same as offset.
    >
    > struct test
    > {
    > int n;
    > int i;
    > };
    >
    > The distance in bytes between a test instance.n and instance.i would be
    > zero assuming no padding is involved.



    We're just looking for the amount of bytes between two addresses.

    Let's say that &obj.n == Memory Byte Address 56
    Let's say that &obj.i == Memory Byte Address 60

    Therefore, the distance between them is 4 bytes.
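
The two readings of "between" can be told apart in code. A minimal sketch, with hypothetical function names: `start_to_start` is the address difference meant here, while `gap` is the padding-only reading from Salt_Peter's post:

```cpp
#include <cstddef>

struct test {
    int n;
    int i;
};

// Difference between the starting addresses of n and i.
inline std::size_t start_to_start()
{
    return offsetof(test, i) - offsetof(test, n);
}

// Bytes lying strictly between the end of n and the start of i,
// i.e. padding only. Zero when the compiler inserts no padding.
inline std::size_t gap()
{
    return start_to_start() - sizeof(int);
}
```

With no padding, `start_to_start()` equals `sizeof(int)` and `gap()` is zero, which matches both posters' numbers.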


    > Remember: To assume == makes an
    > ASS out of U and ME.


    Should I understand that somehow?

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 19, 2006
    #5
  6. Frederick Gotham

    Greg Guest

    Frederick Gotham wrote:
    > Salt_Peter:
    >
    > > Not that I'm trying deliberately to be a pain in the attic, but what do
    > > you mean by between them?

    >
    >
    > Let's say that a certain object is located at memory address 14.
    >
    > Let's say that another object is located at memory address 18.
    >
    > The distance between them is 4.
    >
    >
    > > That's not the same as offset.
    > >
    > > struct test
    > > {
    > > int n;
    > > int i;
    > > };
    > >
    > > The distance in bytes between a test instance.n and instance.i would be
    > > zero assuming no padding is involved.

    >
    >
    > We're just looking for the amount of bytes between two addresses.
    >
    > Let's say that &obj.n == Memory Byte Address 56
    > Let's say that &obj.i == Memory Byte Address 60
    >
    > Therefore, the distance between them is 4 bytes.


    There is no guarantee that converting a pointer to an integer value
    will produce the logical address of the referenced object. So neither
    of the two approaches is certain to be portable. In fact, the only
    portable approach available is to use the offsetof macro - either to
    calculate the distance between the start of a POD object and one of its
    members, or between any two members of the same object:

    std::abs(static_cast<std::ptrdiff_t>(offsetof(MyPOD, e))
             - static_cast<std::ptrdiff_t>(offsetof(MyPOD, b)));

    Greg
     
    Greg, Nov 20, 2006
    #6
  7. Greg:

    > There is no guarantee that converting a pointer to an integer value
    > will produce the logical address of the referenced object. So neither
    > of the two approaches is certain to be portable.



    My claim is that the char* method is perfect.

    #include <cstddef>

    template<class A,class B>
    std::ptrdiff_t BytesBetween(A const &a,B const &b)
    {
    return reinterpret_cast<char const volatile*>(&b)
    - reinterpret_cast<char const volatile*>(&a);
    }

    Of course, both "a" and "b" must refer to parts of the same object.
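
A short usage sketch of the template may help. It assumes the two arguments are elements of one array, and the function name `three_doubles_apart` is hypothetical. The point it illustrates is that the result counts bytes rather than elements:

```cpp
#include <cstddef>

template<class A, class B>
std::ptrdiff_t BytesBetween(A const &a, B const &b)
{
    return reinterpret_cast<char const volatile *>(&b)
         - reinterpret_cast<char const volatile *>(&a);
}

// Elements 0 and 3 of the same array are three elements apart, so the
// byte distance is expected to be 3 * sizeof(double).
inline std::ptrdiff_t three_doubles_apart()
{
    double arr[4] = {0.0, 1.0, 2.0, 3.0};
    return BytesBetween(arr[0], arr[3]);
}
```

Swapping the arguments gives the negated value, since the function subtracts the address of its first argument from that of its second.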

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 20, 2006
    #7
  8. Frederick Gotham

    John Carson Guest

    "Frederick Gotham" <> wrote in message
    news:XN28h.16163$
    > There is a thread currently active on this newsgroup entitled:
    >
    > "how to calculate the difference between 2 addresses ?"
    >
    > The thread deals with calculating the distance, in bytes, between two
    > memory addresses. Obviously, this can only be done if the addresses
    > refer to elements or members of the same object (or base objects,
    > etc.).
    >
    > John Carson and I proposed two separate methods.
    >
    > I disagree with John's solution, and John disagrees with mine.
    > Therefore, I'd like to present them both here and see what the
    > audience thinks.


    Just to be clear: I don't claim my approach is more correct than yours. I
    think they both involve implementation-defined behavior according to the
    Standard. Both will usually work in practice. My preference for converting
    to an integer is more of an aesthetic one. The aesthetics may differ
    depending on the exact nature of the problem.

    > Firstly, we shall start off with a simple POD type:
    >
    > struct MyPOD {
    > int a;
    > double b;
    > void *c;
    > short d;
    > bool e;
    > int f;
    > };
    >
    > Given an object of this type, we shall calculate the distance, in
    > bytes, between the "b" member and the "e" member.
    >
    > My own method is as follows:
    >
    > reinterpret_cast<char const volatile*>(&obj.e)
    > - reinterpret_cast<char const volatile*>(&obj.b)
    >
    > John's method is as follows:
    >
    > reinterpret_cast<long unsigned>(&obj.e)
    > - reinterpret_cast<long unsigned>(&obj.b);


    I wish to cast it to a pointer-sized integer. This is not synonymous with
    long unsigned. Indeed on Win64, long unsigned is smaller than pointer-sized
    (crazy, I know), but a pointer-sized integer nevertheless exists.
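
One way to express "pointer-sized integer" in later code is `std::uintptr_t` from `<cstdint>`. The sketch below assumes C++11 and an implementation that provides that optional type; both function names are hypothetical:

```cpp
#include <cstdint>

// True when unsigned long is at least as large as a data pointer.
// Expected to be false on 64-bit Windows and true on typical 64-bit Unix.
inline bool unsigned_long_holds_pointer()
{
    return sizeof(unsigned long) >= sizeof(void *);
}

// Where std::uintptr_t exists, a void* converted to it and back
// compares equal to the original pointer.
inline bool round_trips(void *p)
{
    std::uintptr_t n = reinterpret_cast<std::uintptr_t>(p);
    return reinterpret_cast<void *>(n) == p;
}
```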

    > In defence of my own method:
    >
    > (1) Any byte address can be accurately stored in a char*.


    Any pointer can be cast to char*. However, by Section 5.2.10/3:

    "The mapping performed by reinterpret_cast is implementation-defined. [Note:
    it might, or might not, produce a representation different from the original
    value. ]"

    This applies equally to my method.

    > In attack of John's method:
    >
    > (1) The Standard doesn't necessitate the existence of an integer
    > type large enough to accommodate a memory address.


    True, but not an issue on most platforms.

    > (2) Even if such a type exists, the subtraction need not yield the
    > correct answer (e.g. if each integer 1 represents half a byte, or a
    > quarter of a byte).


    If your cast can produce "a representation different from the original
    value", I don't see that it offers an advantage. Moreover, Section 5.2.10/4
    says that the conversion to an integer value "is intended to be unsurprising
    to those who know the addressing structure of the underlying machine", which
    provides an assurance of sorts for my preferred approach.

    Finally, I point out that the Standard doesn't guarantee an integer type
    large enough to store the result of the subtraction (See Section 5.7/6).
    Once again, both approaches rely on an implementation-defined feature (or on
    the choice of suitable addresses to compare).


    --
    John Carson
     
    John Carson, Nov 20, 2006
    #8
  9. Frederick Gotham

    Greg Guest

    Frederick Gotham wrote:
    > Greg:
    >
    > > There is no guarantee that converting a pointer to an integer value
    > > will produce the logical address of the referenced object. So neither
    > > of the two approaches is certain to be portable.

    >
    >
    > My claim is that the char* method is perfect.
    >
    > #include <cstddef>
    >
    > template<class A,class B>
    > std::ptrdiff_t BytesBetween(A const &a,B const &b)
    > {
    > return reinterpret_cast<char const volatile*>(&b)
    > - reinterpret_cast<char const volatile*>(&a);
    > }
    >
    > Of course, both "a" and "b" must refer to parts of the same object.


    In order to subtract pointer a from pointer b, both a and b must point
    to the same kind of object and the objects that they point to, must
    both be members of the same array. Since the BytesBetween() function
    template observes neither of these requirements, there is no guarantee
    that its behavior will be defined.

    "Unless both pointers point to elements of the same array object, or
    one past the last element of the array object, the behavior is
    undefined." [§5.7/7]

    C++ would not need the offsetof macro if there were another, portable
    way to calculate the distance between two members of an object.

    Greg
     
    Greg, Nov 20, 2006
    #9
  10. Frederick Gotham

    Kai-Uwe Bux Guest

    Greg wrote:

    > C++ would not need the offsetof macro if there were another, portable
    > way to calculate the distance between two members of an object.


    That seems incorrect: the difficulty with the offsetof macro is the need for
    compile-time evaluation. That makes it impossible to create an instance of
    the struct type and measure offsets of its members. Thus, even if you had a
    perfectly fine method of computing distances of members of an object, it
    would not help in writing an offsetof macro.
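
The compile-time requirement can be illustrated with a small sketch, assuming C++11 for `static_assert`; the typedef name `PrefixBuffer` is hypothetical. Neither use below could be written with a run-time measurement taken on an actual object:

```cpp
#include <cstddef>

struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
};

// offsetof produces an integral constant expression, so it can be
// checked at compile time without any MyPOD object existing.
static_assert(offsetof(MyPOD, b) >= sizeof(int), "b is placed after a");

// It can also serve as an array bound, which must be a constant.
typedef char PrefixBuffer[offsetof(MyPOD, e)];
```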


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Nov 20, 2006
    #10
  11. Frederick Gotham

    David Harmon Guest

    On Sun, 19 Nov 2006 20:58:34 GMT in comp.lang.c++, Frederick Gotham
    <> wrote,

    >I'll rephrase the question:


    I'll still dodge it.
    Eschew undefined behavior.
    Cast not thy pointers into the void.
     
    David Harmon, Nov 20, 2006
    #11
  12. Frederick Gotham

    Greg Guest

    Kai-Uwe Bux wrote:
    > Greg wrote:
    >
    > > C++ would not need the offsetof macro if there were another, portable
    > > way to calculate the distance between two members of an object.

    >
    > That seems incorrect: the difficulty with the offsetof macro is the need for
    > compile-time evaluation. That makes it impossible to create an instance of
    > the struct type and measure offsets of its members. Thus, even if you had a
    > perfectly fine method of computing distances of members of an object, it
    > would not help in writing an offsetof macro.


    Counting the number of bytes from the start of an object to one of its
    members is not the only way to express the distance. But since the
    requirement in this case is to provide a byte measurement of the
    distance - the offsetof macro is the only portable way to obtain that
    figure.

    Requiring that the offset of a class member be expressed in bytes is of
    course a completely artificial constraint - no C++ program would ever
    face such a limitation. After all, no program calls offsetof simply to
    obtain a number. Instead the number that offsetof returns is useful
    only insofar as the program can use that value to gain access to the
    specified class member given a pointer to a class object.

    In C++, member access through an object pointer is already possible by
    applying a member pointer to the object pointer. A member pointer
    essentially abstracts the offset of a class member, and hides the
    implementation details from the C++ program. So although a C++ program
    cannot recover the byte distance of the offset that is stored within a
    member pointer - a member pointer is still more useful than the
    offsetof macro since a member pointer is not limited to members of POD
    classes only.
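
The contrast between the two access styles can be sketched as follows. The function names are hypothetical, and the offset-based version is the traditional idiom rather than something the Standard blesses in every detail:

```cpp
#include <cstddef>

struct MyPOD {
    int a;
    double b;
    void *c;
    short d;
    bool e;
    int f;
};

// Member pointer: names "the bool member to read" without exposing
// any byte offset to the program.
inline bool read_flag(const MyPOD *p, bool MyPOD::*member)
{
    return p->*member;
}

// offsetof: the same access done with explicit byte arithmetic.
// Only meaningful for POD types such as MyPOD.
inline bool read_flag_by_offset(const MyPOD *p)
{
    const char *base = reinterpret_cast<const char *>(p);
    return *reinterpret_cast<const bool *>(base + offsetof(MyPOD, e));
}
```

A call such as `read_flag(&obj, &MyPOD::e)` selects the member at the call site, whereas the offset version hard-codes both the member and its type.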

    Greg
     
    Greg, Nov 20, 2006
    #12
  13. John Carson:

    > I think they both involve implementation-defined behavior according to
    > the Standard. Both will usually work in practice.



    My own claim is that _my_ code is perfectly fine. I also claim that your
    code is not OK, even though I acknowledge it would work on a lot of
    systems.

    I could imagine a system which doesn't have 8-Bit bytes, but which has a
    layer between the machine and the C implementation that makes you think
    there are 8-Bit bytes. Let's say that the machine actually has 4-Bit bytes.
    When you cast to integer type and subtract, your result might be double
    what you thought it would be.


    > Any pointer can be cast to char*. However, by Section 5.2.10/3:
    >
    > "The mapping performed by reinterpret_cast is implementation-defined.
    > [Note: it might, or might not, produce a representation different from
    > the original value. ]"



    There are several exceptions to the whole "reinterpret_cast is a wild
    animal" idea. Casting to char* or void* is one of them. Another would be
    casting from a POD pointer to a pointer to the first member in the POD.


    >> (1) The Standard doesn't necessitate the existence of an integer
    >> type large enough to accommodate a memory address.

    >
    > True, but not an issue on most platforms.



    On every platform though, the char* subtraction will work.


    > Moreover, Section
    > 5.2.10/4 says that the conversion to an integer value "is intended to be
    > unsurprising to those who know the addressing structure of the
    > underlying machine", which provides an assurance of sorts for my
    > preferred approach.



    What if we're working with the 4-Bit system disguised as an 8-Bit system?


    > Finally, I point out that the Standard doesn't guarantee an integer type
    > large enough to store the result of the subtraction (See Section 5.7/6).
    > Once again, both approaches rely on an implementation-defined feature
    > (or on the choice of suitable addresses to compare).



    Are you sure about that? The purpose of ptrdiff_t is to store the result of
    subtracting two pointers. Presumably, if the subtraction of the pointers is
    valid, then the type should be able to hold the value.

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 20, 2006
    #13
  14. Frederick Gotham:

    > There are several exceptions to the whole "reinterpret_cast is a wild
    > animal" idea. Casting to char* or void* is one of them. Another would be
    > casting from a POD pointer to a pointer to the first member in the POD.



    In the past, I've seen people so fearful of reinterpret_cast that they write:

    char *p = static_cast<char*>(static_cast<void*>(&obj));

    I myself just write:

    char *p = (char*)&obj;
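
The three spellings can be checked against each other. A minimal sketch, using a hypothetical type `Obj` and helper `casts_agree`; since C++11 the `reinterpret_cast` form is specified in terms of the two-step `static_cast` form for object pointers like this one, whereas the C++03 text discussed in this thread left the mapping less tightly specified:

```cpp
struct Obj {  // hypothetical type, for illustration only
    int x;
    double y;
};

// Returns true if all three casts yield the same char pointer.
inline bool casts_agree(Obj &obj)
{
    char *p1 = static_cast<char *>(static_cast<void *>(&obj));
    char *p2 = reinterpret_cast<char *>(&obj);
    char *p3 = (char *)&obj;  // C-style cast; acts as reinterpret_cast here
    return p1 == p2 && p2 == p3;
}
```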

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 20, 2006
    #14
  15. Frederick Gotham

    John Carson Guest

    "Frederick Gotham" <> wrote in message
    news:nid8h.16179$
    > John Carson:
    >
    > I could imagine a system which doesn't have 8-Bit bytes, but which
    > has a layer between the machine and the C implementation that makes
    > you think there are 8-Bit bytes. Let's say that the machine actually
    > has 4-Bit bytes. When you cast to integer type and subtract, your
    > result might be double what you thought it would be.


    That would depend on the implementation.

    > There are several exceptions to the whole "reinterpret_cast is a wild
    > animal" idea. Casting to char* or void* is one of them. Another would
    > be casting from a POD pointer to a pointer to the first member in the
    > POD.


    The effect of reinterpret_cast on a POD pointer is specified in the Standard
    (section 9.2/17). The others are not as far as I am aware.

    >>> (1) The Standard doesn't necessitate the existence of an integer
    >>> type large enough to accommodate a memory address.

    >>
    >> True, but not an issue on most platforms.

    >
    > On every platform though, the char* subtraction will work.


    The char* cast will work. The subtraction isn't guaranteed.

    >> Moreover, Section
    >> 5.2.10/4 says that the conversion to an integer value "is intended
    >> to be unsurprising to those who know the addressing structure of the
    >> underlying machine", which provides an assurance of sorts for my
    >> preferred approach.

    >
    > What if we're working with the 4-Bit system disguised as an 8-Bit
    > system?


    I don't know, but the implementation should say what would happen.

    >> Finally, I point out that the Standard doesn't guarantee an integer
    >> type large enough to store the result of the subtraction (See
    >> Section 5.7/6). Once again, both approaches rely on an
    >> implementation-defined feature (or on the choice of suitable
    >> addresses to compare).

    >
    > Are you sure about that? The purpose of ptrdiff_t is to store the
    > result of subtracting two pointers. Presumably, if the subtraction of
    > the pointers is valid, then the type should be able to hold the value.


    I can only go by the Standard, which I have already quoted in the previous
    thread. The result of such a subtraction is a signed type and as such has a
    maximum absolute value only half the size of the largest value supported by
    the corresponding unsigned type. If addresses can have any value covered by
    the unsigned type, this creates the possibility of overflow.
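
The range mismatch described here can be made concrete. The sketch below uses `std::size_t` as a stand-in for "the corresponding unsigned type", which is an assumption rather than something the Standard states; the helper name `fits_in_ptrdiff` is hypothetical:

```cpp
#include <cstddef>
#include <limits>

// True when a byte count is representable as a std::ptrdiff_t.
// Counts above the maximum of ptrdiff_t cannot be the result of a
// well-defined pointer subtraction.
inline bool fits_in_ptrdiff(std::size_t byte_count)
{
    return byte_count
        <= static_cast<std::size_t>(std::numeric_limits<std::ptrdiff_t>::max());
}
```

Where the two types have the same width, roughly half of the `std::size_t` range fails this check, which is the overflow possibility being pointed out.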

    --
    John Carson
     
    John Carson, Nov 20, 2006
    #15
  16. John Carson:

    (Referring to pointer arithmetic)

    > The result of such a subtraction is a signed type and
    > as such has a maximum absolute value only half the size of the largest
    > value supported by the corresponding unsigned type. If addresses can
    > have any value covered by the unsigned type, this creates the
    > possibility of overflow.



    I think though that this argument can be countered by a combination of the
    following excerpts from the Standard.

    3.9/2
    For any object (other than a base-class subobject) of POD type T, whether
    or not the object holds a valid value of type T, the underlying bytes (1.7)
    making up the object can be copied into an array of char or unsigned
    char. If the content of the array of char or unsigned char is copied
    back into the object, the object shall subsequently hold its original
    value.

    Therefore, we can do the following:

    double arr[64] = { ... };

    char unsigned buf[sizeof arr];

    memcpy(buf,arr,sizeof buf);

    The array object, "buf", is a fully-fledged object type.
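
A runnable version of that round trip might look like the sketch below. The array contents and the function name `round_trip_preserves_values` are made up for illustration, and the bytes are copied into a second array so that the comparison is meaningful:

```cpp
#include <cstring>

// Copies a POD array into unsigned char storage and back again,
// then checks that every element kept its value.
inline bool round_trip_preserves_values()
{
    double arr[4] = {1.5, -2.0, 0.0, 42.25};

    unsigned char buf[sizeof arr];
    std::memcpy(buf, arr, sizeof buf);

    double copy[4];
    std::memcpy(copy, buf, sizeof copy);

    for (int k = 0; k < 4; ++k) {
        if (copy[k] != arr[k]) {
            return false;
        }
    }
    return true;
}
```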

    Now let's read about ptrdiff_t:

    5.7/6
    When two pointers to elements of the same array object are subtracted, the
    result is the difference of the subscripts of the two array elements. The
    type of the result is an implementation-defined signed integral type; this
    type shall be the same type that is defined as ptrdiff_t in the <cstddef>
    header (18.1). As with any other arithmetic overflow, if the result does
    not fit in the space provided, the behavior is undefined. In other words,
    if the expressions P and Q point to, respectively, the i-th and j-th
    elements of an array object, the expression (P)-(Q) has the value i–j
    provided the value fits in an object of type ptrdiff_t.

    I'm glad to see we're agreed that the casting to char* is OK. What I find
    annoying though is the situation with ptrdiff_t... I'm going to take this
    over to comp.std.c++.

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 20, 2006
    #16
  17. Frederick Gotham

    Steve Pope Guest

    Frederick Gotham <> wrote:

    > I'll rephrase the question:


    > Given two memory addresses in the form of pointers -- pointer types which
    > may be different -- calculate the distance in bytes between them. The
    > pointers refer to parts of the same object.


    You can't. You can only subtract pointers if they are pointing
    to the same type of object, and then only if the pointed-to
    objects are elements of the same array of such objects.

    And even then, you will not necessarily get the distance in bytes.

    Just my opinion.

    Steve
     
    Steve Pope, Nov 20, 2006
    #17
  18. Steve Pope:

    >> Given two memory addresses in the form of pointers -- pointer types which
    >> may be different -- calculate the distance in bytes between them. The
    >> pointers refer to parts of the same object.

    >
    > You can't. You can only subtract pointers if they are pointing
    > to the same type of object, and then only if the pointed-to
    > objects are elements of the same array of such objects.
    >
    > And even then, you will not necessarily get the distance in bytes.
    >
    > Just my opinion.



    I don't see why there would be anything wrong with the following:

    struct SomePOD {
    int a;
    char b;
    int arr[5];
    };

    struct Base {
    double a;
    SomePOD b;
    void *c;
    };

    struct Derived : Base {
    double d;
    Base e;
    };

    #include <cstddef>

    template<class A,class B>
    std::ptrdiff_t BytesBtwn(A const *const p,B const *const q)
    {
    return (char const volatile*)q - (char const volatile*)p;
    }

    int main()
    {
    Derived const volatile obj = Derived();

    std::ptrdiff_t const i = BytesBtwn(obj.b.arr+2,&obj.e.b.a);
    }

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 20, 2006
    #18
  19. Frederick Gotham

    Steve Pope Guest

    Frederick Gotham <> wrote:

    >Steve Pope:


    >> You can only subtract pointers if they are pointing
    >> to the same type of object, and then only if the pointed-to
    >> objects are elements of the same array of such objects.


    >> And even then, you will not necessarily get the distance in bytes.


    >> Just my opinion.


    >I don't see why there would be anything wrong with the following:
    >
    >struct SomePOD {
    > int a;
    > char b;
    > int arr[5];
    >};
    >
    >struct Base {
    > double a;
    > SomePOD b;
    > void *c;
    >};
    >
    >struct Derived : Base {
    > double d;
    > Base e;
    >};
    >
    >#include <cstddef>
    >
    >template<class A,class B>
    >std::ptrdiff_t BytesBtwn(A const *const p,B const *const q)
    >{
    > return (char const volatile*)q - (char const volatile*)p;
    >}
    >
    >int main()
    >{
    > Derived const volatile obj = Derived();
    >
    > std::ptrdiff_t const i = BytesBtwn(obj.b.arr+2,&obj.e.b.a);
    >}


    This would not give the difference in bytes on architectures
    for which the address of an int is a word address.

    (Now, I admit not having seen such an architecture for 20
    years or so, but they may still be around.)

    Steve
     
    Steve Pope, Nov 20, 2006
    #19
  20. Steve Pope:

    > This would not give the difference in bytes on architectures
    > for which the address of an int is a word address.



    Sorry I don't understand, could you please explain that?

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 20, 2006
    #20