type safety and reinterpret_cast<>

Discussion in 'C++' started by Noah Roberts, Oct 30, 2006.

  1. Noah Roberts

    Noah Roberts Guest

    What steps do people take to make sure that when dealing with C API
    callback functions that you do the appropriate reinterpret_cast<>? For
    instance, today I ran into a situation in which the wrong type was the
    target of a cast. Of course with a reinterpret_cast nothing complains
    until the UB bites you in the ass. It seems to me that there ought to
    be a way to deal with these kinds of functions yet still retain some
    semblance of type safety. Perhaps either any or variant from boost
    would help?

    What do you guys do to keep from stabbing your own foot in these
    situations?
     
    Noah Roberts, Oct 30, 2006
    #1

  2. Noah Roberts

    Jim Langston Guest

    "Noah Roberts" <> wrote in message
    news:...
    > What steps do people take to make sure that when dealing with C API
    > callback functions that you do the appropriate reinterpret_cast<>? For
    > instance, today I ran into a situation in which the wrong type was the
    > target of a cast. Of course with a reinterpret_cast nothing complains
    > until the UB bites you in the ass. It seems to me that there ought to
    > be a way to deal with these kinds of functions yet still retain some
    > semblance of type safety. Perhaps either any or variant from boost
    > would help?
    >
    > What do you guys do to keep from stabbing your own foot in these
    > situations?


    If you use reinterpret_cast you'd better know what the heck you're doing, or
    don't do it. reinterpret_cast is the most dangerous cast and should be used
    only when absolutely necessary. Use static_cast if possible, which is a bit
    safer.

    Other than that, if you use it wrong you shoot yourself in the foot.
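
    A minimal sketch of the static_cast route for a void* callback argument.
    The C-style API (c_api_invoke) and the Counter type are hypothetical names
    made up for illustration. Note that static_cast from void* is no more
    checked than reinterpret_cast; it only documents that the pointer went in
    as exactly that type.

```cpp
#include <cassert>

// Hypothetical C-style API (not from the thread): calls cb(user_data) once.
typedef void (*c_callback)(void* user_data);

void c_api_invoke(c_callback cb, void* user_data) { cb(user_data); }

struct Counter { int hits; };

// The one place where void* turns back into a typed pointer.
// static_cast suffices because the pointer started life as a Counter*.
void on_event(void* user_data)
{
    Counter* c = static_cast<Counter*>(user_data);
    ++c->hits;
}

// Returns the hit count after firing the callback twice.
int demo_static_cast_callback()
{
    Counter c = { 0 };
    c_api_invoke(&on_event, &c);   // Counter* -> void* is implicit
    c_api_invoke(&on_event, &c);
    assert(c.hits == 2);
    return c.hits;
}
```

    Keeping the cast in one small function per callback makes it easier to
    review that the type going in matches the type coming out.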
     
    Jim Langston, Oct 30, 2006
    #2

  3. Noah Roberts wrote:
    > What steps do people take to make sure that when dealing with C API
    > callback functions that you do the appropriate reinterpret_cast<>? For
    > instance, today I ran into a situation in which the wrong type was the
    > target of a cast. Of course with a reinterpret_cast nothing complains
    > until the UB bites you in the ass. It seems to me that there ought to
    > be a way to deal with these kinds of functions yet still retain some
    > semblance of type safety. Perhaps either any or variant from boost
    > would help?
    >
    > What do you guys do to keep from stabbing your own foot in these
    > situations?


    The whole meaning of reinterpret_cast is "please compiler, get out of
    the way and do what I tell you to do, those bits you have, they're of
    this type and just do it".

    I usually relegate the reinterpret casts to very minimal usage where I
    can check visually, very easily, that I know what is happening.

    One thing you could do is use a single type that embodies some type
    safety whenever talking to the C api - perhaps use an "Any" class that
    contains the desired pointer. That way you only expect one type to come
    back and forth from the C call backs and you can verify that they
    contain the right type. How you manage their lifetime however is
    another story.
     
    Gianni Mariani, Oct 30, 2006
    #3
  4. Noah Roberts

    Noah Roberts Guest

    Jim Langston wrote:
    > "Noah Roberts" <> wrote in message
    > news:...
    > > What steps do people take to make sure that when dealing with C API
    > > callback functions that you do the appropriate reinterpret_cast<>? For
    > > instance, today I ran into a situation in which the wrong type was the
    > > target of a cast. Of course with a reinterpret_cast nothing complains
    > > until the UB bites you in the ass. It seems to me that there ought to
    > > be a way to deal with these kinds of functions yet still retain some
    > > semblance of type safety. Perhaps either any or variant from boost
    > > would help?
    > >
    > > What do you guys do to keep from stabbing your own foot in these
    > > situations?

    >
    > If you use reinterpret_cast you better know what the heck you're doing, or
    > don't do it. reinterpret_cast is the most dangerous cast and should be used
    > only when absolutly necessary. Use static_cast if possible which is a bit
    > mroe safer.


    Heh, really?
    >
    > Other that that, if you use it wrong you shoot yoru self in the foot.


    Very insightful.
     
    Noah Roberts, Oct 30, 2006
    #4
  5. Noah Roberts

    Noah Roberts Guest

    Gianni Mariani wrote:
    > Noah Roberts wrote:
    > > What steps do people take to make sure that when dealing with C API
    > > callback functions that you do the appropriate reinterpret_cast<>? For
    > > instance, today I ran into a situation in which the wrong type was the
    > > target of a cast. Of course with a reinterpret_cast nothing complains
    > > until the UB bites you in the ass. It seems to me that there ought to
    > > be a way to deal with these kinds of functions yet still retain some
    > > semblance of type safety. Perhaps either any or variant from boost
    > > would help?
    > >
    > > What do you guys do to keep from stabbing your own foot in these
    > > situations?

    >
    > The whole meaning of reinterpret_cast is "please compiler, get out of
    > the way and do what I tell you to do, those bits you have, they're of
    > this type and just do it".


    But sometimes it is a necessary evil and you would like to retain some
    safety net.
    >
    > I usually relegate the reinterpret casts to very minimal usage where I
    > can check visually very easily that I know is happening.


    Well, I _prefer_ to avoid it altogether.
    >
    > One thing you could do is use a single type that embodies some type
    > safety whenever talking to the C api - perhaps use an "Any" class that
    > contains the desired pointer. That way you only expect one type to come
    > back and forth from the C call backs and you can verify that they
    > contain the right type. How you manage their lifetime however is
    > another story.


    This appears like it will do the trick. You have to implement it as a
    coding standard but once so required and practiced you can't undo
    yourself.

    #include <iostream>
    #include <boost/any.hpp>

    class Test
    {
        int x;
    public:
        Test(int y) : x(y) {}
        int GetSerial() const { return x; }
    };

    class NotTest
    {
        double y;
    public:
        NotTest(double x) : y(x) {}
        double f() const { return y; }
    };

    void f_test(void * t)
    {
        boost::any * at = reinterpret_cast<boost::any*>(t);

        // attempt cast to Test
        Test * test = boost::any_cast<Test>(at);

        if (test)
            std::cout << test->GetSerial() << std::endl;
    }

    int main()
    {

        boost::any x;
        x = Test(50);
        f_test(&x); // output 50

        x = NotTest(5.09);
        f_test(&x); // no output

        int y; std::cin >> y;
        return 0;
    }

    I'm kind of new to using boost but this appears to be a good answer.
    You can use the type "boost::any*" as the only pointer type you allow
    to be passed through generic pointers. Then later if you change the
    inheritance tree on some object you will get a predictable error
    instead of undefined behavior should you miss such casts.

    Now the trick will be to get this to be used by the team and begin
    replacing current C-Style casts and misc. pointer passing with this
    more predictable setup.
     
    Noah Roberts, Oct 30, 2006
    #5
  6. Noah Roberts wrote:
    > Gianni Mariani wrote:
    >> Noah Roberts wrote:

    ....
    > I'm kind of new to using boost but this appears to be a good answer.
    > You can use the type "boost::any*" as the only pointer type you allow
    > to be passed through generic pointers. Then later if you change the
    > inheritance tree on some object you will get a predictable error
    > instead of undefined behavior should you miss such casts.
    >
    > Now the trick will be to get this to be used by the team and begin
    > replacing current C-Style casts and misc. pointer passing with this
    > more predictable setup.


    That's exactly what I proposed. Now you have an issue with how to manage
    the lifetime of the any object.

    If the callback is synchronous, you can use the exact same thing
    you're using - all is well.

    If the pointer is stored in "C" land and comes back at some later event,
    then you need to match the lifetime of the any object with the lifetime
    of the object it's pointing to.
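
    One way the lifetime matching could be sketched is a guard object that
    owns the wrapper and unregisters it on destruction. Everything here is an
    assumption for illustration: the one-slot C API (c_register, c_unregister,
    c_fire), the Handler type, and the small Tagged struct standing in for the
    "any" wrapper.

```cpp
#include <cassert>
#include <typeinfo>

// Hypothetical C API (an assumption, not from the thread): a single slot
// that stores a callback and its context until it is cleared.
typedef void (*c_callback)(void*);
static c_callback g_cb = 0;
static void* g_ctx = 0;
void c_register(c_callback cb, void* ctx) { g_cb = cb; g_ctx = ctx; }
void c_unregister() { g_cb = 0; g_ctx = 0; }
void c_fire() { if (g_cb) g_cb(g_ctx); }

struct Handler { int calls; };

// Stand-in for the "any" wrapper: a pointer plus the type it points to.
struct Tagged
{
    const std::type_info* type;
    void* object;
};

void trampoline(void* ctx)
{
    Tagged* t = static_cast<Tagged*>(ctx);
    if (*t->type == typeid(Handler))
        ++static_cast<Handler*>(t->object)->calls;
}

// RAII guard: owns the wrapper and unregisters before the wrapper dies.
class Registration
{
public:
    explicit Registration(Handler& h)
    {
        m_tag.type = &typeid(Handler);
        m_tag.object = &h;
        c_register(&trampoline, &m_tag);
    }
    ~Registration() { c_unregister(); }
private:
    Registration(const Registration&);            // not copyable
    Registration& operator=(const Registration&);
    Tagged m_tag;
};

// Returns how many callbacks reached the handler.
int demo_lifetime()
{
    Handler h = { 0 };
    {
        Registration r(h);
        c_fire();            // delivered while the guard is alive
    }
    c_fire();                // nothing registered any more
    assert(h.calls == 1);
    return h.calls;
}
```

    The guard must still be destroyed before the Handler it refers to;
    declaring it after the Handler in the same scope gives that order.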

    This can also be done without using the boost::any class - just have a
    base class that stores its typeid - that's all that
    boost::any_cast does, it checks that the typeid is equal.

    One way to do this is to inherit this monster ugly thing but the use
    case is quite nice.

    // caution - brain dump alert - all the code below is directly from
    // brain to you with no compile checks - useful as a demo

    #include <typeinfo>

    class C_CallbackBase
    {
        friend class CallBackCast; // the checked cast reads the members below

    protected:
        C_CallbackBase( const std::type_info & i_callback_type )
          : m_sentinel( 0xca11bac8 ),
            m_callback_type( i_callback_type )
        {
        }

        const unsigned m_sentinel;
        const std::type_info & m_callback_type;

        // make this assignable ...
        C_CallbackBase & operator=( const C_CallbackBase & )
        {
            // my derived type does not change when I am assigned ...!
            return *this;
        }

    private:
        // default copy constructor is not ok
        C_CallbackBase( const C_CallbackBase & ); // never called

    };


    template <typename DerivedType>
    class CallBackBase
      : public C_CallbackBase
    {
    public:
        CallBackBase()
          : C_CallbackBase( typeid( DerivedType ) )
        {
        }

        CallBackBase( const CallBackBase & )
          : C_CallbackBase( typeid( DerivedType ) )
        {
        }

        void * GetCallbackPtr()
        {
            return static_cast< void * >(
                static_cast<C_CallbackBase *>( this )
            );
        }

    };

    class CallBackCast
    {
        void * m_ptr;

    public:
        CallBackCast( void * ptr )
          : m_ptr( ptr )
        {
        }

        // conversion operator: the target type is deduced from the
        // pointer being initialized
        template <typename DerivedType>
        operator DerivedType * () const
        {
            C_CallbackBase * ptr =
                static_cast<C_CallbackBase *>( m_ptr );

            // this is UB if the cast is wrong but it
            // will probably do the right thing
            if ( ptr->m_sentinel != 0xca11bac8 )
            {
                throw "CALLBACK CLASS CORRUPT";
            }
            if ( ptr->m_callback_type == typeid( DerivedType ) )
            {
                return static_cast<DerivedType *>( ptr );
            }

            throw "TYPE MISMATCH FROM C CALLBACK";
        }
    };


    class APPCLASS
      : public CallBackBase<APPCLASS>
    {
    };

    void c_callback_func( void * cb )
    {
        APPCLASS * appptr = CallBackCast( cb );

        // ... do your thing with appptr
    }

    int main()
    {

        APPCLASS app;

        c_callback_func( app.GetCallbackPtr() );

    }


    Note that you can have two versions of this thing if performance is an
    issue, a debug version that checks (like this one) and one that has an
    empty base class and no checks are done.

    Note that you can't use boost::any as a member (or base class) because
    the copy and assignment make no sense. Note that C_CallbackBase has an
    empty assignment and copy construction is not allowed.
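
    The debug/release split could be sketched roughly as follows. The
    CALLBACK_CHECKS switch and the TagBase, Widget, Gadget, callback_cast and
    as_callback_ptr names are all hypothetical, and this variant returns a
    null pointer on mismatch instead of throwing.

```cpp
#include <cassert>
#include <typeinfo>

#ifndef CALLBACK_CHECKS
#define CALLBACK_CHECKS 1   // hypothetical switch; define as 0 for release
#endif

#if CALLBACK_CHECKS
struct TagBase
{
    explicit TagBase(const std::type_info& t) : type(&t) {}
    const std::type_info* type;
};
#else
struct TagBase
{
    explicit TagBase(const std::type_info&) {}   // stores nothing
};
#endif

struct Widget : TagBase
{
    Widget() : TagBase(typeid(Widget)), value(7) {}
    int value;
};

struct Gadget : TagBase
{
    Gadget() : TagBase(typeid(Gadget)) {}
};

// Always hand the C side a TagBase*, so the cast back is well defined.
void* as_callback_ptr(TagBase& b) { return &b; }

// Returns 0 on a detected mismatch; with checks off it trusts the caller.
template <typename T>
T* callback_cast(void* p)
{
    TagBase* base = static_cast<TagBase*>(p);
#if CALLBACK_CHECKS
    if (!(*base->type == typeid(T))) return 0;
#endif
    return static_cast<T*>(base);
}
```

    With the switch at 0 the base class is empty, so on typical compilers it
    should add no storage, but a wrong cast then goes undetected.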
     
    Gianni Mariani, Oct 31, 2006
    #6
  7. Gianni Mariani:

    > The whole meaning of reinterpret_cast is "please compiler, get out of
    > the way and do what I tell you to do, those bits you have, they're of
    > this type and just do it".



    That only happens when you cast to a reference type. Elsewhere, it performs a
    proper conversion:

    MyClass obj;

    char unsigned *p = reinterpret_cast<char unsigned*>(&obj);

    --

    Frederick Gotham
     
    Frederick Gotham, Oct 31, 2006
    #7
  8. Noah Roberts

    werasm Guest

    Gianni Mariani wrote:

    > class C_CallbackBase
    > {
    > protected:
    > C_CallbackBase( const typeinfo & i_callback_type )
    > : m_sentinel( 0xca11bac8 )
    > m_callback_type( i_callback_type )
    > {
    > }


    Do you have a specific way in which you select your sentinel, or did
    you use an arbitrary value? I have in the past used the this pointer
    for this. Is that viable?
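
    For reference, a self-pointer sentinel might look roughly like the sketch
    below. SelfChecked and looks_valid are hypothetical names, and reading the
    member through a wrongly typed pointer is still undefined behaviour, the
    same caveat that applies to a constant sentinel.

```cpp
#include <cassert>

// Sketch of a self-pointer sentinel: each object records its own address.
class SelfChecked
{
public:
    SelfChecked() : m_self(this) {}
    // A copy lives at a new address, so it must not copy the old one.
    SelfChecked(const SelfChecked&) : m_self(this) {}
    SelfChecked& operator=(const SelfChecked&) { return *this; }

    // True if p still looks like a live SelfChecked at that address.
    static bool looks_valid(const void* p)
    {
        const SelfChecked* s = static_cast<const SelfChecked*>(p);
        return s->m_self == s;
    }

private:
    const SelfChecked* m_self;
};
```

    Unlike a fixed constant, this value differs per object, so a byte-wise
    copy of the object to another address would fail the check.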

    Werner
     
    werasm, Oct 31, 2006
    #8
  9. Noah Roberts

    red floyd Guest

    werasm wrote:
    > Gianni Mariani wrote:
    >
    >> class C_CallbackBase
    >> {
    >> protected:
    >> C_CallbackBase( const typeinfo & i_callback_type )
    >> : m_sentinel( 0xca11bac8 )
    >> m_callback_type( i_callback_type )
    >> {
    >> }

    >
    > Do you have a specific way in which you select your sentinal, or did
    > you use an arbitrary value? I have in the past used the this pointer
    > for this. Is that viable?


    His sentinel is the word "callback" written as best as possible in hex
    digits.
     
    red floyd, Oct 31, 2006
    #9
  10. Noah Roberts

    Bart Guest

    Frederick Gotham wrote:
    > Gianni Mariani:
    >
    > > The whole meaning of reinterpret_cast is "please compiler, get out of
    > > the way and do what I tell you to do, those bits you have, they're of
    > > this type and just do it".

    >
    > That only happens when you cast to a reference type. Elsewhere, it performs a
    > proper conversion:
    >
    > MyClass obj;
    >
    > char unsigned *p = reinterpret_cast<char unsigned*>(&obj);


    I don't know what you mean by "proper conversion". The mapping
    performed by reinterpret_cast is always implementation-defined. It
    could just "take those bits and reinterpret them to mean something
    else" or it could "properly convert those bits" (whatever that means).
    It's all up to the implementation.

    Also, the standard explicitly states that reinterpret_cast<T&>(x) is
    equivalent to *reinterpret_cast<T*>(&x).
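
    A small sketch of that equivalence, using unsigned char as the target so
    that reading through the result stays well defined. The function name is
    made up for illustration.

```cpp
#include <cassert>

// Returns true if the reference form and the pointer form of
// reinterpret_cast name the same byte of x.
bool reference_and_pointer_casts_agree()
{
    int x = 0x01020304;
    unsigned char& as_ref = reinterpret_cast<unsigned char&>(x);
    unsigned char* as_ptr = reinterpret_cast<unsigned char*>(&x);
    return &as_ref == as_ptr && as_ref == *as_ptr;
}
```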

    Regards,
    Bart.
     
    Bart, Oct 31, 2006
    #10
  11. Noah Roberts

    Bart Guest

    Noah Roberts wrote:
    > Jim Langston wrote:
    > > Other that that, if you use it wrong you shoot yoru self in the foot.

    >
    > very insightfull.


    Indeed. People often forget that C++ is a language where you can shoot
    yourself in the foot and blow your whole leg off. That stems directly
    from the "the programmer knows best" philosophy.

    If you want Java you know where to find it.

    Regards,
    Bart.
     
    Bart, Oct 31, 2006
    #11
  12. Bart:

    >> MyClass obj;
    >>
    >> char unsigned *p = reinterpret_cast<char unsigned*>(&obj);

    >
    > I don't know what you mean by "proper conversion".


    By proper conversion, I mean that the behaviour of the following snippet is
    well-defined:

    int arr[10];

    char unsigned *p = reinterpret_cast<char unsigned*>(arr);
    char unsigned const *const pover = p + sizeof arr;

    do *p++ = 0;
    while (pover != p);

    The reinterpret_cast doesn't merely take the bits of an int* and stick them
    in a char*, it actually renders the address accurately as a char*. The
    difference will definitely be noticeable on systems where sizeof(char*) >
    sizeof(int*).


    > The mapping performed by reinterpret_cast is always
    > implementation-defined.



    Not when it comes to pointers; it's well-defined when it comes to pointers.

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 1, 2006
    #12
  13. Noah Roberts

    Kai-Uwe Bux Guest

    Frederick Gotham wrote:

    > Bart:
    >
    >>> MyClass obj;
    >>>
    >>> char unsigned *p = reinterpret_cast<char unsigned*>(&obj);

    >>
    >> I don't know what you mean by "proper conversion".

    >
    > By proper conversion, I mean that the behaviour of the following snippet
    > is well-defined:
    >
    > int arr[10];
    >
    > char unsigned *p = reinterpret_cast<char unsigned*>(arr);
    > char unsigned const *const pover = p + sizeof arr;
    >
    > do *p++ = 0;
    > while (pover != p);
    >
    > The reinterpret_cast doesn't merely take the bits of an int* and stick
    > them in a char*, it actually renders the address accurately as a char*.
    > The different will definitely be noticeable on systems where sizeof(char*)
    > > sizeof(int*).

    >
    >
    >> The mapping performed by reinterpret_cast is always
    >> implementation-defined.

    >
    >
    > Not when it comes to pointers; it's well-defined when it comes to
    > pointers.


    This last statement seems to be overly general. Is the following defined
    behavior?

    int main ( ) {

    int i;
    int* ip = &i;
    unsigned* up = reinterpret_cast< unsigned* >( ip );
    *up = 0;

    }

    I cannot find anything in the standard that would prevent *up = 0 from
    segfaulting.


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Nov 1, 2006
    #13
  14. Kai-Uwe Bux:

    > Is the following defined
    > behavior?
    >
    > int main ( ) {
    >
    > int i;
    > int* ip = &i;
    > unsigned* up = reinterpret_cast< unsigned* >( ip );
    > *up = 0;
    >
    > }



    There's no problem with the code, because:

    sizeof(int) == sizeof(unsigned)
    alignof(int) == alignof(unsigned)

    You might have a little problem though if you try to read the value of "i"
    subsequent to the zero assignment, but only if:

    (1) An unsigned int contains padding.
    (2) An unsigned int has an object representation of the value zero in which
    the padding bits are not all set to zero.
    (3) The bit-pattern for the value from (2) is an invalid object
    representation for an int.

    In such a system, the zero assignment could result in an object
    representation of:

    1111 0000 0000 0000 0000
    (i.e. 4 padding bits, 16 value bits)

    , which may be invalid for an int.
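
    Whether condition (1) can even arise on a given platform can be probed
    with a sketch like this one, which compares the size of the object
    representation with the number of value bits reported by numeric_limits.
    The function names are hypothetical.

```cpp
#include <cassert>
#include <climits>
#include <limits>

// Bits in the object representation of unsigned that are not value bits.
// numeric_limits<unsigned>::digits counts value bits only.
int unsigned_padding_bits()
{
    return static_cast<int>(sizeof(unsigned) * CHAR_BIT)
         - std::numeric_limits<unsigned>::digits;
}

// For int, digits excludes the sign bit, so add it back.
int int_padding_bits()
{
    return static_cast<int>(sizeof(int) * CHAR_BIT)
         - (std::numeric_limits<int>::digits + 1);
}
```

    On mainstream desktop implementations both functions are expected to
    return 0, in which case the scenario above cannot occur there.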

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 1, 2006
    #14
  15. Noah Roberts

    Kai-Uwe Bux Guest

    Frederick Gotham wrote:

    > Kai-Uwe Bux:
    >
    >> Is the following defined
    >> behavior?
    >>
    >> int main ( ) {
    >>
    >> int i;
    >> int* ip = &i;
    >> unsigned* up = reinterpret_cast< unsigned* >( ip );
    >> *up = 0;
    >>
    >> }

    >
    >
    > There's no problem with the code, because:
    >
    > sizeof(int) == sizeof(unsigned)
    > alignof(int) == alignof(unsigned)
    >
    > You might have a little problem though if you try to read the value of "i"
    > subsequent to the zero assignment, but only if:
    >
    > (1) An unsigned int contains padding.
    > (2) An unsigned int has an object representation of the value zero in
    > which the padding bits are not all set to zero.
    > (3) The bit-pattern for the value from (2) is an invalid object
    > representation for an int.
    >
    > In such a system, the zero assignment could result in an object
    > representation of:
    >
    > 1111 0000 0000 0000 0000
    > (i.e. 4 padding bits, 16 value bits)
    >
    > , which may be invalid for an int.


    I think you are assuming in your argument that up points to the same object
    (i.e., region of memory) as ip. I do not find that guarantee in the
    standard. All it requires is that if you convert back from up, you get the
    original ip. Otherwise, the result of a pointer conversion is unspecified.
    In particular, up could be an invalid pointer.

    The only case where I know of a requirement that pointer conversion
    preserves the actual memory location is conversion to and from (unsigned)
    char*.
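
    The two guarantees mentioned here, the round trip and byte access through
    unsigned char*, can be sketched as below. The function names are made up
    for illustration, and the sketch deliberately never dereferences the
    unsigned* itself.

```cpp
#include <cassert>
#include <cstring>

// Round trip: int* -> unsigned* -> int* gives back the original pointer
// (unsigned has no stricter alignment requirement than int).
bool round_trip_preserved()
{
    int i = 42;
    int* ip = &i;
    unsigned* up = reinterpret_cast<unsigned*>(ip);
    return reinterpret_cast<int*>(up) == ip;
}

// Byte access through unsigned char* reads the object's own storage.
bool bytes_match_memcpy()
{
    int i = 42;
    unsigned char copy[sizeof i];
    std::memcpy(copy, &i, sizeof i);
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&i);
    return std::memcmp(p, copy, sizeof i) == 0;
}
```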


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Nov 1, 2006
    #15
  16. Kai-Uwe Bux:

    > I think you are assuming in your argument that up points to the same object
    > (i.e., region of memory) as ip. I do not find that guarantee in the
    > standard. All it requires is that if you convert back from up, you get the
    > original ip. Otherwise, the result of a pointer conversion is unspecified.
    > In particular, up could be an invalid pointer.



    Given that:

    sizeof(int) == sizeof(unsigned)
    alignof(int) == alignof(unsigned)

    , there's no reason to think that their pointers would be any different.

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 1, 2006
    #16
  17. Noah Roberts

    Kai-Uwe Bux Guest

    Frederick Gotham wrote:

    > Kai-Uwe Bux:
    >
    >> I think you are assuming in your argument that up points to the same
    >> object (i.e., region of memory) as ip. I do not find that guarantee in
    >> the standard. All it requires is that if you convert back from up, you
    >> get the original ip. Otherwise, the result of a pointer conversion is
    >> unspecified. In particular, up could be an invalid pointer.

    >
    >
    > Given that:
    >
    > sizeof(int) == sizeof(unsigned)
    > alignof(int) == alignof(unsigned)
    >
    > , there's no reason to think that their pointers would be any different.


    The problem is that there is no reason to think that they would be equal,
    either. As I pointed out in some other thread, the C++ standard does allow
    for pointers that store more information than just a location in memory.
    This additional information would not be used in finding the location but
    just for defining undefined behavior in surprising ways :)


    Best

    Kai-Uwe Bux
     
    Kai-Uwe Bux, Nov 1, 2006
    #17
  18. Kai-Uwe Bux:

    > The problem is that there is no reason to think that they would be equal,
    > either. As I pointed out in some other thread, the C++ standard does allow
    > for pointers that store more information than just a location in memory.
    > This additional information would not be used in finding the location but
    > just for defining undefined behavior in surprising ways :)



    Then maybe C++ is too loosely defined.

    --

    Frederick Gotham
     
    Frederick Gotham, Nov 1, 2006
    #18
  19. Noah Roberts

    werasm Guest

    Kai-Uwe Bux wrote:

    > The problem is that there is no reason to think that they would be equal,
    > either. As I pointed out in some other thread, the C++ standard does allow
    > for pointers that store more information than just a location in memory.
    > This additional information would not be used in finding the location but
    > just for defining undefined behavior in surprising ways :)


    How often does it lead to UB in practice (even though it hypothetically
    can)?

    W
     
    werasm, Nov 1, 2006
    #19
  20. Noah Roberts

    Noah Roberts Guest

    Kai-Uwe Bux wrote:

    > I think you are assuming in your argument that up points to the same object
    > (i.e., region of memory) as ip. I do not find that guarantee in the
    > standard. All it requires is that if you convert back from up, you get the
    > original ip. Otherwise, the result of a pointer conversion is unspecified.
    > In particular, up could be an invalid pointer.


    Correct, any use of a reinterpret_casted pointer results in undefined
    behavior. The only defined behavior is casting back and forth.
    >
    > The only case where I know of a requirement that pointer conversion
    > preserves the actual memory location is conversion to and from (unsigned)
    > char*.


    The problem isn't necessarily that reinterpret_cast will change the
    address of the pointer but that it won't. This becomes a major issue
    when dealing with MI.
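
    A sketch of the multiple-inheritance case: static_cast adjusts the address
    to reach the second base, reinterpret_cast does not. The types A, B, C and
    the helper names are hypothetical, and the offset of the B subobject is
    what typical ABIs produce rather than something the standard fixes. The
    unadjusted pointer is only compared, never dereferenced.

```cpp
#include <cassert>

struct A { int a; };
struct B { int b; };
struct C : A, B { };

// True if the B subobject sits at a different address than the C object.
bool b_subobject_is_offset(C& c)
{
    return static_cast<void*>(static_cast<B*>(&c)) != static_cast<void*>(&c);
}

// True if reinterpret_cast left the address alone (no adjustment).
bool reinterpret_keeps_address(C& c)
{
    return static_cast<void*>(reinterpret_cast<B*>(&c)) == static_cast<void*>(&c);
}

// What a callback receives after a C* has been passed through void*.
void* to_user_data(C& c) { return &c; }

// Correct recovery: go back to the exact type that went in, then let the
// implicit derived-to-base conversion do the address adjustment.
B* recover_correct(void* p) { return static_cast<C*>(p); }
```

    Casting the void* straight to B* would skip the adjustment in the same
    way, which is why the type that went in has to be the type cast back to.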
     
    Noah Roberts, Nov 1, 2006
    #20