type safety and reinterpret_cast<>

Noah Roberts

What steps do people take to make sure that when dealing with C API
callback functions that you do the appropriate reinterpret_cast<>? For
instance, today I ran into a situation in which the wrong type was the
target of a cast. Of course with a reinterpret_cast nothing complains
until the UB bites you in the ass. It seems to me that there ought to
be a way to deal with these kinds of functions yet still retain some
semblance of type safety. Perhaps either any or variant from boost
would help?

What do you guys do to keep from stabbing your own foot in these
situations?
 
Jim Langston

Noah Roberts said:
What steps do people take to make sure that when dealing with C API
callback functions that you do the appropriate reinterpret_cast<>? For
instance, today I ran into a situation in which the wrong type was the
target of a cast. Of course with a reinterpret_cast nothing complains
until the UB bites you in the ass. It seems to me that there ought to
be a way to deal with these kinds of functions yet still retain some
semblance of type safety. Perhaps either any or variant from boost
would help?

What do you guys do to keep from stabbing your own foot in these
situations?

If you use reinterpret_cast you had better know what the heck you're doing, or
don't do it. reinterpret_cast is the most dangerous cast and should be used
only when absolutely necessary. Use static_cast if possible, which is a bit
safer.

Other than that, if you use it wrong you shoot yourself in the foot.
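
For the common case where the C API simply hands back the void* it was given, a static_cast round trip is enough. A minimal sketch, assuming a hypothetical function invoke_callback standing in for the real C API (the names Counter and on_event are also made up for illustration). Note that static_cast from void* is not checked either: it is only correct when the callback converts back to exactly the type that was passed in.

```cpp
#include <cassert>

// Hypothetical stand-in for a C API: it calls back with the pointer it was given.
typedef void (*c_callback)(void* user_data);

void invoke_callback(c_callback cb, void* user_data) { cb(user_data); }

struct Counter {
    int hits;
};

// The callback converts back to exactly the type that was passed in.
void on_event(void* user_data) {
    assert(user_data != nullptr);
    Counter* c = static_cast<Counter*>(user_data);
    ++c->hits;
}

// Registers the same Counter twice and reports how often it was hit.
int count_two_events() {
    Counter c = {0};
    invoke_callback(on_event, &c);  // Counter* -> void* is an implicit conversion
    invoke_callback(on_event, &c);
    return c.hits;
}
```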
 
Gianni Mariani

Noah said:
What steps do people take to make sure that when dealing with C API
callback functions that you do the appropriate reinterpret_cast<>? For
instance, today I ran into a situation in which the wrong type was the
target of a cast. Of course with a reinterpret_cast nothing complains
until the UB bites you in the ass. It seems to me that there ought to
be a way to deal with these kinds of functions yet still retain some
semblance of type safety. Perhaps either any or variant from boost
would help?

What do you guys do to keep from stabbing your own foot in these
situations?

The whole meaning of reinterpret_cast is "please compiler, get out of
the way and do what I tell you to do, those bits you have, they're of
this type and just do it".

I usually relegate the reinterpret casts to very minimal usage where I
can check visually very easily that I know what is happening.

One thing you could do is use a single type that embodies some type
safety whenever talking to the C api - perhaps use an "Any" class that
contains the desired pointer. That way you only expect one type to come
back and forth from the C call backs and you can verify that they
contain the right type. How you manage their lifetime however is
another story.
 
Noah Roberts

Jim said:
If you use reinterpret_cast you had better know what the heck you're doing, or
don't do it. reinterpret_cast is the most dangerous cast and should be used
only when absolutely necessary. Use static_cast if possible, which is a bit
safer.

Heh, really?

Other than that, if you use it wrong you shoot yourself in the foot.

very insightful.
 
Noah Roberts

Gianni said:
The whole meaning of reinterpret_cast is "please compiler, get out of
the way and do what I tell you to do, those bits you have, they're of
this type and just do it".

But sometimes it is a necessary evil and you would like to retain some
safety net.

I usually relegate the reinterpret casts to very minimal usage where I
can check visually very easily that I know what is happening.

Well, I _prefer_ to avoid it altogether.

One thing you could do is use a single type that embodies some type
safety whenever talking to the C api - perhaps use an "Any" class that
contains the desired pointer. That way you only expect one type to come
back and forth from the C call backs and you can verify that they
contain the right type. How you manage their lifetime however is
another story.

This appears like it will do the trick. You have to implement it as a
coding standard but once so required and practiced you can't undo
yourself.

#include <iostream>
#include <boost/any.hpp>

class Test
{
    int x;
public:
    Test(int y) : x(y) {}
    int GetSerial() const { return x; }
};

class NotTest
{
    double y;
public:
    NotTest(double x) : y(x) {}
    double f() const { return y; }
};

void f_test(void * t)
{
    boost::any * at = reinterpret_cast<boost::any*>(t);

    // attempt cast to Test
    Test * test = boost::any_cast<Test>(at);

    if (test)
        std::cout << test->GetSerial() << std::endl;
}

int main()
{
    boost::any x;
    x = Test(50);
    f_test(&x); // output 50

    x = NotTest(5.09);
    f_test(&x); // no output

    int y; std::cin >> y;
    return 0;
}

I'm kind of new to using boost but this appears to be a good answer.
You can use the type "boost::any*" as the only pointer type you allow
to be passed through generic pointers. Then later if you change the
inheritance tree on some object you will get a predictable error
instead of undefined behavior should you miss such casts.

Now the trick will be to get this to be used by the team and begin
replacing current C-Style casts and misc. pointer passing with this
more predictable setup.
 
Gianni Mariani

Noah said:
....
I'm kind of new to using boost but this appears to be a good answer.
You can use the type "boost::any*" as the only pointer type you allow
to be passed through generic pointers. Then later if you change the
inheritance tree on some object you will get a predictable error
instead of undefined behavior should you miss such casts.

Now the trick will be to get this to be used by the team and begin
replacing current C-Style casts and misc. pointer passing with this
more predictable setup.

That's exactly what I proposed. Now you have an issue with how to manage
the lifetime of the any object.

If the callback is synchronous, you can use the exact same thing
you're using - all is well.

If the pointer is stored in "C" land and comes back at some later event,
then you need to match the lifetime of the any object with the lifetime
of the object it's pointing to.
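
One way to picture the lifetime problem is a small owner object that keeps the any alive for exactly as long as the C side holds the pointer. This is only a sketch under stated assumptions: it uses C++17 std::any as a stand-in for boost::any, and c_register, c_unregister, c_fire_event and Registration are hypothetical names, not part of any real API.

```cpp
#include <any>
#include <cassert>
#include <utility>

typedef void (*c_callback)(void* user_data);

// Hypothetical C API that stores one callback and fires it at a later event.
static c_callback g_callback = nullptr;
static void* g_user_data = nullptr;

void c_register(c_callback cb, void* user_data) {
    g_callback = cb;
    g_user_data = user_data;
}
void c_unregister() {
    g_callback = nullptr;
    g_user_data = nullptr;
}
void c_fire_event() {
    if (g_callback) g_callback(g_user_data);
}

struct Test {
    int serial;
};

static int g_last_serial = -1;

// The only pointer type ever passed through the void* is std::any*.
void on_event(void* user_data) {
    std::any* holder = static_cast<std::any*>(user_data);
    if (Test* t = std::any_cast<Test>(holder)) {  // null on a type mismatch
        g_last_serial = t->serial;
    }
}

// Owns the any and unregisters before the any is destroyed.
class Registration {
public:
    explicit Registration(std::any payload) : payload_(std::move(payload)) {
        c_register(on_event, &payload_);
    }
    ~Registration() { c_unregister(); }

    // Non-copyable, so the address handed to the C side stays valid.
    Registration(const Registration&) = delete;
    Registration& operator=(const Registration&) = delete;

private:
    std::any payload_;
};

int fire_while_registered() {
    std::any payload = Test{50};
    Registration reg(std::move(payload));
    c_fire_event();
    return g_last_serial;
}
```

Once the Registration goes out of scope the callback is removed, so a later event can no longer reach a dead any.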

This can also be done without using the boost::any class - just have a
base class that stores its typeid - that's all that
boost::any_cast does, it checks that the typeid is equal.

One way to do this is to inherit from this monster ugly thing, but the use
case is quite nice.

// caution - brain dump alert - all the code below is directly from
// brain to you with no compile checks - useful as a demo

#include <typeinfo>

class C_CallbackBase
{
    // the cast helper needs to read the sentinel and the stored type
    friend class CallBackCast;

protected:
    C_CallbackBase( const std::type_info & i_callback_type )
      : m_sentinel( 0xca11bac8 ),
        m_callback_type( i_callback_type )
    {
    }

    const unsigned m_sentinel;
    const std::type_info & m_callback_type;

    // make this assignable ...
    C_CallbackBase & operator=( const C_CallbackBase & )
    {
        // my derived type does not change when I am assigned ...!
        return *this;
    }

private:
    // default copy constructor is not ok
    C_CallbackBase( const C_CallbackBase & ); // never called

};


template <typename DerivedType>
class CallBackBase
  : public C_CallbackBase
{
public:
    CallBackBase()
      : C_CallbackBase( typeid( DerivedType ) )
    {
    }

    CallBackBase( const CallBackBase & )
      : C_CallbackBase( typeid( DerivedType ) )
    {
    }

    void * GetCallbackPtr()
    {
        return static_cast< void * >(
            static_cast<C_CallbackBase *>( this )
        );
    }

};

class CallBackCast
{
    void * m_ptr;

public:
    CallBackCast( void * ptr )
      : m_ptr( ptr )
    {
    }

    // conversion operator: the pointer type being initialized
    // selects DerivedType
    template <typename DerivedType>
    operator DerivedType * () const
    {
        C_CallbackBase * ptr =
            static_cast<C_CallbackBase *>( m_ptr );

        // this is UB if the cast is wrong but it
        // will probably do the right thing
        if ( ptr->m_sentinel != 0xca11bac8 )
        {
            throw "CALLBACK CLASS CORRUPT";
        }
        if ( ptr->m_callback_type == typeid( DerivedType ) )
        {
            return static_cast<DerivedType *>( ptr );
        }

        throw "TYPE MISMATCH FROM C CALLBACK";
    }
};


class APPCLASS
  : public CallBackBase<APPCLASS>
{
};

void c_callback_func( void * cb )
{
    APPCLASS * appptr = CallBackCast( cb );

    // ... do your thing
}

int main()
{

    APPCLASS app;

    c_callback_func( app.GetCallbackPtr() );

}


Note that you can have two versions of this thing if performance is an
issue, a debug version that checks (like this one) and one that has an
empty base class and no checks are done.

Note that you can't use boost::any as a member (or base class) because
the copy and assignment make no sense. Note that C_CallbackBase has an
empty assignment and copy construction is not allowed.
 
Frederick Gotham

Gianni Mariani:
The whole meaning of reinterpret_cast is "please compiler, get out of
the way and do what I tell you to do, those bits you have, they're of
this type and just do it".


That only happens when you cast to a reference type. Elsewhere, it performs a
proper conversion:

MyClass obj;

char unsigned *p = reinterpret_cast<char unsigned*>(&obj);
 
werasm

Gianni said:
class C_CallbackBase
{
protected:
    C_CallbackBase( const std::type_info & i_callback_type )
      : m_sentinel( 0xca11bac8 ),
        m_callback_type( i_callback_type )
    {
    }

Do you have a specific way in which you select your sentinel, or did
you use an arbitrary value? I have in the past used the this pointer
for this. Is that viable?

Werner
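
For readers unfamiliar with the idea, here is a minimal sketch of a "this pointer as sentinel" scheme. The class SelfChecked and the helper looks_valid are hypothetical names invented for this illustration. As with a magic number, reading through a wrongly typed pointer is still undefined behavior, so the check can only catch most mistakes in practice.

```cpp
#include <cassert>

class SelfChecked {
public:
    SelfChecked() : self_(this) {}

    // A copy must point at itself, not at the object it was copied from.
    SelfChecked(const SelfChecked&) : self_(this) {}

    // Assignment leaves the sentinel alone for the same reason.
    SelfChecked& operator=(const SelfChecked&) { return *this; }

    // True while the stored address still matches the object's own address.
    bool intact() const { return self_ == this; }

private:
    const SelfChecked* self_;
};

// Checks a pointer that came back through a void* callback argument.
bool looks_valid(void* p) {
    return p != nullptr && static_cast<SelfChecked*>(p)->intact();
}
```

One property of this choice is that every object gets a different sentinel value, so a stale bitwise copy of an object at another address fails the check.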
 
red floyd

werasm said:
Do you have a specific way in which you select your sentinel, or did
you use an arbitrary value? I have in the past used the this pointer
for this. Is that viable?

His sentinel is the word "callback" written as best as possible in hex
digits.
 
Bart

Frederick said:
Gianni Mariani:


That only happens when you cast to a reference type. Elsewhere, it performs a
proper conversion:

MyClass obj;

char unsigned *p = reinterpret_cast<char unsigned*>(&obj);

I don't know what you mean by "proper conversion". The mapping
performed by reinterpret_cast is always implementation-defined. It
could just "take those bits and reinterpret them to mean something
else" or it could "properly convert those bits" (whatever that means).
It's all up to the implementation.

Also, the standard explicitly states that reinterpret_cast<T&>(x) is
equivalent to *reinterpret_cast<T*>(&x).

Regards,
Bart.
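
The equivalence between the reference form and the pointer form can be seen in a small sketch. The function name is made up for illustration, and unsigned char is used as the target type so that accessing the byte is permitted.

```cpp
#include <cassert>

// Returns true when both spellings designate the same byte of x.
bool reference_cast_matches_pointer_cast() {
    int x = 42;

    // Reference form.
    unsigned char& by_ref = reinterpret_cast<unsigned char&>(x);

    // Pointer form that the reference form is defined in terms of.
    unsigned char* by_ptr = reinterpret_cast<unsigned char*>(&x);

    return &by_ref == by_ptr;
}
```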
 
Bart

Noah said:
very insightful.

Indeed. People often forget that C++ is a language where you can shoot
yourself in the foot and blow your whole leg off. That stems directly
from the "the programmer knows best" philosophy.

If you want Java you know where to find it.

Regards,
Bart.
 
Frederick Gotham

Bart:
I don't know what you mean by "proper conversion".

By proper conversion, I mean that the behaviour of the following snippet is
well-defined:

int arr[10];

char unsigned *p = reinterpret_cast<char unsigned*>(arr);
char unsigned const *const pover = p + sizeof arr;

do *p++ = 0;
while (pover != p);

The reinterpret_cast doesn't merely take the bits of an int* and stick them
in a char*, it actually renders the address accurately as a char*. The
difference will definitely be noticeable on systems where sizeof(char*) >
sizeof(int*).

The mapping performed by reinterpret_cast is always
implementation-defined.


Not when it comes to pointers; it's well-defined when it comes to pointers.
 
Kai-Uwe Bux

Frederick said:
Bart:
I don't know what you mean by "proper conversion".

By proper conversion, I mean that the behaviour of the following snippet
is well-defined:

int arr[10];

char unsigned *p = reinterpret_cast<char unsigned*>(arr);
char unsigned const *const pover = p + sizeof arr;

do *p++ = 0;
while (pover != p);

The reinterpret_cast doesn't merely take the bits of an int* and stick
them in a char*, it actually renders the address accurately as a char*.
The difference will definitely be noticeable on systems where sizeof(char*) >
sizeof(int*).

The mapping performed by reinterpret_cast is always
implementation-defined.


Not when it comes to pointers; it's well-defined when it comes to
pointers.

This last statement seems to be overly general. Is the following defined
behavior?

int main ( ) {

int i;
int* ip = &i;
unsigned* up = reinterpret_cast< unsigned* >( ip );
*up = 0;

}

I cannot find anything in the standard that would prevent *up = 0 from
segfaulting.


Best

Kai-Uwe Bux
 
Frederick Gotham

Kai-Uwe Bux:
Is the following defined
behavior?

int main ( ) {

int i;
int* ip = &i;
unsigned* up = reinterpret_cast< unsigned* >( ip );
*up = 0;

}


There's no problem with the code, because:

sizeof(int) == sizeof(unsigned)
alignof(int) == alignof(unsigned)

You might have a little problem though if you try to read the value of "i"
subsequent to the zero assignment, but only if:

(1) An unsigned int contains padding.
(2) An unsigned int has an object representation of the value zero in which
the padding bits are not all set to zero.
(3) The bit-pattern for the value from (2) is an invalid object
representation for an int.

In such a system, the zero assignment could result in an object
representation of:

1111 0000 0000 0000 0000
(i.e. 4 padding bits, 16 value bits)

, which may be invalid for an int.
 
Kai-Uwe Bux

Frederick said:
Kai-Uwe Bux:



There's no problem with the code, because:

sizeof(int) == sizeof(unsigned)
alignof(int) == alignof(unsigned)

You might have a little problem though if you try to read the value of "i"
subsequent to the zero assignment, but only if:

(1) An unsigned int contains padding.
(2) An unsigned int has an object representation of the value zero in
which the padding bits are not all set to zero.
(3) The bit-pattern for the value from (2) is an invalid object
representation for an int.

In such a system, the zero assignment could result in an object
representation of:

1111 0000 0000 0000 0000
(i.e. 4 padding bits, 16 value bits)

, which may be invalid for an int.

I think you are assuming in your argument that up points to the same object
(i.e., region of memory) as ip. I do not find that guarantee in the
standard. All it requires is that if you convert back from up, you get the
original ip. Otherwise, the result of a pointer conversion is unspecified.
In particular, up could be an invalid pointer.

The only case where I know of a requirement that pointer conversion
preserves the actual memory location is conversion to and from (unsigned)
char*.


Best

Kai-Uwe Bux
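
The two guarantees mentioned here, the round trip and the conversion to unsigned char*, can be sketched as follows. The function names are hypothetical; the first function relies only on converting back to the original type, and the second touches the object only through unsigned char.

```cpp
#include <cassert>
#include <cstddef>

// Round trip: converting back to the original pointer type yields the
// original pointer (int and unsigned have the same alignment requirement).
bool round_trip_preserved() {
    int i = 7;
    int* ip = &i;
    unsigned* up = reinterpret_cast<unsigned*>(ip);  // never dereferenced here
    int* back = reinterpret_cast<int*>(up);
    return back == ip && *back == 7;
}

// Byte access: unsigned char* may be used to write the bytes of any object.
int clear_through_bytes() {
    int value = 12345;
    unsigned char* p = reinterpret_cast<unsigned char*>(&value);
    for (std::size_t n = 0; n != sizeof value; ++n) {
        p[n] = 0;
    }
    return value;
}
```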
 
Frederick Gotham

Kai-Uwe Bux:
I think you are assuming in your argument that up points to the same object
(i.e., region of memory) as ip. I do not find that guarantee in the
standard. All it requires is that if you convert back from up, you get the
original ip. Otherwise, the result of a pointer conversion is unspecified.
In particular, up could be an invalid pointer.


Given that:

sizeof(int) == sizeof(unsigned)
alignof(int) == alignof(unsigned)

, there's no reason to think that their pointers would be any different.
 
Kai-Uwe Bux

Frederick said:
Kai-Uwe Bux:



Given that:

sizeof(int) == sizeof(unsigned)
alignof(int) == alignof(unsigned)

, there's no reason to think that their pointers would be any different.

The problem is that there is no reason to think that they would be equal,
either. As I pointed out in some other thread, the C++ standard does allow
for pointers that store more information than just a location in memory.
This additional information would not be used in finding the location but
just for defining undefined behavior in surprising ways :)


Best

Kai-Uwe Bux
 
Frederick Gotham

Kai-Uwe Bux:
The problem is that there is no reason to think that they would be equal,
either. As I pointed out in some other thread, the C++ standard does allow
for pointers that store more information than just a location in memory.
This additional information would not be used in finding the location but
just for defining undefined behavior in surprising ways :)


Then maybe C++ is too loosely defined.
 
werasm

Kai-Uwe Bux said:
The problem is that there is no reason to think that they would be equal,
either. As I pointed out in some other thread, the C++ standard does allow
for pointers that store more information than just a location in memory.
This additional information would not be used in finding the location but
just for defining undefined behavior in surprising ways :)

How often does it lead to UB in practise (even though it hypothetically
can)?

W
 
Noah Roberts

Kai-Uwe Bux said:
I think you are assuming in your argument that up points to the same object
(i.e., region of memory) as ip. I do not find that guarantee in the
standard. All it requires is that if you convert back from up, you get the
original ip. Otherwise, the result of a pointer conversion is unspecified.
In particular, up could be an invalid pointer.

Correct, any use of a reinterpret_casted pointer results in undefined
behavior. The only defined behavior is casting back and forth.

The only case where I know of a requirement that pointer conversion
preserves the actual memory location is conversion to and from (unsigned)
char*.

The problem isn't necessarily that reinterpret_cast will change the
address of the pointer but that it won't. This becomes a major issue
when dealing with multiple inheritance (MI).
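
A small sketch of the multiple inheritance problem, using made-up types A, B and C. The object layout is not mandated by the standard; the comments describe what typical implementations do, where the second base subobject sits at a non-zero offset inside the derived object.

```cpp
#include <cassert>

struct A { int a; };
struct B { int b; };
struct C : A, B {};

// static_cast knows the layout and adjusts the address to the B subobject.
B* to_b_checked(C* c) { return static_cast<B*>(c); }

// reinterpret_cast keeps the address of the whole object. On typical
// implementations that is where the A subobject lives, so using the
// result as a B would read the wrong member.
B* to_b_unchecked(C* c) { return reinterpret_cast<B*>(c); }

// Compares raw addresses without dereferencing either pointer.
bool same_address(const void* x, const void* y) { return x == y; }
```

The same trap applies when a C* is passed through a void* and the callback casts it straight to B*: no adjustment happens on the way back.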
 
