Is this portable? [static pointer casts, char* arithmetic]

SG

Hi!

I was toying around with some "copy-on-write" wrapper class design
(cow<>) and felt the need to use some ugly pointer casts. I was
wondering whether it is even 100% portable and if not what other
alternatives there are that don't affect the public interface of the
cow<> class template.

I basically use Boost.Function-like "type erasure" with reference
counting. I keep two pointers in the class cow<T> as data members.
One pointer to an abstract wrapper class (for cloning and ref-
counting) and another pointer T* that points directly to the wrapped
object that lives inside the wrapper. To support conversions from
cow<T> to cow<U> in case T* is convertible to U* I made the abstract
wrapper class to be a non-template. The problem arises when I need to
create a copy and adjust the 2nd pointer accordingly.

Here're some code fragments (for brevity) so you know what I'm talking
about:

--------8<----------------8<----------------8<--------

class abstract_wrapper
{
public:
    abstract_wrapper() : ref_counter(1) {}

    virtual ~abstract_wrapper() {}
    virtual abstract_wrapper* clone() const = 0;

    void refct_inc() {++ref_counter;}
    bool refct_dec() {return (--ref_counter)==0;}

    bool unique() const {return ref_counter<=1;}

private:
    long ref_counter;
};

// .....

template<typename T>
class cow
{
private:

    template<typename> friend class cow;

    abstract_wrapper* paw;
    T* ptr; // points to a member of *paw

    void make_copy();

public:

    // .....

    T const& operator*() const {return *ptr;}
    T const* operator->() const {return ptr;}

    T& wref()
    {
        if (!paw->unique()) make_copy();
        return *ptr;
    }

    // .....
};

template<typename T>
void cow<T>::make_copy()
{
    assert( !paw->unique() );

    typedef ..... char_t;
    typedef ..... void_t;

    // is T const? | char_t     | void_t
    // ------------+------------+-----------
    // yes         | const char | const void
    // no          | char       | void

    abstract_wrapper* paw2 = paw->clone();

    char_t* bas1 = static_cast<char_t*>(static_cast<void_t*>(paw));
    char_t* bas2 = static_cast<char_t*>(static_cast<void_t*>(paw2));
    char_t* sub1 = static_cast<char_t*>(static_cast<void_t*>(ptr));
    char_t* sub2 = bas2 + (sub1-bas1);

    ptr = static_cast<T*>(static_cast<void_t*>(sub2));
    paw->refct_dec();
    paw = paw2;
}

--------8<----------------8<----------------8<--------

Obviously the private function "make_copy" looks a bit ugly with all
the casts. But as far as I can tell this should be portable. The
dynamic types of *paw and *paw2 are the same. The assumption is that
the object layout is consistent over all possible objects of that
dynamic type and that I can safely compute the address of the new
member object the way I did.
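
The address computation described above can be tried in isolation. The following is only a minimal sketch: wrapper_base, wrapper<U> and rebase are hypothetical names standing in for the abstract wrapper, a concrete wrapper and the body of make_copy, and the const handling is left out. It relies on exactly the layout assumption stated above, so it shows what the code does on typical implementations and does not settle whether the standard guarantees it.

```cpp
#include <cassert>
#include <string>

// Stand-in for abstract_wrapper (hypothetical, ref-counting omitted).
struct wrapper_base
{
    virtual ~wrapper_base() {}
    virtual wrapper_base* clone() const = 0;
};

// Hypothetical concrete wrapper holding the wrapped object by value.
template<typename U>
struct wrapper : wrapper_base
{
    explicit wrapper(U const& u) : value(u) {}
    wrapper_base* clone() const { return new wrapper(*this); }
    U value;
};

// Re-bases 'member', which points into *from, onto the clone *to by
// reusing the byte offset. ASSUMPTION: both objects have the same
// dynamic type and therefore the same layout.
template<typename T>
T* rebase(wrapper_base* from, wrapper_base* to, T* member)
{
    char* bas1 = static_cast<char*>(static_cast<void*>(from));
    char* bas2 = static_cast<char*>(static_cast<void*>(to));
    char* sub1 = static_cast<char*>(static_cast<void*>(member));
    return static_cast<T*>(static_cast<void*>(bas2 + (sub1 - bas1)));
}
```

With a wrapper<std::string> and its clone, rebase should yield the address of the clone's value member, which is the result make_copy stores back into ptr.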

Am I correct?


Cheers!
SG
 
Alf P. Steinbach

* SG:
I was toying around with some "copy-on-write" wrapper class design
(cow<>) and felt the need to use some ugly pointer casts. I was
wondering whether it is even 100% portable and if not what other
alternatives there are that don't affect the public interface of the
cow<> class template.

I basically use Boost.Function-like "type erasure" with reference
counting. I keep two pointers in the class cow<T> as data members.
One pointer to an abstract wrapper class (for cloning and ref-
counting) and another pointer T* that points directly to the wrapped
object that lives inside the wrapper. To support conversions from
cow<T> to cow<U> in case T* is convertible to U* I made the abstract
wrapper class to be a non-template. The problem arises when I need to
create a copy and adjust the 2nd pointer accordingly.

Here're some code fragments (for brevity) so you know what I'm talking
about:

--------8<----------------8<----------------8<--------

class abstract_wrapper
{
public:
abstract_wrapper() : ref_counter(1) {}

virtual ~abstract_wrapper() {}
virtual abstract_wrapper* clone() const = 0;

void refct_inc() {++ref_counter;}
bool refct_dec() {return (--ref_counter)==0;}

bool unique() const {return ref_counter<=1;}

private:
long ref_counter;
};

Think about separating concerns.

Why should the same class have responsibility for cloning and reference counting?

Also, when you do reference counting you should really not allow the reference
count to drop to zero, as implied by your 'unique' implementation. When the
reference count drops to zero, self-destroy. That is, after all, the point.
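
A rough sketch of that self-destroying style, under assumed names (ref_counted, add_ref, release and the probe helper are made up for illustration and are not taken from the posted code):

```cpp
#include <cassert>

// Hypothetical intrusive reference-count base class.
class ref_counted
{
public:
    ref_counted() : ref_counter(1) {}

    void add_ref() { ++ref_counter; }

    // Drops one reference. The object deletes itself when the last
    // reference goes away, so no caller ever sees a count of zero.
    void release()
    {
        if (--ref_counter == 0) delete this;
    }

    bool unique() const { return ref_counter == 1; }

protected:
    virtual ~ref_counted() {}   // heap-only: destroyed via release()

private:
    long ref_counter;
    ref_counted(ref_counted const&);             // non-copyable (C++98 style)
    ref_counted& operator=(ref_counted const&);
};

// Small probe type used only to observe destruction.
struct probe : ref_counted
{
    explicit probe(bool& flag) : destroyed(flag) {}
    ~probe() { destroyed = true; }
    bool& destroyed;
};
```

With this shape, unique() can test for exactly one owner, and the owning class only ever calls release() in its destructor.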

// .....

template<typename T>
class cow
{
private:

template<typename> friend class cow;

abstract_wrapper* paw;

I suggest using a boost::intrusive_ptr here.

T* ptr; // points to a member of *paw

Why a member of?


void make_copy();

public:

// .....

T const& operator*() const {return *ptr;}
T const* operator->() const {return ptr;}

T& wref()
{
if (!paw->unique()) make_copy();
return *ptr;
}

// .....
};

OK so far, assuming you have just left out declarations of copy assignment and
copy construction.


template<typename T>
void cow<T>::make_copy()
{
assert( !paw->unique() );

typedef ..... char_t;
typedef ..... void_t;

// is T const? | char_t | void_t
// ------------+------------+-----------
// yes | const char | const void
// no | char | void

abstract_wrapper* paw2 = paw->clone();

char_t* bas1 = static_cast<char_t*>(static_cast<void_t*>(paw));
char_t* bas2 = static_cast<char_t*>(static_cast<void_t*>(paw2));
char_t* sub1 = static_cast<char_t*>(static_cast<void_t*>(ptr));
char_t* sub2 = bas2 + (sub1-bas1);

Offset calculations are only formally well-defined for PODs, which this isn't.

The clone function either returns a pointer that can be downcast to T*, or it
doesn't, in which case it returns too little information.

Anyway, forget the silly and misleading double static_cast and use
reinterpret_cast when you really want casting to char*.

It's not like reinterpret_cast can fail to preserve the address value.

For it supports casting back and obtaining the original pointer, exactly.
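
As a small illustration of that round-trip property (widget and round_trips are hypothetical names used only for this sketch):

```cpp
#include <cassert>

struct widget { int id; double weight; };

// Returns true when a pointer survives a round trip through char*.
inline bool round_trips(widget* p)
{
    char* raw = reinterpret_cast<char*>(p);          // view the object as bytes
    widget* back = reinterpret_cast<widget*>(raw);   // and convert back again
    return back == p;
}
```

The conversion back to the original pointer type is required to give the original value here, since char has no stricter alignment requirement than widget.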

ptr = static_cast<T*>(static_cast<void_t*>(sub2));
paw->refct_dec();

Client code has no business messing with internal reference counts.

Instead use a boost::intrusive_ptr for 'paw'.

For example.

paw = paw2;
}

--------8<----------------8<----------------8<--------

Obviously the private function "make_copy" looks a bit ugly with all
the casts. But as far as I can tell this should be portable. The
dynamic types of *paw and *paw2 are the same. The assumption is that
the object layout is consistent over all possible objects of that
dynamic type and that I can safely compute the address of the new
member object the way I did.

Am I correct?

Formally it's debatable, that is, whether the compiler is allowed to place some
part of an object some unrelated place in memory and just include an offset or
pointer or something. It can do that for virtual multiple inheritance. The
formal level problem is whether it can do so more generally, because as far as I
know nobody's found any language that forbids it (for non-POD).

In practice it's well-defined, as long as you're dealing with complete objects.

If OTOH you're cloning an object that's a base class sub-object of another
object and that inherits virtually from some base class, and that virtual base
class sub-object is your T*, then all bets are off. But presumably that's not
what you're doing. However, the fact that you're dealing separately with the paw
and the ptr, not simply having the same pointer of 2 different types, seems to
indicate that your design doesn't properly enforce identity of these pointers.

But as I hope the comments above make clear, a better solution is to re-design
so that you have available the required information.

The missing information, the presence of the casts, indicates some design flaw.


Cheers & hth.,

- Alf
 
SG

Think about separating concerns.
Why should the same class have responsibility for cloning and reference counting?

If I separated this it would not solve the problem, only replace one
to-be-managed object with two to-be-managed objects. In order to
support runtime polymorphism as well -- yes, it's probably overkill, I
know -- I need a wrapper with a virtual clone() anyway. So, why NOT
combine the clonable wrapper with the ref-counter? :p

Also, when you do reference counting you should really not allow the reference
count to drop to zero, as implied by your 'unique' implementation. When the
reference count drops to zero, self-destroy. That is, after all, the point.

True. I just avoided writing "delete this;" for style reasons. Since
there's only one place where I need to check whether I need to delete
it (the destructor) it doesn't bother me.
Why a member of?

Goals:
- copy-on-write wrapper "cow<>" for value types
- supports conversions from cow<U> to cow<T> in case
Convertible<U*,T*>
- manage the life-time of only one heap allocated object

I think the 2nd requirement implies decltype(paw) to be independent
from U/T. Otherwise I wouldn't need these address calculations and
could just ask the wrapper for the pointer to its member.
OK so far, assuming you have just left out declarations of copy assignment and
copy construction.

Of course.
Offset calculations are only formally well-defined for PODs, which this isn't.

Hmmm... I should have expected that.
The clone function either returns a pointer that can be downcasted to T*, or it
doesn't, in which case it returns too little information.

The 'clone' function returns a pointer to an abstract_wrapper that
contains a T object (or some U object where Convertible said:
Client code has no business messing with internal reference counts.

No, of course not. But this wasn't "client code". It was a private
Instead use a boost::intrusive_ptr for 'paw'.

Of course I could make abstract_wrapper compatible with
intrusive_ptr. I don't see the advantage, though.
[...]
Formally it's debatable, that is, whether the compiler is allowed to place some
part of an object some unrelated place in memory and just include an offset or
pointer or something. It can do that for virtual multiple inheritance. The

I was under the impression that the compiler is required to use a
consecutive sequence of sizeof(T) bytes to represent the (whole)
object.
[...]
In practice it's well-defined, as long as you're dealing with complete objects.

What do you mean? The dynamic type of *paw is never mentioned
anywhere. Are you saying that I used a construct in "make_copy" that
would require T to be a complete type? The dynamic type of *paw is
If OTOH you're cloning an object that's a base class sub-object of another
object and that inherits virtually from some base class, and that virtual base
class sub-object is your T*, then all bets are off. But presumably that's not

I honestly don't know what you're talking about. There's no virtual
inheritance involved (excluding the set of possible T's). The object
what you're doing. However, the fact that you're dealing separately with the paw
and the ptr, not simply having the same pointer of 2 different types, seems to
indicate that your design doesn't properly enforce identity of these pointers.

They are not identical. It's just that *ptr is a data member from the
dynamic object *paw whose type has been erased to support
conversions. I believe the standard shared_ptr implementation also
stores two pointers. One pointer to the object that contains the ref-
counter and deleter and one pointer to the object being managed. The
difference here is that I merged them for reasons earlier mentioned.
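
For comparison, the two-pointer arrangement can be observed through the aliasing constructor of shared_ptr. The sketch below assumes std::shared_ptr from C++11 (boost::shared_ptr offers a similar constructor); record and name_of are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>

struct record { int id; std::string name; };

// Returns a shared_ptr to the 'name' member that shares ownership with
// the whole record: the owned object and the pointed-to object differ.
inline std::shared_ptr<std::string> name_of(std::shared_ptr<record> const& r)
{
    return std::shared_ptr<std::string>(r, &r->name);   // aliasing constructor
}
```

The returned pointer keeps the whole record alive even after the original shared_ptr<record> is reset, which is the "pointer to the owner plus pointer to the managed part" split described above.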
But as I hope the comments above make clear, a better solution is to re-design
so that you have available the required information.

The missing information, the presence of the casts, indicates some design flaw.

I wasn't satisfied with the design, either. The casting part bugged
me. I wouldn't go so far as to say the presence of casts implies a bad
design. They can be useful, too. I think I just tried too hard.
Maybe there is no solution that meets all the goals I mentioned above
that avoids this pointer arithmetic. At least I don't see any.


Cheers!
SG
 
SG

Goals:
- copy-on-write wrapper "cow<>" for value types
- supports conversions from cow<U> to cow<T> in
case Convertible<U*,T*>
- manage the life-time of only one heap allocated object

I should add that I expected such a conversion not to create a new
copy (of the "pointee") but just to "work" polymorphically (which is
probably too much to ask for).

A simple example would look like this:

void test_string()
{
    cow<string> x = string("hello");
    cow<string> y = x;
    cout << "*x --> " << *x << endl;
    cout << "*y --> " << *y << endl;
    cout << "shared = " << (&*x == &*y) << endl;
    x.wref() += "123"; // "wref" for write access ...
    cout << "*x --> " << *x << endl;
    cout << "*y --> " << *y << endl;
    cout << "shared = " << (&*x == &*y) << endl;
}

which is supposed to output:

*x --> hello
*y --> hello
shared = 1
*x --> hello123
*y --> hello
shared = 0

Without the polymorphism requirement this would be fairly easy to
achieve. But I also tried to make this code:

struct base
{
private:
    virtual void greet_once() const
    { cout << "hello from base\n"; }
public:
    void greet() const
    { for (int k=0; k<repeat; ++k) greet_once(); }
    int repeat;
};

struct derived : base
{
private:
    void greet_once() const
    { cout << "hello from derived\n"; }
public:
    ~derived()
    { cout << "this is ~derived speaking\n"; }
};

void test_polymorphism()
{
    cow<base> x = derived();
    x.wref().repeat = 3;
    x->greet();
}

to output the following text:

this is ~derived speaking
hello from derived
hello from derived
hello from derived
this is ~derived speaking

where the first "~derived" message is due to the temporary in the copy-
initialization and the 2nd one comes from the copy of derived hidden
behind a type-agnostic abstract wrapper.


Cheers!
SG
 
Alf P. Steinbach

* SG:
I should add that I expected such a conversion not to create a new
copy (of the "pointee")

You can have value semantics or reference semantics but not both.

Choose poison! :)

But one approach that I've investigated is the smart pointer that takes
responsibility for object creation, not just destruction; that is, you provide
the construction arguments to the smart pointer, which delegates. My conclusion
was that as of C++98 it's doable but somewhat awful: lots of macro stuff. And
there's a notational problem regarding default and copy construction.

but just to "work" polymorphically (which is
probably too much to ask for).

A simple example would look like this:

void test_string()
{
cow<string> x = string("hello");
cow<string> y = x;
cout << "*x --> " << *x << endl;
cout << "*y --> " << *y << endl;
cout << "shared = " << (&*x == &*y) << endl;
x.wref() += "123"; // "wref" for write access ...
cout << "*x --> " << *x << endl;
cout << "*y --> " << *y << endl;
cout << "shared = " << (&*x == &*y) << endl;
}

which is supposed to output:

*x --> hello
*y --> hello
shared = 1
*x --> hello123
*y --> hello
shared = 0

Uhm, I like your approach, at least the idea. When I've done this[1] I've always
abstracted the machinery as a building block to be used internally by a class,
while you put it around, which is nicer. :) My main motivation has so far just
been to explore, though, and in particular having reference counting for an
array where the client code retains only a smart pointer to a part of it, such
as a pointer to a process argument (I believe an example is given in project [1]).

However, I didn't run into very much need for casting.

Without the polymorphism requirement this would be fairly easy to
achieve. But I also tried to make this code:

struct base
{
private:
virtual void greet_once() const
{ cout << "hello from base\n"; }
public:
void greet() const
{ for (int k=0; k<repeat; ++k) greet_once(); }
int repeat;
};

struct derived : base
{
private:
void greet_once() const
{ cout << "hello from derived\n"; }
public:
~derived()
{ cout << "this is ~derived speaking\n"; }
};

void test_polymorphism()
{
cow<base> x = derived();

For this to work that cow class would need a templated constructor that
dynamically allocated a clone of the derived argument.

x.wref().repeat = 3;
x->greet();
}

to output the following text:

this is ~derived speaking
hello from derived
hello from derived
hello from derived
this is ~derived speaking

where the first "~derived" message is due to the temporary in the copy-
initialization and the 2nd one comes from the copy of derived hidden
behind a type-agnostic abstract wrapper.

Yes, but what's the problem?


Cheers,

- Alf


Notes:
[1] E.g. search for "Alf's stringvalue" in SourceForge if interested. I
discovered later that design issues become thorny when adding into that mix a
vector replacement and support for intrusive ref counting for such vectors. I'll
probably take that up again but I just put it aside when it became "complex"...
 
SG

SG:


You can have value semantics or reference semantics but not both.
Choose poison! :)

I did. I'd like to think of it as "polymorphic value semantic". :)
But one approach that I've investigated is the smart pointer that takes
responsibility for object creation, not just destruction, that is, you provide
the construction arguments to the smart pointer, which delegates. Conclusion
that as of C++98 it's doable but somewhat awful  --  lots of macro stuff. And
there's a notational problem regarding default and copy construction.

Yeah. This should be a piece of cake with C++0x features. I was
thinking of a little helper function template that magically
constructs T for you without copying it:

cow said:
   void test_string()
   {
      cow<string> x = string("hello");
      cow<string> y = x;
      cout << "*x --> " << *x << endl;
      cout << "*y --> " << *y << endl;
      cout << "shared = " << (&*x == &*y) << endl;
      x.wref() += "123"; // "wref" for write access ...
      cout << "*x --> " << *x << endl;
      cout << "*y --> " << *y << endl;
      cout << "shared = " << (&*x == &*y) << endl;
   }
which is supposed to output:
   *x --> hello
   *y --> hello
   shared = 1
   *x --> hello123
   *y --> hello
   shared = 0

Uhm, I like your approach, at least the idea. When I've done this[1] I've always
abstracted the machinery as a building block to be used internally by a class,
while you put it around, which is nicer. :)

Heheh :) Thanks.
My main motivation has so far just
been to explore, though, and in particular having reference counting for an
array where the client code retains only a smart pointer to a part of it, such
as a pointer to a process argument (I believe example given in project [1]).

However, I didn't run into very much need for casting.

I did. It's because I wanted too much. :)
Even though it may not be 100% UB-free it works as intended on 32-bit
Linux with G++ 4.3.2.

Here's the source code (will probably expire after one month):
http://en.pastebin.ca/1392123
Without the polymorphism requirement this would be fairly easy to
achieve.  But I also tried to make this code:
[...]
For this to work that cow class would need a templated constructor that
dynamically allocated a clone of the derived argument.

Yes, correct. Well, it's not cloned (as in clone function being
invoked) but simply copy-constructed.
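
A stripped-down sketch of such a templated constructor, without reference counting or copy-on-write. All names here (holder_base, holder, poly_value, animal, bird) are hypothetical and not taken from the pastebin source.

```cpp
#include <cassert>

// Hypothetical type-erased owner; only destruction is virtual here.
struct holder_base
{
    virtual ~holder_base() {}
};

template<typename U>
struct holder : holder_base
{
    explicit holder(U const& u) : value(u) {}   // plain copy construction
    U value;
};

template<typename T>
class poly_value
{
public:
    // Templated constructor: the static type U of the argument is known
    // here, so the complete U object is copied and nothing is sliced.
    template<typename U>
    poly_value(U const& u) : own(0), ptr(0)
    {
        holder<U>* h = new holder<U>(u);
        own = h;
        ptr = &h->value;   // implicit U* -> T* conversion
    }

    ~poly_value() { delete own; }

    T const* operator->() const { return ptr; }

private:
    poly_value(poly_value const&);              // copying omitted in this sketch
    poly_value& operator=(poly_value const&);

    holder_base* own;   // owns the type-erased holder
    T* ptr;             // points at the value inside *own
};

struct animal { virtual ~animal() {} virtual int legs() const { return 0; } };
struct bird : animal { int legs() const { return 2; } };
```

Constructing a poly_value<animal> from a bird stores a full bird copy, so virtual calls through operator-> reach bird's overrides. The sketch uses direct initialization because the copy constructor is disabled.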
[...]
where the first "~derived" message is due to the temporary in the copy-
initialization and the 2nd one comes from the copy of derived hidden
behind a type-agnostic abstract wrapper.

Yes, but what's the problem?

I can't work around this ugly char* pointer arithmetic. Maybe you
can. If you can my hat is off to you. I was thinking of alternatives
but haven't come up with working ones so far.

For example, if C++ provided a feature (magic library function) that
lets me adjust a void pointer so that the following two pieces are
equivalent:

1. A* pa = ...;
   B* pb = pa; // A derives from B

2. void adjust(void *&, type_info const& from, type_info const& to);
   void adjust(void const*&, type_info const& from, type_info const& to);

   A* pa = ...;
   void* pvoid = pa;
   adjust(pvoid,typeid(A*),typeid(B*));
   B* pb = static_cast<B*>(pvoid);

then I wouldn't need this char* arithmetic anymore. The difference
between (1) and (2) is that (2) can be used with typeinfo objects that
are only known at runtime. I could use (2) because I don't know
anymore that typeid(*paw) is a concrete_wrapper<A> but I could get the
void* and typeinfo object through the type-agnostic abstract wrapper
interface.


Cheers!
SG
 
Alf P. Steinbach

* SG:
I can't work around this ugly char* pointer arithmetic. Maybe you
can. If you can my hat is off to you. I was thinking of alternatives
but haven't come up with working ones so far.

For example, if C++ provided a feature (magic library function) that
lets me adjust a void pointer so that the following two pieces are
equivalent:

1. A* pa = ...;
B* pb = pa; // A derives from B

2. void adjust(void *&, typeinfo const& from, typeinfo const&
to);
void adjust(void const*&, typeinfo const& from, typeinfo const&
to);

A* pa = ...;
void* pvoid = pa;
adjust(pvoid,typeid(A*),typeid(B*));
B* pb = static_cast<B*>(pvoid);

then I wouldn't need this char* arithmetic anymore. The difference
between (1) and (2) is that (2) can be used with typeinfo objects that
are only known at runtime. I could use (2) because I don't know
anymore that typeid(*paw) is a concrete_wrapper<A> but I could get the
void* and typeinfo object through the type-agnostic abstract wrapper
interface.

As I wrote I think redesign, so that the required information is directly
available, is better, and by that I mean simpler and more efficient.

But for the currently present pointer adjustment, have you considered a member
pointer, since, as your comment stated, it's a pointer to a member? ;-)

C++ member pointers are not really pointers: they're more like offsets,
automating the kind of trickery that you attempted to do via casts.

It won't work when the "member" is an element of an array, and you also need a
typed full object pointer.

But chances are that that's what you have.
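
To make the member-pointer idea concrete, here is a minimal sketch. payload and member_in are hypothetical names; as noted above, it needs a typed pointer or reference to the full object, which the type-erased wrapper in this thread does not directly provide.

```cpp
#include <cassert>

struct payload { int tag; int value; };

// A data member pointer names "which member", not "which object".
// Binding it to another object of the same type yields that object's
// member, with no casts and no byte arithmetic.
inline int* member_in(payload& obj, int payload::* which)
{
    return &(obj.*which);
}
```

Applied to an original object and its copy, the same member pointer gives two distinct addresses, one inside each object, which is the adjustment make_copy performs by hand.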


Cheers & hth.,

- Alf
 
SG

As I wrote I think redesign, so that the required information is directly
available, is better, and by that I mean simpler and more efficient.

But for the currently present pointer adjustment, have you considered a member
pointer, since, as your comment stated, it's a pointer to a member? ;-)

I don't think you appreciate the difficulty of the actual problem. In
order to support polymorphism I have to use a type-agnostic wrapper.
The communication to this wrapper is restricted to function calls that
don't mention any other types that depend on template parameters
("T"). In this case it's the virtual clone function (returning a
pointer to a non-template type "abstract_wrapper") and the virtual
destructor.

I could change the requirements and get away with something different,
though. But that wouldn't be as interesting / challenging. It
strikes me as odd that I'm able to get it to work by (possibly)
invoking UB but fail to do it in a standard-conforming way.

Thanks for taking interest, though. :)


Cheers!
SG
 
SG

Hi!

Just in case anyone cares, I made a UML-like sketch of what I was trying
to do.
http://img245.imageshack.us/img245/5633/cow.png

Basically, the polymorphism requirement forces me to "erase" the
dynamic type of *paw. This becomes a problem when I need to duplicate
the wrapper object and get a new pointer to the wrapper's member
object.

I think -- in case this pointer arithmetic invokes UB -- there's no
portable solution (that achieves the same observable behaviour and
doesn't result in a weaker cow<> interface).

btw: The earlier source code example contained a typo which leads to a
memory leak and may delete objects too early. Here's the fix:
http://en.pastebin.ca/1392625

Cheers!
SG
 
