Problem with inheritance and arbitrary "features" support (via templates).


KRao78

I have the following C++ design problem and I would really appreciate
any suggestion/solution.
Please notice that my background is not in computer science, so I may
be missing some obvious solution.

The way I usually separate key components in the code is to define
interfaces via abstract classes and pure virtual functions.

Example1:

class B
{
public:
    virtual double f( double x ) = 0;
};

class D1 : public B
{
public:
    double f( double x ) const
    {
        return 0.0;
    }
};

class D2 : public B
{
public:
    double f( double x ) const
    {
        return 1.0;
    }
};

This way I can nicely separate interface from implementation.
This approach is also quite fast (and since what I am working on is a numerical library, this is important :p).

Now, the problem I am facing is the following.

I have a set of "functionalities" which can be summarized by the functions (defined below) f(), g(), and h().
Note that all these functions will in general differ in arguments and return types.

Suppose I have some code that expects a pointer to an object that implements the functionalities f() and g().
What I would like to do is to be able to pass something which has "more or equal" functionality, for example something which supports f(), g() and h().

To better explain myself, here is some code.
Please notice that instead of multiple inheritance I could have used a "nested inheritance" approach, like in boost::operators. The point here is that I will never have the case in which f() is the same as g(). All the features are different.
The problem is that in order to make this work I need to use reinterpret_cast as in the example below (so this is not really a solution):

Example2:


class F {
public:
    virtual double f( double x ) = 0;
};

class G {
public:
    virtual double g( double x ) = 0;
};

class H {
public:
    virtual double h( double x ) = 0;
};

class N {};

template<class T1, class T2=N, class T3=N>
class Feature : public T1, public T2, public T3
{
};

template<class T1, class T2>
class Feature<T1,T2,N> : public T1, public T2
{
};

template<class T1>
class Feature<T1,N,N> : public T1
{
};

//Supp for Supports/Implements
class SuppFandG : public Feature<F,G>
{
public:
    double f( double x ) { return 0.0; }
    double g( double x ) { return 1.0; }
};

class SuppFandH : public Feature<F,H>
{
public:
    double f( double x ) { return 0.0; }
    double h( double x ) { return 1.0; }
};

class SuppFandGandH : public Feature<F,G,H>
{
public:
    double f( double x ) { return 0.0; }
    double g( double x ) { return 1.0; }
    double h( double x ) { return 2.0; }
};

int main()
{
    Feature<F,G>*   featureFandGPtr;
    Feature<F,H>*   featureFandHPtr;
    Feature<H,F>*   featureHandFPtr;
    Feature<F,G,H>* featureFandGandHPtr;

    SuppFandGandH suppFandGandH;
    featureFandGandHPtr = &suppFandGandH;

    //featureFandGPtr = featureFandGandHPtr; //Illegal. static_cast illegal too.
    //The reason to do this is that I would like to pass a pointer to an object
    //of type Feature<F,G,H> to a function (or constructor) that expects
    //a pointer to Feature<F,G>.
    featureFandGPtr = reinterpret_cast< Feature<F,G>* >( featureFandGandHPtr );
    featureFandHPtr = reinterpret_cast< Feature<F,H>* >( featureFandGandHPtr );
    featureHandFPtr = reinterpret_cast< Feature<H,F>* >( featureFandGandHPtr );

    featureFandGPtr->f( 1.0 );
    featureFandGandHPtr->h( 1.0 );
}


Or I can try to construct an inheritance hierarchy by changing the definition of Feature, but the following example makes the Visual Studio 2008 Professional compiler crash, so I cannot test it.

Example 3:

//This will not work: Visual Studio 2008 Professional crashes.
template<class T1, class T2=N, class T3=N>
class Feature : public Feature<T1,T2>, public Feature<T1,T3>, public Feature<T2,T3>
{
};

template<class T1, class T2>
class Feature<T1,T2,N> : public Feature<T1>, public Feature<T2>
{
};

template<class T1>
class Feature<T1,N,N> : public T1
{
};

With this approach I still have the following problems:
1) Feature<F,G> is logically equivalent (for what I want to achieve) to Feature<G,F>, but their types are different.
This can however be solved by some fancy metaprogramming using the Boost MPL library (always "sort" the types), so for simplicity let's assume this is not a problem.

2) The problem of multiple bases, where I want to avoid virtual inheritance via virtual bases (performance penalty).
This is probably solvable by using-declarations inside the Feature specializations.

Still, I am not 100% sure I can make this work, and it will not scale well for a large number of features.
In fact the number of classes composing the hierarchy grows as the sum of binomial coefficients, i.e. 2^n - 1 for n features:
F -> F (1)
F,G -> FG, F, G (3)
F,G,H -> FGH, FG, GH, FH, F, G, H (7)

I would like to know if there is a solution to the design problem which satisfies the following conditions:

1) The code should have a runtime performance equivalent to Example 1.

2) I want to be able to easily specify some set of features and to "pass in" any pointer to an object that has this (and usually extra) functionality.

3) I want code that depends on features f() and g() not to require re-compiling whenever I consider a new feature h() somewhere else.

4) I do not want to template everything that wants to use such features (almost all the code). There should be some kind of "separation", see point 3.

Looking at the (numerical) libraries available, I have usually found two approaches:

1) Define a huge abstract base class B that has f(), g(), h(), ...
Problems: whenever I want to add a new feature z(), B has to be modified, everything needs to be re-compiled (even if this code does not care about z() at all), and all the existing implementations D1, D2, ... of B need to be modified (usually by having them throw an exception for z(), apart from the new implementation that supports z()).
The solution of enlarging B progressively when I need to add features is not a good one for the problem at hand, as the features f() and g() are really "as important" as h() and i(), and neither is "more basic" than the others.

2) Separate all the functionalities and use one pointer for each functionality.
However, this is cumbersome for the user (in most situations 4 or more pointers would have to be carried around), and for the problem at hand this approach is not optimal (here it really is one object that may or may not do something; in fact calling f() will modify the result obtained by g() and vice versa).
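
(For illustration, a minimal sketch of approach 2) using the F, G, H interfaces from Example 2; the function name here is invented:)

//With one interface class per functionality, every piece of client code
//ends up carrying one pointer per feature it uses.
void clientNeedingFandGandH( F* fPtr, G* gPtr, H* hPtr )
{
    fPtr->f( 1.0 );
    gPtr->g( 1.0 );
    hPtr->h( 1.0 );
}

//Call site, given SuppFandGandH from Example 2: the same object is passed
//three times, once per feature.
//  SuppFandGandH rich;
//  clientNeedingFandGandH( &rich, &rich, &rich );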

Thank you in advance for your help.

KRao.
 

Alf P. Steinbach

* KRao78:
I have the following C++ design problem and I would really appreciate
any suggestion/solution.
Please notice that my background is not in computer science, so I may
be missing some obvious solution.

The way I usually separate key components in the code is to define
interfaces via abstract classes and pure virtual functions.

Example1:

class B
{
public:
virtual double f( double x ) = 0;
};

class D1 : public B
{
public:
double f( double x ) const
{
return 0.0;
}
};

What's the point of hiding B::f in class D1 (which is still abstract)?

Hm, let's assume that was a typo.

But please, when posting code, copy and paste *working* code.

class D2 : public B
{
public:
double f( double x ) const
{
return 1.0;
}
};

This way I can nicely separate interface from implementation.
This approach is also quite fast (and as what I am working on is a
numerical library this is important :p).

Now, the problem I am facing is the following one.

I have a set of "functionalities" which can be summarized by functions
(defined below) f(), g(), and h()
Notice that all these functions will in general differ in arguments
and return types.

Suppose I have some code that expects a pointer to an object that
implements the functionalities f() and g().
What I would like to do is to being able to pass something which has
"more or equal" functionalities, for example something which supports f
(), g() and h().

To better explain myself here is some code.
Please notice that instead of multiple inheritance I could have used a "nested inheritance" approach, like in boost::operators. The point here is that I will never have the case in which f() is the same as g(). All the features are different.
The problem is that in order to make this work I need to use reinterpret_cast as in the example below (so this is not really a solution):

Example2:


class F {
public:
virtual double f( double x ) = 0;
};

class G {
public:
virtual double g( double x ) = 0;
};

class H {
public:
virtual double h( double x ) = 0;
};

class N {};

template<class T1, class T2=N, class T3=N>
class Feature : public T1 , public T2 , public T3
{
};

Again, please copy and paste *working* code.

You can't inherit directly twice or more from the same class.

template<class T1, class T2>
class Feature<T1,T2,N> : public T1, public T2
{
};

template<class T1>
class Feature<T1,N,N> : public T1
{
};

//Supp for Supports/Implements
class SuppFandG : public Feature<F,G>
{
public:
double f( double x ) { return 0.0; }
double g( double x ) { return 1.0; }
};

class SuppFandH : public Feature<F,H>
{
public:
double f( double x ) { return 0.0; }
double h( double x ) { return 1.0; }
};

class SuppFandGandH : public Feature<F,G,H>
{
public:
double f( double x ) { return 0.0; }
double g( double x ) { return 1.0; }
double h( double x ) { return 2.0; }
};

Here you're into combinatorial nightmare.

But you probably know that.

Perhaps that is the question.

int main()
{
Feature<F,G>* featureFandGPtr;
Feature<F,H>* featureFandHPtr;
Feature<H,F>* featureHandFPtr;
Feature<F,G,H>* featureFandGandHPtr;

SuppFandGandH suppFandGandH;
featureFandGandHPtr = &suppFandGandH;

//featureFandGPtr = featureFandGandHPtr; //Illegal. static_cast
illegal too.
//the reason to do this is that I would like to pass a pointer to an
object
//of type Feature<F,G,H> to a function (or constructor) that expects
a pointer to Feature<F,G>
featureFandGPtr = reinterpret_cast< Feature<F,G>* >
( featureFandGandHPtr );
featureFandHPtr = reinterpret_cast< Feature<F,H>* >
( featureFandGandHPtr );
featureHandFPtr = reinterpret_cast< Feature<H,F>* >
( featureFandGandHPtr );

featureFandGPtr->f( 1.0 );
featureFandGandHPtr->h( 1.0 );
}

Oh my. :)

Why not just remove those featureThisAndThat classes?

Remember: KISS - Keep It Simple, Stupid.

(You may want to Google or Wikipedia that principle.)

Or I can try to construct an inheritance hierarchy by changing the definition of Feature, but the following example makes the Visual Studio 2008 Professional compiler crash, so I cannot test it.

Example 3:

//This will not work, Visual studio 2008 professional crash.
template<class T1, class T2=N, class T3=N>
class Feature : public Feature<T1,T2> , public Feature<T1,T3> , public
Feature<T2,T3>
{
};

template<class T1, class T2>
class Feature<T1,T2,N> : public Feature<T1>, public Feature<T2>
{
};

template<class T1>
class Feature<T1,N,N> : public T1
{
};

What do you want the Feature... classes *for*?

With this approach I still have the problems
1) Feature<F,G> is logically equivalent (for what I want to achieve)
to Feature<G,F> but their types are different.
This can however be solved by some fancy metaprogramming using the MPL
boost library (always "sort" the types), so for simplicity let's
assume this is not a problem.

2) Problem of multiple bases, and I want to avoid virtual inheritance
via virtual bases (performance penalty).

Don't think about performance.

Let the compiler do that.

It's not that it's very much smarter, it may even do plain stupid things, but whatever it does, by deciding to trust that whatever it does is good enough *you* will be working smarter. <g>

This is probably solvable by using-declarations inside the Feature specializations.

Still, I am not 100% sure I can make this work, and it will not scale well for a large number of features.
In fact the number of classes composing the hierarchy grows as the sum of binomial coefficients, i.e. 2^n - 1 for n features:
F -> F (1)
F,G -> FG, F, G (3)
F,G,H -> FGH, FG, GH, FH, F, G, H (7)

Yes.


I would like to know if there is a solution to the design problem
which involves the following conditions:

1) The code should have a runtime performance equivalent to Example 1.

Then use example 1. After correcting it, of course.

2) I want to be able to specify easily some set of features and being
able to "pass in" any pointers to objects that have this (and usually
extra) functionality.
Huh?


3) I want code that depends on features f() and g() not to require re-
compiling whenever I consider a new feature h() somewhere else.

4) I do not want to template everything that want to use such features
(almost all the code). There should be some kind of "separation", see
point 3.

Looking in the (numerical) libraries I have usually found the two
approaches:

1) Define a huge abstract base class B that have f(),g(),h(),......
Problems: whenever I want to add a new feature z(), B has to be
modified, everything needs to be re-compiled (even if this code does
not care about z() at all), all the existing implementations D1,
D2,... of B needs to be modified (usually by having them throw an
exception for z() apart for the new implementation that supports z()).
The solution of enlarging B progressively when I need to add features
is not a good one for the problem at hand, as the featurs f() and g()
are really "as important" as h() and i(), an neither is "more basic"
then the others.

2) Separate all the functionalities and use one pointer for each
functionality.
However, this is cumbersome for the user (in most situations 4 or more
pointers would have to be carried around), and for the problem at hand
this approach is not optimal (here really it is 1 object that may or
may not do something, in fact calling f() will modify the result
obtained by g() and vice-versa).

Thank you in advance for your help.

You're into bad design and should primarily look at that.

It needs total revamping.

But as a technical solution for your current design, use one interface class per function f(), g(), h() (that's essentially your Example 2 without the template).
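
(For illustration, a minimal sketch of that suggestion, reusing F, G and SuppFandGandH from Example 2; the function name is invented here:)

//A consumer that needs f() and g() just takes the two interface bases by
//reference; no Feature<...> wrapper and no reinterpret_cast is needed.
double use_f_and_g( F& fImpl, G& gImpl, double x )
{
    return fImpl.f( x ) + gImpl.g( x );
}

//Call site: the same rich object binds to both parameters via ordinary
//derived-to-base conversions.
//  SuppFandGandH rich;
//  double r = use_f_and_g( rich, rich, 1.0 );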


Cheers & hth.,

- Alf
(Just deciding to answer some clc++ posts)
 

Vladimir Jovic

Alf said:
* KRao78:

What do you want the Feature... classes *for*?

Sounds like the example from "C++ Templates - The Complete Guide" by Vandevoorde and Josuttis, chapter 16, except they got it right ;)

I agree with the KISS principle.
 

KRao78

What's the point of hiding B::f in class D1 (which is still abstract)?

Hm, let's assume that was a typo.

But please, when posting code, copy and paste *working* code.

Yes, sorry, I apologise for the mistake (a const was missing).
The other code I posted was tested (the compiler did not complain and the executable produced the expected result).
Again, please copy and paste *working* code.

It works with the compiler I am using at the moment (Visual Studio Professional 2008).
Moreover (at least to my understanding), it never inherits from the same class more than once, as the specializations provided below are used when only one or two template arguments are specified by the user. When 3 template arguments are specified there is clearly no issue either (as long as the user does not use the same type twice, which is intended, as he should not do it).

template<class T1, class T2>
class Feature<T1,T2,N> : public T1, public T2
{};

template<class T1>
class Feature<T1,N,N> : public T1
{};

Here you're into combinatorial nightmare.

But you probably know that.

Perhaps that is the question.

Yes, I know that; how to solve the problem is in fact the question.
Or how to achieve a similar result with another design.
Why not just remove those featureThisAndThat classes?
[...]
What do you want the Feature... classes *for*?

I think it is probably better to explain the specific numerical problem at hand to clarify why I do need these classes.
This part of the library deals with the generation of samples (or vectors of samples) distributed according to some statistical distribution.
Now, the way it usually works is that we have a random number generator which only generates doubles between 0 and 1, from which, using different algorithms, we obtain samples from arbitrary distributions.

Say I have some code that requires the generation of Exponential and Gaussian samples.
Then, for the user, it makes perfect sense to be able to work with objects of type Rvg<Exponential,Gaussian>* rvgPtr (Rvg stands for Random Variate Generator) and use it like:

double sample = rvgPtr->gaussian( mu, sigma ); //where mu and sigma are doubles
double sample2 = rvgPtr->exponential( mu );

Conceptually we are dealing with a specific random variate generator
(an aggregate of algorithms and one random number generator), so
separating the Gaussian and Exponential "features" is confusing.
Moreover, as in most applications there is the need to work with multiple distributions, passing around a lot of pointers is seriously inconvenient. The previous version of the library did so, and I am rewriting it for a good reason.
In fact most of the libraries available just use the huge-abstract-base-class paradigm, which supports everything that may be needed, just to avoid having to pass around all these pointers.
For instance the GSL (GNU Scientific Library), written in C, uses only one pointer to pass around the random number generator and then defines functions that generate (according to fixed algorithms selected by function name) the samples from different distributions. Example:

double rng_gaussian_boxmuller( gsl_rng* r, double mu, double sigma ); //gaussian for the Gaussian distribution, boxmuller for the algorithm used

I wanted to achieve the same usable syntax while allowing for more generality (changing the algorithm transparently from the code that uses it), but without having to template all my code to use loosely defined interfaces.
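
(For concreteness, a rough sketch of that GSL-like shape in C++ terms; the class and function names here are purely illustrative, and Box-Muller is just one of the possible algorithms:)

#include <cmath>

//One abstract uniform generator, plus ordinary functions per distribution;
//as in the GSL example above, the algorithm is fixed by the function name.
class UniformRng {
public:
    virtual double next01() = 0; //uniform double in (0,1)
    virtual ~UniformRng() {}
};

//Box-Muller transform: two uniform samples in, one Gaussian sample out.
inline double gaussian_boxmuller( UniformRng& r, double mu, double sigma )
{
    const double u1 = r.next01();
    const double u2 = r.next01();
    const double pi = 3.14159265358979323846;
    return mu + sigma * std::sqrt( -2.0 * std::log( u1 ) ) * std::cos( 2.0 * pi * u2 );
}
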
Don't think about performance.

Let the compiler do that.

It's not that it's very much smarter, it may even do plain stupid things, but
whatever it does, by deciding to trust that whatever it does is good enough
*you* will be working smarter. <g>

I am perfectly aware of the principle of writing good/elegant (correct :p) code first and optimizing it later (if needed, and after profiling, etc. etc.).
In fact I have done this: I profiled the code from a previous version of the library, and I found this to be a critical point where most of the run-time performance can be lost.
The reason is that random number generation has become really fast (see for instance CUDA on the latest Nvidia GPU cards; we are talking of hundreds of millions of random numbers/sec) and you do not want these calls to be the bottleneck of your simulations.
So, given the requirement of interface/implementation separation, I need to select the most efficient solution.
Then use example 1. After correcting it, of course.

I already explained that passing multiple pointers around manually is doable but inconvenient.

I think we agree that, "logically" speaking, you can expect something that is able to do A, B and C to be able to do A and B too, right?
You're into bad design and should primarily look at that.

That is why I was asking for suggestions.
And I am trying to keep everything as simple as possible (for the user).

I still think that having to deal with Rvg<Poisson,Gaussian,Exponential>* is more intuitive (and more convenient, and closer to what is really happening) than three separate pointers.

Given this further insight, do you have a design suggestion?

At the moment I am considering working with a container storing one pointer per feature, but making this invisible to the user.
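
(Roughly, as a hedged sketch of that idea; the class name is invented, and F and G are the single-function interfaces from Example 2:)

//A facade that internally stores one pointer per feature but is constructed
//from a single object implementing all of them, so the user carries only one
//handle. The templated constructor accepts anything deriving from both F and G.
class NeedsFandG
{
public:
    template<class T>
    explicit NeedsFandG( T& impl )
        : myF( &impl ), myG( &impl )
    {
    }

    double f( double x ) { return myF->f( x ); }
    double g( double x ) { return myG->g( x ); }

private:
    F* myF;
    G* myG;
};

//Usage: SuppFandGandH rich; NeedsFandG fg( rich ); fg.f( 1.0 );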

Cheers
KRao
 

KRao78

Sounds like the example from the "c++ templates - the complete guide" by
Vandevoorde and Josuttis, chapter 16, except they got it right ;)

That is useful, I will have a look, thank you.
I agree with KISS principle

Please see my reply above.

Cheers,
KRao
 

Alf P. Steinbach

* KRao78:
It works with the compiler I am using at the moment (Visual Studio
Professional 2008).
Nope.


Moreover (at least to my understanding), it never inherits from the
same object more then once, as the specializations provided below are
used when only one or two template arguments are specified by the
user.

If the above is never instantiated as-is, then don't define N, don't use
defaults on the template parameters, and don't define the class. Like

template< class T1, class T2, class T3>
class Feature;

That's it.


[snip]
I think it is probably better to explain the specific numerical
problem at hand to clarify why I do need these classes.
This part of the library deals with the generations of samples (or
vector of samples) distributed according to some statistical
distribution.
Now, the way it usually works is that we have a random number
generator which only generates doubles from 0 and 1 from which, using
different algorithms, we obtain samples from arbitrary distributions.

Say I have some code that requires the generation of Exponential and
Gaussian samples.
Then, for the user, it makes perfect sense to be able to work with
objects of type
Rvg<Exponential,Gaussian>* rvgPtr (Rvg stands for Random Variate
Generator) and use it like:

double sample = rvgPtr->gaussian( mu , sigma ); //Where mu and sigma
are doubles
double sample2 = rvgPtr->exponential( mu );

Conceptually we are dealing with a specific random variate generator
(an aggregate of algorithms and one random number generator), so
separating the Gaussian and Exponential "features" is confusing.
Moreover, as in most applications there is the need to work with
multiple distributions, passing around lot of pointers is seriously
inconvenient. The previous version of the library did so and I am
rewriting it for a good reason.

It's unclear where you get all those pointers from.

Are you allocating random number generators dynamically?

To pass an object by reference, use a C++ reference, not a pointer.

Anyways, at the design level a random number generator is one thing, and a
distribution is another thing that *uses* an rng.

So that means something like

class RandomVariateGenerator
{
public:
    virtual double next() = 0;
};

class Gaussian:
    public RandomVariateGenerator
{
private:
    Random myRng;
public:
    double next() { ... }
};

It's as simple as that.

Although as I recall in C++0x the designers managed to mess up even something
this simple. They were thinking in mathematical terms, not in practical terms.
Sort of like hiring a chemist as a chef because she's awfully good with
chemistry, and hey, cooking involves chemistry, right?
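
(For concreteness, one possible way to fill in the parts elided in the sketch above, using C++11's <random>; the member types are guesses for illustration, not Alf's code:)

#include <random>

class RandomVariateGenerator
{
public:
    virtual double next() = 0;
    virtual ~RandomVariateGenerator() {}
};

class Gaussian : public RandomVariateGenerator
{
private:
    std::mt19937 myRng;                      //stands in for "Random myRng"
    std::normal_distribution<double> myDist; //the distribution *uses* the rng
public:
    Gaussian( double mu, double sigma ) : myRng(), myDist( mu, sigma ) {}
    double next() { return myDist( myRng ); }
};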


Cheers & hth.,

- Alf
 

KRao78


Have you tested this?
If the above is never instantiated as-is, then don't define N, don't use
defaults on the template parameters, and don't define the class. Like

   template< class T1, class T2, class T3>
   class Feature;

That's it.

No, if the user specifies all three template parameters then we need inheritance from all three classes T1, T2, T3, and this does not cause any issues.
The default argument is needed so that:
Feature<Gaussian,Exponential,Poisson> -> the primary template Feature<T1,T2,T3> is used (which inherits from T1, T2, T3)
Feature<Gaussian,Exponential> -> the specialization Feature<T1,T2,N> is used (which inherits from T1, T2)
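
(To make the selection rule concrete, a small compile-time check one could add, using C++11's <type_traits> and the F, G, N classes from Example 2; this is only an illustration:)

#include <type_traits>

//Feature<F,G> really means Feature<F,G,N>, which matches the two-parameter
//partial specialization, so it inherits from F and G but never from N.
static_assert( std::is_base_of<F, Feature<F,G> >::value, "F is a base of Feature<F,G>" );
static_assert( std::is_base_of<G, Feature<F,G> >::value, "G is a base of Feature<F,G>" );
static_assert( !std::is_base_of<N, Feature<F,G> >::value, "N is never inherited" );
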
It's unclear where you get all those pointers from.

Are you allocating random number generators dynamically?

Yes. I am going to use smart pointers with them.
But really I will just select one object (among the possible candidates) that generates samples according to different distributions.
To pass an object by reference, use a C++ reference, not a pointer.

The example above was from the GSL library, which is in C.
As stated, my intention was to provide a similar interface, with minimal performance impact and greater generality.
Anyways, at the design level a random number generator is one thing, and a
distribution is another thing that *uses* an rng.

So that means something like

    class RandomVariateGenerator
    {
    public:
        virtual double next() = 0;
    };

    class Gaussian:
        public RandomVariateGenerator
    {
    private:
        Random   myRng;
    public:
        double next() { ... }
    };

It's as simple as that.

But it's not that simple.
For example, the CUDA random variate generator just consists of some functions that return samples from distributions (the connection between random number generators and random variate generators is not exposed, in the sense that there is no such thing as a function which takes in random numbers and returns random variates).

And anyway, using what you are proposing I would have to work with multiple pointers, which it is my intention to avoid.
This is why I made the original post.
Although as I recall in C++0x the designers managed to mess up even something
this simple. They were thinking in mathematical terms, not in practical terms.
Sort of like hiring a chemist as a chef because she's awfully good with
chemistry, and hey, cooking involves chemistry, right?

As my library will be primarily used by statisticians/mathematicians, not by expert C++ programmers, I think it makes sense to write it in such a way that the target users find it intuitive.
I think that something along the lines of the GSL library interface is what I should aim for.
 

Alf P. Steinbach

* KRao78:
Have you tested this?

Of course not.

It's invalid code when you instantiate it.

You think it's working because you haven't actually instantiated it.

No, if the user specifies all three template parameters then we need
the inheritance to all three classes T1, T2, T3 and this does not
cause any issues.

For that you need inheritance from all three classes, not defaults on the
template parameters.

Try this.

Keep in mind that when you code you're creating a fine machine. You wouldn't create a machine by throwing in parts higgledy-piggledy, now would you? Don't do that when programming, either.

The default argument is needed so that:
Feature<Gaussian,Exponential,Poisson> -> the primary template Feature<T1,T2,T3> is used (which inherits from T1, T2, T3)
Feature<Gaussian,Exponential> -> the specialization Feature<T1,T2,N> is used (which inherits from T1, T2)


Yes. I am going to use Smart Pointers with them.

In that case the appearance of pointers doesn't have much to do with your classes.

It has to do with your decision to allocate dynamically, possibly for the
purpose of sharing.

But what is the purpose of sharing them?

But really I will just select one object (between the possible
candidates) that generate samples accordig to different distributions.


The example above was from the GSL library which is in C.
As stated my intention was to provide a similar inteface, with minimal
performance impact and gretaer generality.


But it's not that simple.
Because for example the CUDA random variate generator just consistes
in some functions that returns samples from distributions (the
connection between the random number generators and random variate
generators is not exposed in the sense that there is no such thing as
a function wich takes in random numbers and returns random variates).

Sorry, that's not meaningful to me.

I respectfully submit that there's something fundamental that you're
overlooking. :)

And anyway using what you are proposing I would have to work with
multiple pointers, which is my intention to avoid.

Where on earth do you get this pointer stuff from?

Just forget it.

There's no need for it with what you've explained so far, just a random decision
to allocate things dynamically, which you generally don't have to.

This is why I made the original post.


As my library will be primarly used by statisticians/mathematicians,
not by expert C++ programmers, I think it makes sense to write it in
such a way that the target users find it intuitive.

I think that something among the lines the GSL library interface is
what I should aim for.

I'm unfamiliar with that so can't comment on its suitability.


Cheers & hth.,

- Alf
 
