"array of Derived" is not a kind-of "array of Base" question

Joseph Turian

Fellow hackers,

I have a class BuildNode that inherits from class Node.

Similarly, I have a class BuildTree that inherits from class Tree.

Tree includes a member variable:
vector<Node> nodes; // For clarity, let this be "orig_nodes"

BuildTree includes a member variable:
vector<BuildNode> nodes; // For clarity, let this be "build_nodes"

So as far as I can tell, this overloads "nodes", defining two different
"nodes" variables instead of having build_nodes clobber orig_nodes (as
was my intention).

Now, let's say I have a Tree method defined:
bool Tree::is_degenerate const { return nodes.empty(); }

If I have a BuildTree object build_tree, and I call
build_tree.is_degenerate(), it returns "orig_nodes.empty()", not
"build_nodes.empty()" as is desired.

So the question is:
Is there any way to define methods in Tree that reference "nodes", and
get those methods to act upon "build_nodes" if the calling object is a
BuildTree and "orig_nodes" if the calling object is a Tree, without
duplicating the code in BuildTree? [I was thinking maybe make a virtual
method "get_nodes()", which in a Tree object returns orig_nodes and in
a BuildTree object returns "build_nodes", and then call get_nodes()
whenever I would normally use "nodes" in Tree methods. Would this
work?]

I don't mind refactoring my code to get this right, if anyone can
suggest a good way to redesign these objects.

Sub-question: Having two "nodes" objects with the same name in a
BuildTree object is a really good way to shoot oneself in the foot.
(I'm surprised the code worked for so long before I found this issue
earlier today.) Is there a solution in which I can do away with this
name clash entirely?

Best,
JOSEPH
 
Victor Bazarov

Joseph said:
I have a class BuildNode that inherits from class Node.

Similarly, I have a class BuildTree that inherits from class Tree.

Tree includes a member variable:
vector<Node> nodes; // For clarity, let this be "orig_nodes"

BuildTree includes a member variable:
vector<BuildNode> nodes; // For clarity, let this be "build_nodes"

So as far as I can tell, this overloads "nodes", defining two different
"nodes" variables instead of having build_nodes clobber orig_nodes (as
was my intention).

I don't think I heard the term "clobber" in relation to C++ constructs.
"Overload", "override", and "hide" is what members do to each other.
Now, let's say I have a Tree method defined:
bool Tree::is_degenerate const { return nodes.empty(); }

bool Tree::is_degenerate() const ...
If I have a BuildTree object build_tree, and I call
build_tree.is_degenerate(), it returns "orig_nodes.empty()", not
"build_nodes.empty()" as is desired.
Right...
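[Editor's sketch: the behaviour described above can be reproduced in a few lines. The class and member names come from the post; the empty Node and BuildNode bodies and the helper function are assumptions made only to keep the example small.]

```cpp
#include <cassert>
#include <vector>

struct Node {};               // placeholder body (assumption)
struct BuildNode : Node {};   // placeholder body (assumption)

struct Tree {
    std::vector<Node> nodes;  // "orig_nodes"
    // Name lookup happens in Tree, so this always means Tree::nodes.
    bool is_degenerate() const { return nodes.empty(); }
};

struct BuildTree : Tree {
    std::vector<BuildNode> nodes;  // "build_nodes": hides Tree::nodes
};

// Hypothetical helper: fills build_nodes only, then asks the
// inherited method, which still consults the empty orig_nodes.
bool degenerate_despite_build_nodes() {
    BuildTree build_tree;
    build_tree.nodes.push_back(BuildNode());
    return build_tree.is_degenerate();
}
```

The helper returns true even though build_nodes holds one element, because data members are never virtual: Tree's methods only ever see Tree's own member.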

So the question is:
Is there any way to define methods in Tree that reference "nodes", and
get those methods to act upon "build_nodes" if the calling object is a
BuildTree and "orig_nodes" if the calling object is a Tree, without
duplicating the code in BuildTree? [I was thinking maybe make a virtual
method "get_nodes()", which in a Tree object returns orig_nodes and in
a BuildTree object returns "build_nodes", and then call get_nodes()
whenever I would normally use "nodes" in Tree methods. Would this
work?]

Yes, a virtual function would work. However, any time a container or the
like comes up, a template comes to my mind. So, a question to you: why
not make 'Tree' a template? Yes, you'd have to make 'Node' a template as
well.
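[Editor's sketch of the virtual-function route. One caveat: a virtual get_nodes() cannot simply return the vector by reference, because vector<BuildNode> is not a kind of vector<Node>. The version below instead assumes two hypothetical virtual accessors, node_count() and node_at(), plus hypothetical add_node()/add_build_node() helpers; none of these names appear in the thread.]

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    virtual ~Node() {}
};
struct BuildNode : Node {};

class Tree {
public:
    virtual ~Tree() {}
    // Shared logic is written once, in terms of the virtual accessors.
    bool is_degenerate() const { return node_count() == 0; }
    void add_node() { orig_nodes.push_back(Node()); }
protected:
    virtual std::size_t node_count() const { return orig_nodes.size(); }
    virtual const Node& node_at(std::size_t i) const { return orig_nodes.at(i); }
private:
    std::vector<Node> orig_nodes;
};

class BuildTree : public Tree {
public:
    void add_build_node() { build_nodes.push_back(BuildNode()); }
protected:
    // Overrides redirect the shared logic to build_nodes.
    std::size_t node_count() const { return build_nodes.size(); }
    const Node& node_at(std::size_t i) const { return build_nodes.at(i); }
private:
    std::vector<BuildNode> build_nodes;
};
```

Returning const Node& works for both trees because a BuildNode is a Node. Note that a BuildTree still carries an unused orig_nodes vector in this layout.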
I don't mind refactoring my code to get this right, if anyone can
suggest a good way to redesign these objects.

How are they used? Without knowing that, how can one suggest a decent
design?
Sub-question: Having two "nodes" objects with the same name in a
BuildTree object is a really good way to shoot oneself in the foot.
(I'm surprised the code worked for so long before I found this issue
earlier today.) Is there a solution in which I can do away with this
name clash entirely?

Not sure what you mean. There is no name clash. BuildTree's 'nodes'
_hides_ Tree's 'nodes'. Inside BuildTree the base class' 'nodes' does
not exist, essentially (except that it takes up some space).

Victor
 
Greg Schmidt

Fellow hackers,

I have a class BuildNode that inherits from class Node.

Similarly, I have a class BuildTree that inherits from class Tree.

Tree includes a member variable:
vector<Node> nodes; // For clarity, let this be "orig_nodes"

BuildTree includes a member variable:
vector<BuildNode> nodes; // For clarity, let this be "build_nodes"

So as far as I can tell, this overloads "nodes", defining two different
"nodes" variables instead of having build_nodes clobber orig_nodes (as
was my intention).

Is there a reason why the base class can't have a vector of pointers?
vector<Node*> nodes;
would allow you to store (pointers to) objects of either Node or
BuildNode type. Of course, something like
vector< boost::shared_ptr<Node> > nodes;
makes memory management easier.
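[Editor's sketch of the single pointer vector. It assumes C++11 std::shared_ptr as a stand-in for boost::shared_ptr, and the can_split() and add_leaf() members are hypothetical, added only to show a virtual call reaching the right class.]

```cpp
#include <cassert>
#include <memory>
#include <vector>

struct Node {
    virtual ~Node() {}  // lets a BuildNode be destroyed safely through a Node*
    virtual bool can_split() const { return false; }  // hypothetical
};
struct BuildNode : Node {
    bool can_split() const { return true; }
};

struct Tree {
    // One vector holds (pointers to) both kinds of node.
    std::vector<std::shared_ptr<Node> > nodes;
    bool is_degenerate() const { return nodes.empty(); }
};

struct BuildTree : Tree {
    // No second "nodes" member, so nothing is hidden.
    void add_leaf() { nodes.push_back(std::make_shared<BuildNode>()); }
};
```

With only one nodes member, Tree::is_degenerate() gives the expected answer for a BuildTree as well.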

Alternately, depending on your needs, making Tree a template class might
work for you.
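[Editor's sketch of the template route, under the assumption that the node type is the only thing that differs between the two trees. BasicTree and add() are hypothetical names.]

```cpp
#include <cassert>
#include <vector>

struct Node {};
struct BuildNode : Node {};

// Shared tree logic, parameterised on the node type.
template <typename NodeT>
class BasicTree {
public:
    bool is_degenerate() const { return nodes.empty(); }
    void add(const NodeT& n) { nodes.push_back(n); }
protected:
    std::vector<NodeT> nodes;  // exactly one vector per tree type
};

typedef BasicTree<Node> Tree;

// Build-only operations would be added here.
class BuildTree : public BasicTree<BuildNode> {};
```

The trade-off is that BuildTree is no longer derived from Tree, so a BuildTree cannot be passed where a Tree& is expected.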
Now, let's say I have a Tree method defined:
bool Tree::is_degenerate const { return nodes.empty(); }

If I have a BuildTree object build_tree, and I call
build_tree.is_degenerate(), it returns "orig_nodes.empty()", not
"build_nodes.empty()" as is desired.

That certainly sounds like it's doing what it should, according to the
standard.
So the question is:
Is there any way to define methods in Tree that reference "nodes", and
get those methods to act upon "build_nodes" if the calling object is a
BuildTree and "orig_nodes" if the calling object is a Tree, without
duplicating the code in BuildTree? [I was thinking maybe make a virtual
method "get_nodes()", which in a Tree object returns orig_nodes and in
a BuildTree object returns "build_nodes", and then call get_nodes()
whenever I would normally use "nodes" in Tree methods. Would this
work?]

There are probably lots of hackish ways of getting this to work with the
two separate nodes variables. All of them are likely to cause you a
problem somewhere down the line when you (or someone else working on the
code) forget to use the workaround. Best to get the design right in the
first place, so that it does the "natural" thing.
I don't mind refactoring my code to get this right, if anyone can
suggest a good way to redesign these objects.

Without knowing more about the problem you're trying to solve, it's very
hard to give concrete suggestions. One of the two general ideas above
might work, but I can easily think of cases where neither would be
useful.
 
Mike Wahler

Victor Bazarov said:
I don't think I heard the term "clobber" in relation to C++ constructs.
"Overload", "override", and "hide" is what members do to each other.

I've seen (and been amused by) it used with C: The manual for a database
library, in descriptions of several functions warned: "Be sure you
allocate sufficient buffer space, or memory will be clobbered".
I would have probably had it say 'overwritten', but hey, I got
the message. That was probably the only software-related document
in existence that used the term 'clobber' dozens of times. :)

-Mike
 
Joseph Turian

Victor,

Thank you for your reply.
I don't think I heard the term "clobber" in relation to C++ constructs.
"Overload", "override", and "hide" is what members do to each other.

Yes, "override" is the term I was looking for.
Yes, a virtual function would work. However, any time a container or the
like comes up, a template comes to my mind. So, a question to you: why
not make 'Tree' a template? Yes, you'd have to make 'Node' a template as
well.

I'm not sure, I hadn't considered that option.

The main objection is that making Tree and Node into templates is
misleading, since they aren't generic. BuildNode and BuildTree are the
only classes inheriting from them. [Perhaps this will be more clear
when I reply to your next question.]
How are they used? Without knowing that, how can one suggest a decent
design?

Without going into too much detail, I'll try to explain what I'm trying
to do:

A Tree is a decision tree that's already been constructed. The most
important operation is finding the confidence for an Example: Percolate
an Example down to a (leaf) Node, and return Node::confidence(). i.e.
An Example's confidence is given by the confidence of the (leaf) Node
down to which the Example percolates.

A BuildTree is a kind of Tree that implements a super-set of Tree
functionality. Essentially, it's a decision tree that's still being
built. Similarly, a BuildNode is a kind of Node that implements a
super-set of Node functionality. viz. A BuildNode includes a feature
vector, which is used to consider potential splits for this BuildNode.

Ideally, a BuildTree would include both Node objects (for internal
nodes) and BuildNode objects (for leaf nodes). Tree would only need
Nodes, since we aren't trying to split the leaves any more. (We only
need a feature vector in leaves that we are trying to split.)

Does this make sense?

Here's a slightly more involved Tree method:

/// Find the node that some example falls into.
const Node* Tree::find_leaf(const Example& e) const {
    assert(!nodes.empty());
    const Node* n = &nodes.at(root_node);
    const Node* newn;

    while (!n->is_leaf()) {
        if (e.has_feature(n->split())) {
            newn = &nodes.at(n->pos_child());
        } else {
            newn = &nodes.at(n->neg_child());
        }
        assert(newn->parent() == n->id());
        n = newn;
    }
    return n;
}
[Apologies if you don't see any indenting in the above function.]

To get the BuildTree equivalent of the above function:
s/Tree/BuildTree/g
s/Node/BuildNode/g

Best,
Joseph
 
Joseph Turian

Admittedly, alarm bells were ringing in my head as I typed that,
since---regardless of C++ terminology---the term "clobber" is never
used in the context of being *desired* behavior.
But I had forgotten the correct term, so I'm grateful Victor proposed
some alternatives.

Joseph
 
Joseph Turian

Greg,
Is there a reason why the base class can't have a vector of pointers?
vector<Node*> nodes;
would allow you to store (pointers to) objects of either Node or
BuildNode type.

That seems like it would work. Do you see any limitations to this
approach?

If I have a Node* which I know is a BuildNode* (because it's contained
in a BuildTree), then can I call a BuildNode method from this pointer
without casting the Node* to BuildNode*? (The C++ FAQ 27.10 says
pointer casts are evil, so I'm wondering if there's an alternative.)
Of course, something like
vector< boost::shared_ptr<Node> > nodes;
makes memory management easier.

True, except that since I know that there will be only a single pointer
to each Node (in the Tree object), memory-management is not a concern.
Hence, I'd rather not incur yet another dependency for successfully
compiling my code.

Joseph
 
Victor Bazarov

Joseph said:
[..]
Without going into too much detail, I'll try to explain what I'm trying
to do:

[...]
Does this make sense?

What I am not sure of is why do you need two pairs of classes while you
could (or so it seems) simply have a flag in the Tree object to indicate
that it's still building and in a Node object to indicate that it still
can split...

Essentially, when all Nodes freeze, so does the tree. And you need only
one mechanism to search for nodes...
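[Editor's sketch of the flag idea, as one possible reading of it. The member names splittable, freeze() and is_building() are hypothetical, and the tree-level state is derived from the nodes rather than stored in a separate flag.]

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {
    bool splittable;  // true while this node may still be split
    explicit Node(bool s = false) : splittable(s) {}
};

class Tree {
public:
    void add(const Node& n) { nodes.push_back(n); }
    void freeze(std::size_t i) { nodes.at(i).splittable = false; }
    bool is_degenerate() const { return nodes.empty(); }
    // The tree counts as "still building" while any node can split.
    bool is_building() const {
        for (std::size_t i = 0; i < nodes.size(); ++i)
            if (nodes[i].splittable) return true;
        return false;
    }
private:
    std::vector<Node> nodes;  // a single node type and a single vector
};
```

This sketch leaves out where the feature vector of a splittable node would live; presumably every Node would carry an optional one.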

Victor
 
Greg Schmidt

Greg,


That seems like it would work. Do you see any limitations to this
approach?

Well, if you wanted to make sure that BuildNodes only ever go into
BuildTrees, then you have to do some extra checking that your original
method wouldn't. For any design idea, there will always be some
pathological case where that idea doesn't work. This particular
solution isn't one where you would have to look far for such a case, but
if it works for your case, then it is a very simple method.
If I have a Node* which I know is a BuildNode* (because it's contained
in a BuildTree), then can I call a BuildNode method from this pointer
without casting the Node* to BuildNode*? (The C++ FAQ 27.10 says
pointer casts are evil, so I'm wondering if there's an alternative.)

Well, to get from a Node* to a BuildNode*, some cast will be required.
Doing it safely is what dynamic_cast is all about. If the dynamic_cast
succeeds, then you know you have a BuildNode. Otherwise, you know it's
just a normal Node.
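[Editor's sketch of the checked downcast. feature_count() and the free function are hypothetical; dynamic_cast itself requires Node to have at least one virtual function, here the destructor.]

```cpp
#include <cassert>

struct Node {
    virtual ~Node() {}  // makes Node polymorphic, as dynamic_cast needs
};
struct BuildNode : Node {
    int feature_count() const { return 3; }  // hypothetical BuildNode-only method
};

// Returns the feature count for a BuildNode, or -1 for a plain Node.
int feature_count_or_minus_one(const Node* n) {
    const BuildNode* b = dynamic_cast<const BuildNode*>(n);
    return b ? b->feature_count() : -1;  // null pointer means "not a BuildNode"
}
```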
True, except that since I know that there will be only a single pointer
to each Node (in the Tree object), memory-management is not a concern.
Hence, I'd rather not incur yet another dependency for successfully
compiling my code.

Well, you still have to delete them eventually. Use of an appropriate
smart pointer class (which one is best depends on the situation, of
course) saves you that bother. Plus, you can learn how to use them now
when it's not critical, and you'll like them so much you'll never use
normal pointers again, which will save you agony eventually. I just
spent the better part of two days tracking down a bug that would never
have happened if the original author had used a reference counting
pointer class instead of a raw pointer.
 
msalters

....
There is no name clash. BuildTree's 'nodes'
_hides_ Tree's 'nodes'. Inside BuildTree the base class' 'nodes' does
not exist, essentially (except that it takes up some space).

That's a bit too strong. Inside members of BuildTree the expression
(nodes) or equivalently this->nodes refers to BuildTree::nodes, but
the parent member nodes can be accessed as Tree::nodes.

This trick of explicitly naming the base class in which to look
will work until you start doing weird stuff with Multiple Inheritance
from a single base class.
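[Editor's sketch of the qualified-name access described above. The two counting methods are hypothetical, added only to show which member each expression names.]

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Node {};
struct BuildNode : Node {};

struct Tree {
    std::vector<Node> nodes;
};

struct BuildTree : Tree {
    std::vector<BuildNode> nodes;
    // Unqualified "nodes" finds BuildTree::nodes first.
    std::size_t build_count() const { return nodes.size(); }
    // Qualifying with the base class reaches the hidden member.
    std::size_t orig_count() const { return Tree::nodes.size(); }
};
```

From outside the class the same qualification works as build_tree.Tree::nodes.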

HTH,
Michiel Salters
 
Joseph Turian

Greg,
Plus, you can learn how to use them now
when it's not critical, and you'll like them so much you'll never use
normal pointers again, which will save you agony eventually.

This is sound advice, which I will heed.
In thinking back upon some of the code I've written these past months,
reference counting pointers would have saved me a lot of time.
Joseph
 
