Basic design question, about data structures

J

JSeb

Hi there gurus,

This is a design question. Hope I'll get some insights!

Trying to get back in shape with C++, I'm writing a nice BST class. So
I have a nice class TreeNode:

template <typename T> class TreeNode
{
private:
T* mpData;
const TreeNode<T>* mpLeftChild;
const TreeNode<T>* mpRightChild;

TreeNode(const T& aData)
: mpData(&aData), mpLeftChild(NULL), mpRightChild(NULL)
{ }

// ...
};

Type T can be a huuuuuge class, so when I initialize a node with data
of this type, I make mpData point directly to the original data. That
is, I don't call type T's copy constructor.

Q1: Is that the usual approach (avoid copy constructors) when writing
a data structure?

Actually, I initially wrote
const T* const mpData;
in the declaration above. This seemed a sensible thing to do, because
I don't want the data to change while it is in the tree. Otherwise,
the tree property might become violated.

But that didn't do it. Consider for instance:

MyType data = MyType();
TreeNode<MyType> node = TreeNode(data)
data.modify()

The last line modifies the content of data, so the tree (that will
eventually contain node) might become invalid...

Q2: How does one typically deal with that king of situation? It seems
we would want the content of the data structure to be "frozen", but
only while it is in the data structure...

I think I might have a couple more seemingly basic questions, but I'll
stop here and wait for now.

Thank you!
 
J

Jeff Schwab

JSeb said:
Hi there gurus,

This is a design question. Hope I'll get some insights!

Trying to get back in shape with C++, I'm writing a nice BST class. So
I have a nice class TreeNode:

template <typename T> class TreeNode
{
private:
T* mpData;
const TreeNode<T>* mpLeftChild;
const TreeNode<T>* mpRightChild;

TreeNode(const T& aData)
: mpData(&aData), mpLeftChild(NULL), mpRightChild(NULL)
{ }

// ...
};

Type T can be a huuuuuge class, so when I initialize a node with data
of this type, I make mpData point directly to the original data. That
is, I don't call type T's copy constructor.

Q1: Is that the usual approach (avoid copy constructors) when writing
a data structure?

Nope. Copy T directly. If copies are expensive, it's up to the client
Actually, I initially wrote
const T* const mpData;
in the declaration above. This seemed a sensible thing to do, because
I don't want the data to change while it is in the tree. Otherwise,
the tree property might become violated.

Yep, that gets tricky. A similar issue arose re. the std::set class:
http://std.dkuug.dk/jtc1/sc22/wg21/docs/lwg-defects.html#103
But that didn't do it. Consider for instance:

MyType data = MyType();
TreeNode<MyType> node = TreeNode(data)
data.modify()

The last line modifies the content of data, so the tree (that will
eventually contain node) might become invalid...

But that's because the pointers are const, and has nothing to do with
whether the payloads are constant. Rather than "T const* const",
consider "T const*". Since the payload-pointer (mpData) is private, you
have a chance to review all attempts to modify the pointer, and can
reject them (or update your tree) as necessary.
 
K

Kai-Uwe Bux

JSeb said:
Hi there gurus,

This is a design question. Hope I'll get some insights!

Trying to get back in shape with C++, I'm writing a nice BST class. So
I have a nice class TreeNode:

template <typename T> class TreeNode
{
private:
T* mpData;
const TreeNode<T>* mpLeftChild;
const TreeNode<T>* mpRightChild;

TreeNode(const T& aData)
: mpData(&aData), mpLeftChild(NULL), mpRightChild(NULL)
{ }

// ...
};

Type T can be a huuuuuge class, so when I initialize a node with data
of this type, I make mpData point directly to the original data. That
is, I don't call type T's copy constructor.

Q1: Is that the usual approach (avoid copy constructors) when writing
a data structure?

Not for container like classes. All standard containers copy the objects so
that they can destruct their contents properly without worrying. Also, your
code is more generic if you use T instead of T* internally: if you want
pointers, just instantiate the template like

Actually, I initially wrote
const T* const mpData;
in the declaration above. This seemed a sensible thing to do, because
I don't want the data to change while it is in the tree. Otherwise,
the tree property might become violated.

Well, then do

TreeNode said:
But that didn't do it. Consider for instance:

MyType data = MyType();
TreeNode<MyType> node = TreeNode(data)
data.modify()

The last line modifies the content of data, so the tree (that will
eventually contain node) might become invalid...

Huh? Invalid in what sense? Note:


int dummy = 5;
int const * ip = &dummy;
dummy = 6;

There is nothing invalid about ip after the last line. It will happily point
to the int dummy, which now has value 6. The const modifier does not render
it invalid to modify an non-const object, it only prevents modification
through the "access route" that was declared const.

Q2: How does one typically deal with that king of situation?

What kind of situation? It is not really clear what the underlying problem
is that you are trying to solve. Without background information, it is hard
to tell you what best practices would be.
It seems
we would want the content of the data structure to be "frozen", but
only while it is in the data structure...

Huh?


[snip]


Best

Kai-Uwe Bux
 
J

JSeb

Huh? Invalid in what sense? Note:
int dummy = 5;
int const * ip = &dummy;
dummy = 6;

I was referring to the fact that the tree property might get violated
if some of the data contained in the tree is modified (without the
tree being aware of it).

Thanks for the info. I appreciate it. So I'll copy the data into the
nodes, and consider using TreeNode<MyType const*> when the copy
constructor is impractical.

I appreciate the answers guys! Jeff:I will check the link you
provided. I'll come back with other stuff that make me scratch my
head.
 
J

Juha Nieminen

JSeb said:
template <typename T> class TreeNode
{
private:
T* mpData;
const TreeNode<T>* mpLeftChild;
const TreeNode<T>* mpRightChild;

TreeNode(const T& aData)
: mpData(&aData), mpLeftChild(NULL), mpRightChild(NULL)
{ }

// ...
};

There are many problems in storing just a pointer to the data. The
most prominent one is that it's very unsafe, as it requires the user to
be very careful about how he creates and stores the data. For instance,
giving the TreeNode constructor a temporary would cause a malfunction
(which the compiler probably doesn't even detect). Likewise giving it a
local instance of an object which will go out of scope sooner than the
tree will also cause malfunction.

Another problem is the one of responsibility: Who is responsible for
freeing the data, the user or this data container? There would be
advantages and disadvantages in both solutions. For example, if the
responsibility for freeing the data is left to the data container, it's
handier from the point of view of the user, and makes the code slightly
safer (as mistakes are less likely to happen). OTOH, it ties the hands
of the user: He *must* always allocate the data with 'new'. He can't,
for example, have an array of the data and give pointers to the data
elements to the tree (because then the tree would try to delete the
individual elements of the data, which is undefined behavior, afaik).
OTOH, if managing the data is left to the user then it becomes more
error-prone, as the user must be very careful when and how he deletes
the data.
Using smart pointers can alleviate this problem a bit, but OTOH it
would still be impossible to give temporaries or elements of an array to
the data container.

And of course there's the slight problem of memory usage (although
it's not such a big problem with a binary tree, as it's already using
quite a log of ancillary memory for each node): If you force the user to
allocate each object with 'new', that consumes additional RAM, compared
to the case where the object is directly stored in the tree node.

The approach of the STL containers is that they store copies of the
data objects in the data structure elements. It solves most
responsibility problems and in many cases it's much more
memory-efficient (std::vector and std::deque being good examples), and
it allows giving temporaries and local objects to the container. If
copying the objects is very heavy, the responsibility of making it
lighter is left to the user. (He can, for example, make the STL
container store pointers instead of instances if he wants.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,755
Messages
2,569,534
Members
45,007
Latest member
obedient dusk

Latest Threads

Top