Help with fast tree-like structure

David Hirschfield · Jan 25, 2006

I've written a tree-like data structure that stores arbitrary python
objects. The objective was for the tree structure to allow any number of
children per node, and any number of root nodes...and for it to be
speedy for trees with thousands of nodes.

At its core, the structure is just a list of lists arranged so that if
node A has children B and C and node B has child D the data looks like:

A = [<A node data>, B, C]
B = [<B node data>, D]
C = [<C node data>]

where B, C and D are all lists with similar structures to A. I am
holding references to the individual nodes so that access to individual
nodes by reference is quick. Access by "tree path" is done by giving a
tuple of integers indicating where in the tree the node you want lies.
The path (1,2,5) indicates the 6th child of the 3rd child of the 2nd
root node. Everything involving basic tree functions works fine. Finding
any particular node this way is just a function of the depth of the node
in the tree, so it's very quick unless you have some degenerate tree
structure where nodes end up miles deep.

Here's my problem: I need to allow the tree to "hide" the roots, so that
the top-level appears to the outside world to be the children under the
root nodes, not the root nodes themselves. That means a path of (5,2)
indicates the 3rd child of the 6th "pseudo-root" node. Unfortunately, in
a tree with many root nodes, each containing many children (a common use
case for me) finding the 6th pseudo-root is going to be slow, and the
ways I've thought of to make it fast all require slow bookkeeping when
new nodes are inserted or removed at the pseudo-root level.

I'm running out of ideas, so if anyone has any thoughts on how to deal
with fudging which nodes are the roots efficiently...I'm all ears.
Thanks in advance,
-David

Lonnie Princehouse · Jan 26, 2006

B = [<B node data>, D]
C = [<C node data>]

Why store node data in the same list that contains the children? It
seems like the OO approach would be more extensible, e.g.

class Node:
def __init__(self):
self.children = [] # list of nodes
# store node data as attributes...

# If you wanted to, you could even implement the [] operator for the
exact same syntax:

def __getitem__(self, i):
return self.children

(Note that using slots will make the above class more efficient,
particularly for trees with many nodes)

Of course, that doesn't answer your question. If I've interpreted it
right, then what you're trying to do could be restated as "make
multiple nodes at the root tier appear to be a single node", the
implication of which is that finding the n-th child of this meta-node
is now going to take O(R) time, where R is the number of concealed
roots. You could make a new list of pseudo-roots everytime one is
inserted or deleted --- the book-keeping you refer to, probably --- and
that would make your searches constant time at the expense of inserts
and deletes which are ~linear w.r.t the total number of pseudo-roots.
Not really any way around it... the best approach will likely be
determined by frequency of search operations vs. frequency of
insert/deletes.

How can I structure the final array to meet the requirements of Bootstrap Tree View for building a tree in JavaScript?	1	Mar 29, 2024
Programing Challenge: Constructing a Tree Given Its Edges.	0	Jan 8, 2014
Tree structure	3	Jul 26, 2011
Tree data structure with: single, double and triple children optionalong with AVM data at each node	1	Nov 24, 2011
how to make a tree with randomly selected html tags from an array in python?	0	Mar 10, 2013
Need help finding Segmentation fault C++	0	Apr 16, 2022
Need help with bisection method	1	Oct 4, 2022
parsing nested unbounded XML fields with ElementTree	6	Nov 25, 2013

Help with fast tree-like structure

David Hirschfield

Lonnie Princehouse

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads