[Followups set to c.p.]
I would like opinions on whether this solution could be important and
whether anyone knows if it is already solved. (I would also like to
know if anyone thinks it cannot be solved.)
First, we start with a general tree (any of the trees that my data
structures book calls rooted trees or unordered trees). Then we are
given two specific nodes (Na and Nb) that belong to that tree.
We would like to know if the second node (Nb) is an ancestor of the
first node (Na).
Here's my attempt at a summary of what's been said in the various
subthreads, as much for my benefit as yours.
The problem doesn't seem terribly useful, although it is useful for
hierarchies, such as the example given in
http://www.barneyb.com/blog/archives/000570.jsp
food
/ \
meat fruit
| / \
beef apple pear
If we can solve the "is-ancestor-of" problem in O(1), then we can answer
questions such as "Is beef food?" and "Is an apple a fruit?" This could
be useful in typechecking object-oriented languages that use a lot of
subtyping and inheritance.
So, how can we implement this? Joe Celko's "nested sets" solution
looks like this:
1.food.12
/ \
2.meat.5 6.fruit.11
| / \
3.beef.4 7.apple.8 9.pear.10
Each node has two fields "left" and "right", such that
parent.left < child.left and child.right < parent.right. Then checking
that Na is a descendant of Nb simply means checking that
(Nb.left < Na.left && Na.right < Nb.right)
This solution is easy to implement, and deleting a node is easy, but
adding new nodes is naively an O(n) operation, since the "left" and
"right" fields of about half the tree's nodes need to be updated.
However, for the OO-typechecking application, lookups are going to be
much more common than inserts, so we win.
Googling "Joe Celko nested sets" does indeed turn up useful information
on this method.
Another method was proposed by Alf Steinbach in this thread: In each
node, instead of a parent pointer, store the path from the root itself,
represented as a sequence of "left" and "right" directions. For example,
the path stored in the "apple" node above would be "right,left". (Note
that this method only really works for binary trees, or with extensions,
N-ary trees. It doesn't seem applicable to general rooted trees.)
Then, checking whether Na is a descendant of Nb boils down to checking
whether Nb.path is a prefix of Na.path, which can be done in O(lg N) time
naively, and is probably possible in much less time using bitwise
arithmetic. (See below.)
As Alf points out, O(lg N) is basically O(1) on a machine with N-bit
addressing. (lg 32 is just a constant.) This means that the real-world
solution of simply storing a parent pointer in each node is basically
constant-time, too --- but the stored-path solution saves time if
dereferencing is expensive.
A third approach, which boils down to Alf's solution, is to notice that
iff Nb is an ancestor of Na, then the lowest common ancestor of Na and Nb
is Nb itself. Let's number the nodes in the order of an in-order traversal
on a complete binary tree, like this:
food.100
/ \
meat.010 fruit.110
/ / \
beef.001 apple.101 pear.111
Notice that Nx.number can also be described as "the path from the root
to Nx, followed by a 1 and right-padded with zeros." Then the paths of Na
and Nb diverge at the leftmost 1 bit of (Na.number XOR Nb.number), and
we can get their lowest common ancestor by computing
/* suppose Na = apple, Nb = fruit */
xor = Na.number ^ Nb.number; /* xor = 011 */
left = find_leftmost_one(xor); /* left = 010 */
pathmask = ~(2*left-1); /* pathmask = 100 */
path = (Na.number & pathmask) | left; /* path = 110 */
Now, since path == Nb.number, we see that indeed, Nb is the lowest common
ancestor of Na and Nb.
Unfortunately, actually /finding/ the leftmost 1 bit in a word isn't
possible in constant time using only the standard C operators. I think
Java provides a library function to do it quickly; and there is a hack
involving the FPU (convert to 'double', examine the exponent field) that
would work if you /really/ needed speed at the cost of portability.
More information on the "lowest common ancestor" algorithm is available
here:
http://homepage.usask.ca/~ctl271/810/lowest_common_ancestor.pdf
The find-lowest-one FPU hack is documented here:
http://www.diku.dk/hjemmesider/ansatte/jyrki/Course/Performance-Engineering-1998/
HTH,
-Arthur