[ ... ]
Yes, of course! Therefore the traversion does not need
to be done again in distance() ! That's the point I'm trying
to make clear. If you reread my inital posting then you will see that
I said that I need the distance value immediately after an insert()
or find() call... Do you see what I mean?
I see what you mean, but I think you're wrong. Consider (for example),
inserting an item that's going to be in the right subtree. insert()
compares the new item to the root node, finds that the new item is
larger than the root, and proceeds to the node to the right of the root.
In doing so, it has ignored the _entire_ left sub-tree.
When you use distance to find the position of the new node in the tree,
it cannot ignore the left subtree. Instead, it traverses the entire left
subtree counting each node as it goes. It then proceeds to count through
the right subtree until it reaches the point at which the new node was
inserted.
Currently the tree is traversed twice: once in insert() and find(),
and the second time in distance(). The second traversion is
IMO unneccessary if insert() and find() would store this value
somewhere internally, and distance() then just returns that value
w/o traversing the tree.
See above. The traversals are entirely _different_ from each other. The
traversal involved in an insertion or deletion is NOT sufficient to
determine the overall position of the new node in the tree.
You _could_ determine an _extremely_ rough estimate of the overall
position in the tree with only a little extra work. A set, map, etc., is
normally implemented as a balanced binary tree (e.g. red-black or AVL
tree). To maintain balancing, such a tree stores some information about
the _relative_ depths of the left and right subtrees in each node. They
limit the difference in height between the left and right subtrees
(though r-b trees and AVL trees impose slightly different restrictions).
Based upon that height difference, and the depth of the tree you _did_
traverse, you could very quickly compute upper and lower bounds for the
overall position of the new node in the tree -- but those bounds may
easily differ from each other by a factor of 2 or so. Getting the
correct number is going to be linear.
[ ... ]
I just want to make this last point clear: I just have pointed out
IMO a shortcoming, performance issues or missing features
of the language that I in the field have encountered.
It's of course up to the language and library designers to
understand the problem and to fix or implement the functionality.
I'm just informing the people who are more involved in such issues.
If I were the designer of the library I would do it the way I described above.
Maybe so -- but if you did, I doubt much of anybody would use your
library. In fact, except for this specific situation, even YOU probably
wouldn't use it. Making things work as you seem to want would involve a
_large_ slowdown in common operations in the hope of achieving a _small_
speedup for a relatively rare sequence of operations.