Note that O() and "amortized time" are different things. O() denotes
the *worst-case* scenario of one single operation. Even if rebalancing
the tree after a removal could be done in *amortized* constant time,
it could still be O(lg(N)).
Not really. Remember that the C++ standard requires erasing an item through
an iterator to take no more than an amortized constant number of operations.
Therefore, if erasing one item from a set takes O(log(N)) operations (say,
to rebalance the underlying red-black tree), then the preceding (and
following) erasures must have been cheap enough to compensate: on average,
only a constant number of operations each. In other words, in order to
achieve a constant amortized operation count, there must be a sharp limit on
how often a "worst case" may occur relative to the number of "best cases"
that occur when erasing items from a set.
So, unlike searching for an item in a set (where every single search could
require a "worst case" O(log(N)) number of operations), erasing an item from
a set may require O(log(N)) operations only rarely. In fact, for the
amortized bound to hold, at most about N/log(N) of any N consecutive
erasures can be worst cases - otherwise the total work would exceed O(N).
A clearer example would be a std::vector that doubles its capacity whenever
its size is about to exceed its current capacity. Even though inserting an
item into this vector could require a (worst case) N number of operations
- to copy every existing element into the newly allocated storage - it
should be clear that not every insertion can be a worst case. In fact, the
k-th reallocation happens only after about 2^k insertions, so the worst
cases become exponentially rarer even as they become more expensive. So in
this case, even though the worst case has linear complexity, the amortized
cost per insertion is still constant, because as the worst cases become
worse - they also become less frequent by the same degree.
Granted, I'm not acquainted with the exact algorithm used for
rebalancing the tree, but I'm pretty sure it takes lg(N) time in
the worst case.
Yes, but the worst cases occur less and less frequently and are therefore
always compensated by an ever-increasing number of constant-time "best
cases". So the amortized complexity per deletion - averaging the best cases
in with the worst cases - is still constant.
It's the same thing as with the tree traversal: Even though a whole
traversal can be done in linear time (iow. the whole traversal takes
O(N) steps), one single iterator increment is O(lg(N)) (because there
are cases where it takes that many operations).
Your explanation is backwards. First, it seems that you must be describing
the number of steps needed to find a specified item in a container. Second,
linear complexity, O(N), is much worse than logarithmic complexity,
O(log(N)). So O(N) is the expected worst case to find an item in certain
unsorted containers (such as a vector), while finding an item in a set has
only a logarithmic worst-case complexity, O(log(N)) - a significant
improvement over a vector. And note that a single iterator increment is
O(log(N)) only in the worst case: over a complete traversal the increments
average out to amortized constant time, for the same reason as the erasures
above.
In any event, searching in a set is not comparable to erasing or adding an
item (at a known location), since when searching there is no limit on the
frequency of "worst case" searches versus "best case" searches. In fact,
every search could be a worst case - so its complexity has to be reported
with the most pessimistic assumption, just in case.
Greg