(e-mail address removed) (Dave O'Hearn) wrote in message
[ ... ]
> What about page faults? A tree can cause a page fault on every one of
> those 20 operations. Most likely, the first couple will be cached, but
> once you get down several levels in the tree, you will start getting
> page faults.
Remember that a tree grows exponentially. The single bottom level of a
(balanced) tree accounts for _half_ the data. If only the bottom four
levels of the tree have been paged out, that means only 1/16th of the
data is in memory.
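
Just to put numbers on that, here's a throwaway check (a depth of 20
-- roughly a million elements -- is an arbitrary choice for the sake
of illustration):

#include <cstdio>

int main()
{
    const int depth = 20;                      // ~a million elements
    const double total = (1u << depth) - 1.0;  // nodes in the whole tree

    // If the bottom k levels are paged out, only the top (depth - k)
    // levels -- roughly 1/2^k of the nodes -- are still resident.
    for (int k = 0; k <= 4; ++k) {
        double resident = (1u << (depth - k)) - 1.0;
        std::printf("bottom %d level(s) paged out: %6.2f%% resident\n",
                    k, 100.0 * resident / total);
    }
    return 0;
}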
That can only happen in one of two ways: either your memory is HUGELY
overcommitted, or your tree is used VERY little (or both). If memory
is overcommitted so badly that only 1/16th of frequently-used data is
in memory, NOTHING is going to be fast -- and we're still only talking
about 4 disk accesses, which means the lookup itself is still likely
to appear instantaneous.
Alternatively, the data is simply used SO rarely that essentially ALL
the data has been paged out. First of all, even at that, the penalty
isn't terrible. With a modern disk a single lookup might be expected
to take around 15 ms. 20 of those translates to 300 ms total, which is
long enough for the user to notice a pause, but that's about it. The
postulate that got us to this point is that such a lookup happens only
(say) once every few hours, at which point saving a few hundred
milliseconds starts to become pointless.
In reality, if these lookups are really that rare, chances are the
user will notice a much longer pause -- but it'll be while the code to DO
the lookup gets paged in. In this case, optimizing the lookup itself
will usually make so little difference the user will never even notice
-- if the code takes (for example) 4 seconds to page in, then we might
be looking at 4.1 vs. 4.3 seconds, and the average user won't even
know which was faster.
In this case, we're clearly optimizing the wrong thing -- eliminating
the lookup time entirely still makes less difference than a 10%
improvement in the code paging time.
> There are special tree implementations that reduce page faults, by
> making B-Trees where each node has many children, like hundreds of
> children, instead of just a 'left' and a 'right'. It's very unlikely
> that std::set is going to use that optimization, as it only makes
> sense on large data sets.
It's certainly true that such things exist, but based on what's been
said about the intended use, I doubt they're suitable. B-trees (and
their brethren such as B* and B+ trees) normally make little sense
unless you know up-front that the majority of the data WILL reside on
disk.
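
For what it's worth, the trick that makes a B-tree disk-friendly is
simply packing a whole page worth of keys into each node, so one page
fault narrows the search by a factor of a few hundred instead of two.
A very rough sketch of such a node (the layout and sizes here are
invented for illustration -- this is NOT how std::set is implemented):

#include <cstddef>

struct BTreeNode;

const std::size_t kPageSize = 4096;   // assumed VM/disk page size
const std::size_t kFanout =
    (kPageSize - sizeof(int)) / (sizeof(long) + sizeof(BTreeNode*));

struct BTreeNode {
    int        count;                  // keys currently in use
    long       keys[kFanout];          // a few hundred keys per node...
    BTreeNode* children[kFanout + 1];  // ...so the tree is only a few levels deep
};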
Given that he said he's storing information about files, I find it
hard to believe that most of the data would have to live on disk --
just for example, the computer I'm writing this on has about 350,000
files and 1 Gig. of RAM. I'd guess typical
users have an even higher ratio of RAM:file-count than I do (at least
IME, programmers generally have more, smaller files than "normal"
users). There are certainly file servers with more files, but they
typically have more memory as well.
As such, we more or less have to postulate something like using a
wimpy workstation to store data about all the files on a big server,
or else storing a LOT of information about each file. The former case
sounds to me like a bad enough system design that no data structure can
save it. The latter case simply calls for separating the large data
from the index that finds it (or putting data only into the leaf
nodes, much as a B+ tree does).
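
In C++ terms that separation can be as simple as keeping only the key
(plus a pointer) in the container, so the index stays small enough to
remain resident even after the bulky records have been paged out. A
minimal sketch, with FileInfo standing in for whatever actually gets
stored per file:

#include <map>
#include <string>

struct FileInfo {        // the bulky part -- never touched during a lookup
    std::string path;
    // ... sizes, timestamps, checksums, whatever else gets recorded ...
};

typedef std::map<std::string, FileInfo*> FileIndex;  // compact: key + pointer only

FileInfo* find_file(const FileIndex& index, const std::string& name)
{
    FileIndex::const_iterator pos = index.find(name);
    return pos == index.end() ? 0 : pos->second;
}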
> I would take advantage of generic programming, and implement the
> application first using std::set, and then with whichever hash_set
> extension you like, and profile which actually behaves better in
> practice. With generic code and a few typedefs, possibly only one line
> of code will have to be changed to try out another container type.
I quite agree -- in fact, this is pretty much what I said in my
follow-up.
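
For the record, the "one line of code" really can come down to a
single typedef, something along these lines. hash_set is a
pre-standard extension, so the header and namespace vary by compiler;
the commented-out lines assume the SGI/g++ flavor, and for
std::string you may also need to supply a hash functor:

#include <set>
#include <string>
// #include <ext/hash_set>                    // g++'s pre-standard hash_set

typedef std::set<std::string> FileNameSet;    // the one line to change:
// typedef __gnu_cxx::hash_set<std::string> FileNameSet;

void remember(FileNameSet& seen, const std::string& name)
{
    seen.insert(name);                        // identical interface either way
}

bool known(const FileNameSet& seen, const std::string& name)
{
    return seen.find(name) != seen.end();
}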