Meyers's preference for vectors over maps


Fred Ma

Hello,

I was looking at Meyers's "Effective STL", item 23
about choosing between vectors and maps (at least
that's the choice for me). In many cases, using
sorted vectors is faster for lookups. The two
reasons are (1) that vector elements are smaller,
since map elements need at least 3 pointers, and
(2) contiguity of vector elements in memory ensures
fewer page faults.

For me, I will have a container of objects that are much
bigger than pointers, so reason (1) doesn't apply.

Reason (2) is what I'm trying to understand better.
Let me describe my application.

In my case, I will have N items in the container,
will look up n (n<N) items, do some processing,
then replace the n worst items in the container
with the n new items from processing. Here, worse
is defined by the sorting function. The problem
is that n can vary from 1 to N, making the advantages
of vector vs. map very unclear. For small n,
map is better because I'm basically interspersing
1 modification of the database with 1 lookup.
For large n, vector is better because it's faster
to sort a whole bunch of objects at once rather
than building a map in a piecemeal fashion.

Since I can't get a clear answer from that line of
reasoning, I looked more carefully at Meyers's
reason (2). I will be using binary search for
lookup, and he claims that contiguity of elements
is good for binary search in particular. More
specifically, I understood him as meaning that
closeness of elements in memory should match
closeness of elements in the traversal of
container elements. However, my understanding
of binary search is that it doesn't traverse
elements in sequential order. So why would the
contiguity of elements in a vector be of any
advantage?

Thanks for clarifying. And if there are any
other considerations that would make it easier
to decide upon a proper container, I'd appreciate
those too. Right now, I'm tempted to go for a
vector because I haven't used a map before. But
for that very reason, I'm also thinking it wouldn't
be a bad idea to dabble in maps, too.

Fred
 

Hendrik Belitz

Fred said:
In my case, I will have N items in the container,
will look up n (n<N) items, do some processing,
then replace the n worst items in the container
with the n new items from processing. Here, worse
is defined by the sorting function. The problem
is that n can vary from 1 to N, making the advantages
of vector vs. map very unclear. For small n,
map is better because I'm basically interspersing
1 modification of the database with 1 lookup.
For large n, vector is better because it's faster
to sort a whole bunch of objects at once rather
than building a map in a piecemeal fashion.

That looks like a good candidate for a priority queue.
Removing items from both a vector and a map is slow, and since you need
a sorting criterion, a normal deque will not suffice.
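
Something along these lines, as a minimal sketch (the Item type and its
fitness member are just placeholders for whatever you actually store; here
"worst" means lowest fitness):

#include <iostream>
#include <queue>
#include <vector>

// Placeholder element type; "fitness" stands in for the real sorting criterion.
struct Item {
    double fitness;
    explicit Item(double f) : fitness(f) {}
};

// Comparison functor: with this ordering, top() yields the lowest-fitness
// (i.e. worst) item.
struct WorseOnTop {
    bool operator()(const Item& a, const Item& b) const {
        return a.fitness > b.fitness;
    }
};

int main() {
    std::priority_queue<Item, std::vector<Item>, WorseOnTop> q;
    q.push(Item(0.7));
    q.push(Item(0.2));
    q.push(Item(0.9));

    std::cout << q.top().fitness << '\n';   // prints 0.2, the worst item
    q.pop();                                // remove it in O(log N)
    return 0;
}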
 

Fred Ma

Hendrik said:
That looks like a good candidate for a priority queue.
Removing items from both a vector and a map is slow, and since you need
a sorting criterion, a normal deque will not suffice.


Hi, Hendrik,

I looked up priority queues in Josuttis's "The C++
Standard Library". It's built on top of the standard
iterator-supporting containers, with the vector as a
default. Wouldn't I still be faced with the same
issues in comparing vectors and maps for my usage? My
impression is that the special container adapters
simplify the interface rather than changing the
underlying mechanism of data storage and retrieval.
Admittedly, this could be a misimpression on my part.

Fred
 

Hendrik Belitz

Fred said:
Hi, Hendrik,

I looked up priority queues in Josuttis's "The C++
Standard Library". It's built on top of the standard
iterator-supporting containers, with the vector as a
default. Wouldn't I still be faced with the same
issues in comparing vectors and maps for my usage? My
impression is that the special container adapters
simplify the interface rather than changing the
underlying mechanism of data storage and retrieval.
Admittedly, this could be a misimpression on my part.

Uh well, I didn't even realize that priority queues are available as an
STL adapter. Normally priority queues are implemented with Fibonacci heaps
and are very efficient. Maybe you should look for some implementation which
does it that way instead of using an STL container/adapter.
 

tom_usenet

Hello,

I was looking at Meyers's "Effective STL", item 23
about choosing between vectors and maps (at least
that's the choice for me). In many cases, using
sorted vectors is faster for lookups. The two
reasons are (1) that vector elements are smaller,
since map elements need at least 3 pointers, and
(2) contiguity of vector elements in memory ensures
fewer page faults.

For me, I will have a container of objects that are much
bigger than pointers, so reason (1) doesn't apply.

Reason (2) is what I'm trying to understand better.
Let me describe my application.

In my case, I will have N items in the container,

Is N fixed at runtime? IOW, is the size of the container constant?
will look up n (n<N) items, do some processing,
then replace the n worst items in the container
with the n new items from processing. Here, worse
is defined by the sorting function. The problem
is that n can vary from 1 to N, making the advantages
of vector vs. map very unclear. For small n,
map is better because I'm basically interspersing
1 modification of the database with 1 lookup.
For large n, vector is better because it's faster
to sort a whole bunch of objects at once rather
than building a map in a piecemeal fashion.

I would recommend a vector. Keep another empty vector handy. Start by
sorting your main vector. Calculate your new range of n elements to add and
sort it (O(n log n)), then merge that with your old vector into the
same-sized spare vector.

// Assumes v is already sorted with its n worst elements at the front,
// and newrange holds the n replacement elements.
// Needs <algorithm>, <iterator> and <vector>.
tempvec.reserve(v.size());
std::sort(newrange.begin(), newrange.end());

// Merge the N-n surviving elements of v with the new range; both inputs
// are sorted, so the output is sorted too.
std::merge(v.begin() + n, v.end(), newrange.begin(), newrange.end(),
           std::back_inserter(tempvec));

// Swap the merged result into place and recycle the spare vector.
tempvec.swap(v);
tempvec.clear();

etc. That way each iteration costs roughly O(n log n + N) rather than
O(N log N).

Since I can't get a clear answer from that line of
reasoning, I looked more carefully at Meyers's
reason (2). I will be using binary search for
lookup, and he claims that contiguity of elements
is good for binary search in particular. More
specifically, I understood him as meaning that
closeness of elements in memory should match
closeness of elements in the traversal of
container elements. However, my understanding
of binary search is that it doesn't traverse
elements in sequential order. So why would the
contiguity of elements in a vector be of any
advantage?

Because they are all close to each other in memory. If a 4k chunk of
the vector is pulled into cache memory, you'll get a lot of cache
hits.
Thanks for clarifying. And if there are any
other considerations that would make it easier
to decide upon a proper container, I'd appreciate
those too. Right now, I'm tempted to go for a
vector because I haven't used a map before. But
for that very reason, I'm also thinking it wouldn't
be a bad idea to dabble in maps, too.

Given the fixed size, vector seems a good choice. If your elements are
expensive to copy, I'd go for a vector of smart pointers, or just
pointers (with manual memory management).

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html
 

Daniel T.

Fred Ma said:
I was looking at Meyers's "Effective STL", item 23
about choosing between vectors and maps (at least
that's the choice for me). In many cases, using
sorted vectors is faster for lookups. The two
reasons are (1) that vector elements are smaller,
since map elements need at least 3 pointers, and
(2) contiguity of vector elements in memory ensures
fewer page faults.

I guess the obvious answer is to create an adaptor that will allow you
to easily switch between the two. Then profile both and pick the most
efficient one.
 

Dietmar Kuehl

Fred Ma said:
I looked up priority queues in Josuttis's "The C++
Standard Library". It's built on top of the standard
iterator-supporting containers, with the vector as a
default. Wouldn't I still be faced with the same
issues in comparing vectors and maps for my usage?

If the operations provided for a priority queue fit your needs, it is almost
certainly the better choice: this data structure is specialized for finding
the smallest (or biggest; depends on the direction of your predicate)
element as fast as possible. That is, insertion into a priority queue just
does the necessary work to provide fast access to the smallest element.
Likewise for removal.

If you need to access or change other elements than the smallest one, things
change quite a lot, however. The priority queue in the standard library is
not at all suited for this but there are priority queues which are helpful
here, too. However, the difference from other data structures such as maps
is then reduced.
My
impression is that the special container adapters
simplify the interface rather than changing the
underlying mechanism of data storage and retrieval.

This is true for stack and queue but not for priority_queue. The latter
implements the logic of a 2-heap (a special case of a d-heap which is just
a balanced tree with d children for each inner node). This is quite different
from doing e.g. a binary search. Also, the number of elements changed
during inserts or removals is not linear in the size of the vector but
logarithmic.
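
To make that concrete, here is a small sketch of what the adapter does under
the hood, using the standard heap algorithms directly on a vector (the values
are arbitrary):

#include <algorithm>
#include <iostream>
#include <vector>

int main() {
    int raw[] = { 5, 1, 9, 3 };
    std::vector<int> v(raw, raw + 4);

    std::make_heap(v.begin(), v.end());    // arrange v as a 2-heap, O(N)

    v.push_back(7);                        // insert: append, then ...
    std::push_heap(v.begin(), v.end());    // ... sift up, O(log N)

    std::pop_heap(v.begin(), v.end());     // move the largest to the back, O(log N)
    std::cout << v.back() << '\n';         // prints 9
    v.pop_back();                          // and remove it
    return 0;
}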

Concerning Scott's comment about the advantages of vector over map: at first
sight, a map looks superior to a vector. The first insight why it is not so
in all situations is the amount of work necessary to traverse the map's
internal tree compared to a binary search within the vector, which just
effectively uses an "ad-hoc" binary tree. To move from one node to its child,
the map has to dereference a pointer whereas the vector just needs a clever
addition or subtraction. This is often much faster. Towards the end of the
search where most of the work is going on and the steps become smaller,
the "nodes" (in case of the map the explicit tree nodes, in case of the vector
the implicit current location) are more likely to sit on one page for the
vector because they are [nearly] adjacent to each other. A page miss is
pretty expensive. Also, the vector's nodes are much smaller (just the data
you store there) compared to the map's nodes (which typically contain at least
three pointers for the map's maintenance in addition to the data you store),
increasing the chances of being on one page even more. Especially for small user
data, the vector's movements are likely to eventually move around only in a
cache line of the CPU while the map is still causing page misses or at least
cache misses.
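
In code, the vector side of that comparison is just a binary search over
contiguous storage; a minimal sketch (the element type and key are
placeholders, and comparing doubles for exact equality is only for
illustration):

#include <algorithm>
#include <vector>

// Look up a key in a sorted vector.  lower_bound bisects the range by
// index arithmetic on contiguous storage; no pointer chasing involved.
bool contains(const std::vector<double>& sorted, double key) {
    std::vector<double>::const_iterator it =
        std::lower_bound(sorted.begin(), sorted.end(), key);
    return it != sorted.end() && *it == key;
}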

For searching it is pretty obvious that you want to use a vector in favour of
a map. If you can go with an unordered structure, a hash is an even better
choice, BTW.

Things become less obvious if you need to change the data: if the number of
searches compared to the number of modifications (insertions and removals;
a change is effectively a removal followed by an insertion for both a map and
a sorted vector, although it can be improved a little bit) becomes small, i.e.
if you have only a few searches between modifications, things start to change:
a huge collection of data will cause a lot of moving for vectors while it will
cause only a few pointer adjustments for maps. On the other hand, the pointer
adjustments are more involved than simply moving a set of small PODs in a
fairly small vector. The tricky part is figuring out where one approach is
better than the other in such a situation because apart from the CPU cycles
things like the memory footprint also have an effect. This normally results
in vectors with even a few hundred elements being faster than maps.
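
A sketch of the modification side, just to show where the costs sit (the
types are placeholders):

#include <algorithm>
#include <map>
#include <vector>

// Sorted vector: find the insertion point in O(log N), then shift the tail.
// The shift is O(N) element moves, but they are contiguous and cheap for
// small PODs.
void insert_sorted(std::vector<int>& v, int x) {
    v.insert(std::lower_bound(v.begin(), v.end(), x), x);
}

// Map: O(log N) pointer adjustments and one node allocation, no element
// moves, but every node carries the extra maintenance pointers.
void insert_map(std::map<int, int>& m, int key, int value) {
    m.insert(std::make_pair(key, value));
}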

The details depend on the exact setting, including the size of elements, the
size of the container, the total amount of memory used, the access patterns,
etc. Thus, it is normally reasonable to implement a working solution using an
appropriate class and later profile the whole application to find out whether
this part is critical at all (normally the critical parts are others than one
suspects...) and if so, replace the solution by alternatives to measure them,
too. It may be reasonable to use a class with an appropriate interface which
is internally implemented by one of the choices and whose implementation is
later changed as needed.
 

Dietmar Kuehl

Hendrik said:
Uh well, I didn't even realize that priority queues are available as an
STL adapter. Normally priority queues are implemented with Fibonacci heaps
and are very efficient.

Are they, now? Well, the 'std::priority_queue' adapter is an order
of magnitude faster than the best Fibonacci heap implementation I could
do. Of course, this alone does not say much, but I could also improve on
the adapter implementation from SGI-STL, although only by something like
5%.

The advantage of Fibonacci heaps over the 2-heap normally used by the
priority queue adapter is when it comes down to modifying elements:
this operation is not supported by the standard adapter at all, while it
is the operation Fibonacci heaps are good at. If this feature is needed,
Fibonacci heaps are on par with d-heaps (d == 2 is actually not the best
choice here; I think d == 5 was optimum for my tests) extended to provide
this feature, for all numbers of elements I tested it with (which was up
to 10 million if I remember correctly).
Maybe you should look for some implementation
which does it that way instead of using an STL container/adapter.

Have a look at <http://www.dietmar-kuehl.de/cxxrt/heaps.tar.gz>. This
is a collection of priority queues I have implemented. It includes
quite a few different ones, including Fibonacci heaps.
 

Fred Ma

Dietmar said:
If you need to access or change other elements than the smallest one, things
change quite a lot, however. The priority queue in the standard library is
not at all suited for this but there are priority queues which are helpful
here, too. However, the difference from other data structures such as maps
is then reduced.

See, that's the problem. In each outermost iteration, I look up n items
that can occur anywhere in the N-length vector, not just at the ends. If
n were either small or almost the same as N, then I could say whether I have
lots of lookups per resorting of the vector (after replacing the n
worst elements with new elements created from the n elements that were
looked up). (It's a genetic algorithm, in case that's of interest to
anyone.) If the standard C++ library's priority queue is not optimized
for this, then I should probably choose between a vector and a map. I'm
not averse to exploring more custom solutions, but for the first time
wandering away from the ol' vector, I prefer a smaller step, with lots of
published documentation (books, website references, news articles, FAQs).

Concerning Scott's comment about the advantages of vector over map: at first
sight, a map looks superior to a vector. The first insight why it is not so
in all situations is the amount of work necessary to traverse the map's
internal tree compared to a binary search within the vector, which just
effectively uses an "ad-hoc" binary tree. To move from one node to its child,
the map has to dereference a pointer whereas the vector just needs a clever
addition or subtraction. This is often much faster. Towards the end of the
search where most of the work is going on and the steps become smaller,
the "nodes" (in case of the map the explicit tree nodes, in case of the vector
the implicit current location) are more likely to sit on one page for the
vector because they are [nearly] adjacent to each other. A page miss is
pretty expensive. Also, the vector's nodes are much smaller (just the data
you store there) compared to the map's nodes (which typically contain at least
three pointers for the map's maintenance in addition to the data you store),
increasing the chances of being on one page even more. Especially for small user
data, the vector's movements are likely to eventually move around only in a
cache line of the CPU while the map is still causing page misses or at least
cache misses.

For searching it is pretty obvious that you want to use a vector in favour of
a map. If you can go with an unordered structure, a hash is an even better
choice, BTW.

Things become less obvious if you need to change the data: if the number of
searches compared to the number of modifications (insertions and removals;
a change is effectively a removal followed by an insertion for both a map and
a sorted vector, although it can be improved a little bit) becomes small, i.e.
if you have only a few searches between modifications, things start to change:
a huge collection of data will cause a lot of moving for vectors while it will
cause only a few pointer adjustments for maps. On the other hand, the pointer
adjustments are more involved than simply moving a set of small PODs in a
fairly small vector. The tricky part is figuring out where one approach is
better than the other in such a situation because apart from the CPU cycles
things like the memory footprint also have an effect. This normally results
in vectors with even a few hundred elements being faster than maps.

The details depend on the exact setting, including the size of elements, the
size of the container, the total amount of memory used, the access patterns,
etc. Thus, it is normally reasonable to implement a working solution using an
appropriate class and later profile the whole application to find out whether
this part is critical at all (normally the critical parts are others than one
suspects...) and if so, replace the solution by alternatives to measure them,
too. It may be reasonable to use a class with an appropriate interface which
is internally implemented by one of the choices and whose implementation is
later changed as needed.

Yes, I suspected that Meyers's reasoning about page misses would be more
applicable toward the end of the search, since you are bisecting smaller
pieces of your vector. I just wasn't sure whether that happening only
toward the end was enough of an advantage. Also, how much toward the end
depends on the element sizes, as you mentioned, as well as the cache size.
My vector elements are 24 doubles each, plus various overhead, and I was
trying to keep away from having to consider the hardware parameters,
e.g. cache size. But it seems there's no way to avoid it. Profiling seems
to be the only sure way. I will put that off for now, since I'm trying to
get some results for a deadline (no one else's problem, of course).

I also get the point about the 3 or more pointers per node in a map.
At first, I thought that the issue here was the size of the pointers,
which is much smaller than the container elements in my case. However,
you clarified that the dereferencing overhead also mattered. Thanks for
clearing that up.

I think I'll stick with the ol' vector this time. Expanding my horizons
will happen another day, even for such a small expansion as maps, and, of
course, a comparison between maps and vectors.

Thanks, all, for your elaborations.

Fred
 

Fred Ma

tom_usenet said:
Given the fixed size, vector seems a good choice. If your elements are
expensive to copy, I'd go for a vector of smart pointers, or just
pointers (with manual memory management).


Yeah, vectors seem to have a strong case behind them.
Thanks. (The vector length is constant).

Fred
 

Fred Ma

Daniel T. said:
I guess the obvious answer is to create an adaptor that will allow you
to easily switch between the two. Then profile both and pick the most
efficient one.

Daniel,

I agree. Implementing both and profiling will certainly
determine the answer. And I would certainly like to
learn to use a profiler some time (it always seems like such
a luxury, time-wise, but it would immediately answer
lots of questions that can't be figured out otherwise).

For starters, I will hammer out a vector implementation.

Thanks.

Fred
 

Jerry Coffin

[ ... ]
In my case, I will have N items in the container,
will look up n (n<N) items, do some processing,
then replace the n worst items in the container
with the n new items from processing. Here, worse
is defined by the sorting function. The problem
is that n can vary from 1 to N, making the advantages
of vector vs. map very unclear. For small n,
map is better because I'm basically interspersing
1 modification of the database with 1 lookup.
For large n, vector is better because it's faster
to sort a whole bunch of objects at once rather
than building a map in a piecemeal fashion.

Finding n smallest (or largest) items in a container can be done
considerably more quickly than simply sorting the container and then
taking the first (or last) n items. The latter requires a sort, which
is O(N lg N), whereas the former can be done in approximately O(n lg N)
time -- a big advantage if n is quite a bit smaller than N, and a
progressively smaller one as n approaches N.

The usual way of doing this is basically a modification of a Quick Sort.
In a Quick Sort, you pick a pivot, partition your elements, and then
recursively do the same with each partition.

To pick n elements instead, you simply pick the partition those elements
will fall into, and only recurse into that partition, ignoring the other.
For example, assume you want the 10 smallest elements out of 100. For
the moment I'll assume you pick perfect pivot elements every time, so
the first time you divide it into the 50 smallest and the 50 largest.
Clearly the 10 smallest aren't in the 50 largest, so you simply ignore
the right hand partition, and continue with the left partition. The
next time, you divide it into two groups of 25 elements apiece. Again,
you ignore the larger elements, and concentrate only on the smaller
ones. The next time around, you're down to 12 elements in the left
partition -- from there, it's probably easiest to just search linearly
to find the two largest of what's left, and discard them, leaving the 10
that you care about.
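
The standard library ships this selection step as std::nth_element; a short
sketch (the data is made up):

#include <algorithm>
#include <vector>

int main() {
    int raw[] = { 42, 7, 99, 3, 58, 21, 80, 14, 65, 1, 37, 90 };
    std::vector<int> v(raw, raw + 12);

    // Partition so that the 5 smallest elements end up in v[0..4], in no
    // particular order among themselves -- the quick-sort-with-only-one-
    // recursion idea described above, linear time on average.
    std::nth_element(v.begin(), v.begin() + 5, v.end());

    // std::partial_sort does the same job but leaves v[0..4] sorted.
    return 0;
}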

IIRC, whether this is better or worse than a heap depends on the
relative frequency of updating elements vs. inserting entirely new
elements. If (for example) you start with 100 elements, and every time
around, you insert 50 entirely new elements, find the 10 worst out of
the entire array, and so on iteratively, then chances are that this will
work better than a heap. At the opposite extreme, if the only updating
on the data is due to the processing you do after finding your n
elements, then chances are much better that a heap will work better.

A heap puts in a bit more work up-front, and does a partial sorting on
the entire array, so if you can put that to use, it will usually be a
win. OTOH, if you just search for n elements, you put less work into
arranging the data (so it's faster the first time) but you're leaving
the rest of the array relatively random, so you haven't gained much for
the next iteration.
 

Daniel T.

Fred Ma said:
I agree. Implementing both and profiling will certainly
determine the answer. And I would certainly like to
learn to use a profiler some time (it always seems like such
a luxury, time-wise, but it would immediately answer
lots of questions that can't be figured out otherwise).

For starters, I will hammer out a vector implementation.

For starters, you should design an interface that will make this special
queue of yours easy to use for the classes that use it. Then make it so
that you can switch out which kind of queue gets used by simply changing
one or two lines of code. *Then* hammer out a vector implementation.

This way, the endeavor isn't such a waste of time. Instead, it is an
opportunity to improve your design.
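
Something like the following thin wrapper is all it takes (the names are
hypothetical; only this class would change if the sorted vector were later
swapped for a map, a heap, or anything else):

#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical wrapper: callers only see this interface, so the storage
// strategy behind it can be replaced without touching them.
class Population {
public:
    void add(double x) {
        // keep the vector sorted on insertion
        store_.insert(std::lower_bound(store_.begin(), store_.end(), x), x);
    }
    bool contains(double x) const {
        return std::binary_search(store_.begin(), store_.end(), x);
    }
    std::size_t size() const { return store_.size(); }

private:
    std::vector<double> store_;   // the only part that knows about vectors
};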
 

Fred Ma

Daniel T. said:
For starters, you should design an interface that will make this special
queue of yours easy to use for the classes that use it. Then make it so
that you can switch out which kind of queue gets used by simply changing
one or two lines of code. *Then* hammer out a vector implementation.

This way, the endeavor isn't such a waste of time. Instead, it is an
opportunity to improve your design.


I believe it is not a waste of time either.
However, I have to prioritize the things I investigate.
I also have a deadline which I will miss if I try to
pursue too much in terms of expanding my C++ abilities
per se. I appreciate all the responses, and my question
was posed more for a better understanding of the relevant
issues rather than solely determining the implementation
path I will choose. In fact, after some more reflection,
I think it's better to not even use a binary search (be it via
a map or binary searching a vector). I will first sort
the n elements I am searching for, then scan the N-element
lookup table for those elements (n<N). After finding
the first sought-for element in the lookup table, I will
scan for the next sought-for element starting from that
point in the lookup table. On average, n=N/4, so that
means I scan an average of 4 consecutive elements before
encountering the next sought-for element. I can live with
that.
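
As a sketch of that merged scan (assuming the lookup table is itself kept
sorted with the same ordering, which is what lets the scan resume from the
previous hit):

#include <algorithm>
#include <iterator>
#include <vector>

// One pass over each range: sort the sought keys, then walk both sorted
// ranges together (the classic set_intersection style merge).  Returns the
// keys that actually occur in the table.
std::vector<double> find_all(const std::vector<double>& table,   // sorted
                             std::vector<double> keys)           // copied, gets sorted
{
    std::sort(keys.begin(), keys.end());
    std::vector<double> found;
    std::set_intersection(table.begin(), table.end(),
                          keys.begin(), keys.end(),
                          std::back_inserter(found));
    return found;
}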

I'm sure there are more sophisticated ways that change
depending on the size of n and N, but again, the sort
and lookup is a tiny part of a much bigger problem, all
of whose aspects I will have to shortly and entirely
address. Thus the compromise in thoroughness of
investigating the sort/lookup, though I do appreciate
the thoughts on the things that matter for that problem.
There will certainly be situations in the future in which I
will have to rely on the ideas that I've picked up in
this thread.

Fred
 

Fred Ma

Jerry said:
[ ... ]
In my case, I will have N items in the container,
will look up n (n<N) items, do some processing,
then replace the n worst items in the container
with the n new items from processing. Here, worse
is defined by the sorting function. The problem
is that n can vary from 1 to N, making the advantages
of vector vs. map very unclear. For small n,
map is better because I'm basically interspersing
1 modification of the database with 1 lookup.
For large n, vector is better because it's faster
to sort a whole bunch of objects at once rather
than building a map in a piecemeal fashion.

Finding n smallest (or largest) items in a container can be done
considerably more quickly than simply sorting the container and then
taking the first (or last) n items. The latter requires a sort, which
is O(N lg N), whereas the former can be done in approximately O(n lg N)
time -- a big advantage if n is quite a bit smaller than N, and a
progressively smaller one as n approaches N.

The usual way of doing this is basically a modification of a Quick Sort.
In a Quick Sort, you pick a pivot, partition your elements, and then
recursively do the same with each partition.

To pick n elements instead, you simply pick the partition those elements
will fall into, and only recurse into that partition, ignoring the other.
For example, assume you want the 10 smallest elements out of 100. For
the moment I'll assume you pick perfect pivot elements every time, so
the first time you divide it into the 50 smallest and the 50 largest.
Clearly the 10 smallest aren't in the 50 largest, so you simply ignore
the right hand partition, and continue with the left partition. The
next time, you divide it into two groups of 25 elements apiece. Again,
you ignore the larger elements, and concentrate only on the smaller
ones. The next time around, you're down to 12 elements in the left
partition -- from there, it's probably easiest to just search linearly
to find the two largest of what's left, and discard them, leaving the 10
that you care about.

IIRC, whether this is better or worse than a heap depends on the
relative frequency of updating elements vs. inserting entirely new
elements. If (for example) you start with 100 elements, and every time
around, you insert 50 entirely new elements, find the 10 worst out of
the entire array, and so on iteratively, then chances are that this will
work better than a heap. At the opposite extreme, if the only updating
on the data is due to the processing you do after finding your n
elements, then chances are much better that a heap will work better.

A heap puts in a bit more work up-front, and does a partial sorting on
the entire array, so if you can put that to use, it will usually be a
win. OTOH, if you just search for n elements, you put less work into
arranging the data (so it's faster the first time) but you're leaving
the rest of the array relatively random, so you haven't gained much for
the next iteration.

Jerry, I am replacing the worst n of N elements (n<N) with new elements,
and I realize that partitioning is the way to go for that step (I am
going to use partition() from the STL for convenience). But prior to
this step, I look up n out of N elements, and these are not the same
as the n worst elements. They can occur anywhere. I also neglected
to mention details which ultimately decided that binary searching (either
via a tree like a map, or a binary search over a vector) is not the best way to perform
those lookups. That is, n will average N/4. What I will do is sort
the n elements to be looked up, then look for each one in turn by linearly
scanning the N elements of the lookup table. When one of the n elements
is found, the next one will be scanned for from that position in the lookup
table. So there will be an average of 3 skipped elements in the lookup
table for every one found. I take better advantage of locality in the cache
than a binary search would, since the lookup table is traversed
sequentially, and the overhead of 3 skipped elements is acceptable because
of fast incrementing between adjacent vector elements.
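
For the replacement step, std::partition needs a known threshold for its
predicate; std::nth_element finds the threshold and does the partitioning in
one call, so it may be the more direct tool. A sketch with made-up names
("fitness" stands in for the real criterion):

#include <algorithm>
#include <cstddef>
#include <vector>

// Hypothetical element type.
struct Individual {
    double fitness;
};

// "Better" individuals sort first under this comparison.
struct BetterFitness {
    bool operator()(const Individual& a, const Individual& b) const {
        return a.fitness > b.fitness;
    }
};

// Move the n worst individuals to the tail of the population (unordered),
// ready to be overwritten by the n new ones.  Assumes n <= pop.size().
void move_worst_to_back(std::vector<Individual>& pop, std::size_t n) {
    std::nth_element(pop.begin(), pop.end() - n, pop.end(), BetterFitness());
}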

I guess I'm exploiting a situation-specific detail to get around the
larger question of profiling vector/map implementations for the time
being. The heap is a bit of a step that I'd prefer to find out more
about, but in the future. Reason being, my background is not CS, and
I posed my original choice as a vector/map decision because both are
readily available in standardized form via the STL, and I've got a
nice explanatory book about the formalities of their use in Josuttis.
So restricting my choices is a tradeoff; certainly, I won't have the
best solution, but I can choose the best one for which I can allot
the time.

Thanks for the explanation of partitioning, and the examples comparing it to
the heap (though I have to admit, I need to follow it up with more
textbook reading--I've briefly read about heaps in the past when
pondering the sort/lookup problem).

Fred
 
