Swapping data between vector and map...

D

deancoo

Ok, I've got a vector I've filled with a LOT of data, and after massaging it
just right, I need to copy it all over to a map where I'll have the
advantage of assigning a custom key as the key attribute to do fast lookups.
Is there an STL treasure that will help me do this copy without choking my
computer with the heavy memory demand? It looks like both vectors and maps
have their own 'swap' member functions, but I need something that crosses
the container barrier. Any advice would be greatly appreciated.

d
 
D

davidrubin

deancoo said:
Ok, I've got a vector I've filled with a LOT of data, and after massaging it
just right, I need to copy it all over to a map where I'll have the
advantage of assigning a custom key as the key attribute to do fast lookups.
Is there an STL treasure that will help me do this copy without choking my
computer with the heavy memory demand? It looks like both vectors and maps
have their own 'swap' member functions, but I need something that crosses
the container barrier. Any advice would be greatly appreciated.

If your vector already has 'std::pair's in it, you can use
'map::insert'. Otherwise you have to assign key values to your vector
elements some other way, e.g., by looping over the vector elements,
creating pairs, and inserting them into the map. /david
 
I

Ivan Vecerina

deancoo said:
Ok, I've got a vector I've filled with a LOT of data, and after massaging
it just right, I need to copy it all over to a map where I'll have the
advantage of assigning a custom key as the key attribute to do fast
lookups. Is there an STL treasure that will help me do this copy without
choking my computer with the heavy memory demand? It looks like both
vectors and maps have their own 'swap' member functions, but I need
something that crosses the container barrier. Any advice would be greatly
appreciated.

Fast look-up is not a good reason to use an std::map -- don't use it
if you have the data you need already stored in a (sorted) vector.
An std::map is only efficient when you need to dynamically insert/remove
items while keeping the collection sorted.

using std::lower_bound (does a binary search) on a sorted vector is most
likely faster than looking-up an item in a map. Or eventually you could
use a separate index, storing pointers into the vector.


hth -Ivan
 
D

deancoo

Ivan Vecerina said:
Fast look-up is not a good reason to use an std::map -- don't use it
if you have the data you need already stored in a (sorted) vector.
An std::map is only efficient when you need to dynamically insert/remove
items while keeping the collection sorted.

using std::lower_bound (does a binary search) on a sorted vector is most
likely faster than looking-up an item in a map. Or eventually you could
use a separate index, storing pointers into the vector.


hth -Ivan

Good point. What might be good container to hold such an index? Let me go
into a little bit more detail. The key I would like to use to identify my
objects is almost like a hash or even a check sum. It is very fast and easy
to produce, and completely unique, however quite disjointed. So off the top
of my head, it (the index) might be a vector, storing pairs (key, *ptr_obj)?
Then do something like overload the < operator to allow lower_bound to be
able to search on the 'key' attribute of the pair? Does this sound about
right?

d
 
I

Ivan Vecerina

deancoo said:
Good point. What might be good container to hold such an index? Let me
go into a little bit more detail. The key I would like to use to identify
my objects is almost like a hash or even a check sum. It is very fast and
easy to produce, and completely unique, however quite disjointed. So off
the top of my head, it (the index) might be a vector, storing pairs (key,
*ptr_obj)? Then do something like overload the < operator to allow
lower_bound to be able to search on the 'key' attribute of the pair? Does
this sound about right?
Yes, you could do that.
An alternative would be to have one vector with the objects, and one vector
with the keys, sorted in sync (best way to obtain this depends on details).
Searching a linearly sorted vector of (only)keys is nearly optimum in terms
of cache usage (only eventually beaten by a heap-based structure).


Cheers,
Ivan
 
D

deancoo

Ivan Vecerina said:
Yes, you could do that.
An alternative would be to have one vector with the objects, and one
vector
with the keys, sorted in sync (best way to obtain this depends on
details).
Searching a linearly sorted vector of (only)keys is nearly optimum in
terms
of cache usage (only eventually beaten by a heap-based structure).

Cool. That's going to work awesome. Thanks.
 
D

davidrubin

deancoo said:
[...] So off the top of my head, it (the index) might be a vector,
storing pairs (key, *ptr_obj)? Then do something like overload the <
operator to allow lower_bound to be able to search on the 'key' attribute
of the pair? Does this sound about right?
Yes, you could do that.
An alternative would be to have one vector with the objects, and one
vector
with the keys, sorted in sync (best way to obtain this depends on
details).
Searching a linearly sorted vector of (only)keys is nearly optimum in
terms
of cache usage (only eventually beaten by a heap-based structure).

Cool. That's going to work awesome. Thanks.

If you've already got 'std::pair's, you may as well use a map. It's
much easier. Once you have it working, you can profile to find
performance bottlenecks. I'm quite sure that the constant factor
differentiating std::map::find and std::lower_bound is not going to be
in the top 10.

/david
 
D

deancoo

[...] So off the top of my head, it (the index) might be a vector,
storing pairs (key, *ptr_obj)? Then do something like overload the <
operator to allow lower_bound to be able to search on the 'key' attribute
of the pair? Does this sound about right?
Yes, you could do that.
An alternative would be to have one vector with the objects, and one
vector
with the keys, sorted in sync (best way to obtain this depends on
details).
Searching a linearly sorted vector of (only)keys is nearly optimum in
terms
of cache usage (only eventually beaten by a heap-based structure).

Cool. That's going to work awesome. Thanks.

If you've already got 'std::pair's, you may as well use a map. It's
much easier. Once you have it working, you can profile to find
performance bottlenecks. I'm quite sure that the constant factor
differentiating std::map::find and std::lower_bound is not going to be
in the top 10.

/david
Ya, using map is really tempting because of the ease. Since one of the
reasons I'm building this app is to get myself back into C++, I think I'll
try implementing both. Thanks again guys.

d
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,579
Members
45,053
Latest member
BrodieSola

Latest Threads

Top