intersection of two std::maps

jacek.dziedzic · Nov 5, 2007

Hi!

What is the canonical way of finding an intersection of two
std::maps?
i.e. I have

std::map<whatever,size_t> map1;
std::map<whatever,size_t> map2;

.... and I need a std::vector containing all the values that occur
simultaneously in both of the maps, never mind the keys. In each
of the maps the values do not repeat.

One idea is to copy all values from map1 into a vector, and then
for each value from map2 check whether it's present in that vector;
if it is, push it into the result. This, of course, has quadratic
complexity, so perhaps a set would do better. Is there a better
way?

I was also thinking of putting the values into a multiset and then
taking out all values with a count of 2, but there must be a simpler
way.

TIA,
- J.

Mark P · Nov 5, 2007

jacek.dziedzic said:
What is the canonical way of finding an intersection of two
std::maps?

I'm not sure there is one. It's a pretty strange request since,
ignoring the keys, the map is not a very useful value container.
Pushing all of the values into a set is one possibility. I don't see
any advantage to your multiset proposal. I think what I would do is:

1. Copy all values from each map to a separate vector.
2. Sort each vector.
3. Use std::set_intersection to obtain the intersection.
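
The three steps above can be sketched roughly as follows. This is only a minimal illustration, assuming a C++11 compiler and size_t values as in the original question; the helper names sorted_values and value_intersection are made up for this example.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Steps 1 and 2: copy the mapped values (not the keys) into a vector, then sort it.
// (Hypothetical helper name.)
template <typename Key>
std::vector<std::size_t> sorted_values(const std::map<Key, std::size_t>& m) {
    std::vector<std::size_t> values;
    values.reserve(m.size());
    for (const auto& kv : m) {
        values.push_back(kv.second);
    }
    std::sort(values.begin(), values.end());
    return values;
}

// Step 3: std::set_intersection requires both input ranges to be sorted.
// (Hypothetical helper name.)
template <typename Key>
std::vector<std::size_t> value_intersection(const std::map<Key, std::size_t>& map1,
                                            const std::map<Key, std::size_t>& map2) {
    const std::vector<std::size_t> v1 = sorted_values(map1);
    const std::vector<std::size_t> v2 = sorted_values(map2);
    std::vector<std::size_t> common;
    std::set_intersection(v1.begin(), v1.end(), v2.begin(), v2.end(),
                          std::back_inserter(common));
    // The intersection can never be larger than the smaller input.
    assert(common.size() <= std::min(v1.size(), v2.size()));
    return common;
}
```

Calling value_intersection(map1, map2) returns the common values in ascending order. The two sorts dominate the cost; the intersection step itself is a single linear pass over both sorted vectors.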

If copying values is too costly you can also use vectors of pointers to
the original values in the map, but then you'll need to write your own
comparison functions for steps 2 and 3, and do some additional copying
at the end.
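
A rough sketch of that pointer variant, again assuming C++11. The comparator LessByPointee and the function names are hypothetical, and size_t is used as the value type only to match the question; with values that cheap the pointers buy nothing, so this is meant for expensive-to-copy value types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <vector>

// Custom comparison: order pointers by the values they point to, not by address.
struct LessByPointee {
    bool operator()(const std::size_t* a, const std::size_t* b) const {
        return *a < *b;
    }
};

// Collect pointers to the values stored in the map and sort them by pointee.
// The pointers stay valid only as long as the map is alive and unmodified.
template <typename Key>
std::vector<const std::size_t*> sorted_value_pointers(const std::map<Key, std::size_t>& m) {
    std::vector<const std::size_t*> ptrs;
    ptrs.reserve(m.size());
    for (const auto& kv : m) {
        ptrs.push_back(&kv.second);
    }
    std::sort(ptrs.begin(), ptrs.end(), LessByPointee());
    return ptrs;
}

template <typename Key>
std::vector<std::size_t> value_intersection_via_pointers(
        const std::map<Key, std::size_t>& map1,
        const std::map<Key, std::size_t>& map2) {
    const std::vector<const std::size_t*> p1 = sorted_value_pointers(map1);
    const std::vector<const std::size_t*> p2 = sorted_value_pointers(map2);
    std::vector<const std::size_t*> common_ptrs;
    // The same comparator must be used for sorting and for the intersection.
    std::set_intersection(p1.begin(), p1.end(), p2.begin(), p2.end(),
                          std::back_inserter(common_ptrs), LessByPointee());
    assert(common_ptrs.size() <= std::min(p1.size(), p2.size()));
    // The "additional copying at the end": turn pointers back into values.
    std::vector<std::size_t> common;
    common.reserve(common_ptrs.size());
    for (const std::size_t* p : common_ptrs) {
        common.push_back(*p);
    }
    return common;
}
```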

-Mark

jacek.dziedzic · Nov 5, 2007

Mark said:
I'm not sure there is one. It's a pretty strange request since,
ignoring the keys, the map is not a very useful value container.

The keys were needed earlier to make the values unique
in each of the maps. Or, in other words, I had pairs like this
key1, value_a
key2, value_b
key1, value_c
key3, value_d
key1, value_e

and by using a map I could eliminate 'value_a' and 'value_c'
and retain only 'value_e', since they were all stored at the
same entry of the map.
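
To illustrate the overwriting described above, here is a small sketch assuming C++11. The string keys, the numeric values and the function name keep_last_per_key are invented stand-ins for the real data, which the thread does not show.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Feed (key, value) pairs into a map in order. A later pair with the same key
// overwrites the earlier one, so only the last value per key is retained.
// (Hypothetical function name and types.)
std::map<std::string, std::size_t> keep_last_per_key(
        const std::vector<std::pair<std::string, std::size_t>>& pairs) {
    std::map<std::string, std::size_t> result;
    for (const auto& p : pairs) {
        result[p.first] = p.second;  // operator[] inserts the key or overwrites its value
    }
    // There can be at most one entry per input pair.
    assert(result.size() <= pairs.size());
    return result;
}
```

Note that this relies on operator[]; map::insert would instead keep the first value seen for a key and ignore later ones.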

Pushing all of the values into a set is one possibility. I don't see any
advantage to your multiset proposal. I think what I would do is:

1. Copy all values from each map to a separate vector.
2. Sort each vector.
3. Use std::set_intersection to obtain the intersection.

I understand, except for the vector part. Why not copy
each map to a separate set and then do set_intersection?

If copying values is too costly you can also use vectors of pointers to
the original values in the map, but then you'll need to write your own
comparison functions for steps 2 and 3, and do some additional copying
at the end.

Nope, the elements are lightweight (size_t's), but there
are a lot of them. Typically there would be 1M elements in
each of the maps and the maps would be almost identical,
with differences on the order of a few elements.

thanks, I guess I'll stick with your proposal,
- J.

Mark P · Nov 5, 2007

jacek.dziedzic said:
I understand, except for the vector part. Why not copy
each map to a separate set and then do set_intersection?

You could do that as well. It's fewer lines of code, perhaps, but
conventional wisdom is that unless you need to sort "online", it's
generally more efficient to collect all values and perform a single sort
operation at the end (as I suggest) rather than maintaining a sorted
structure as items are individually added (the set approach). See Scott
Meyers "Effective STL" and do keep in mind the usual caveats about
premature optimization, but I would expect better performance with the
vector.
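
For comparison, the set-based variant asked about might look like the sketch below (C++11 assumed; the function name is hypothetical). It gives the same result, but each insert keeps the set ordered as it goes, which is the "online" sorting mentioned above. Which version is actually faster for a given data set is something to measure rather than assume.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <map>
#include <set>
#include <vector>

// Copy the values of each map into a std::set, then intersect the two sets.
// A std::set is always sorted, so no explicit sort step is needed.
// (Hypothetical function name.)
template <typename Key>
std::vector<std::size_t> value_intersection_via_sets(
        const std::map<Key, std::size_t>& map1,
        const std::map<Key, std::size_t>& map2) {
    std::set<std::size_t> s1;
    std::set<std::size_t> s2;
    for (const auto& kv : map1) {
        s1.insert(kv.second);
    }
    for (const auto& kv : map2) {
        s2.insert(kv.second);
    }
    std::vector<std::size_t> common;
    std::set_intersection(s1.begin(), s1.end(), s2.begin(), s2.end(),
                          std::back_inserter(common));
    assert(common.size() <= std::min(s1.size(), s2.size()));
    return common;
}
```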

jacek.dziedzic · Nov 5, 2007

Mark P said:
You could do that as well. It's fewer lines of code, perhaps, but
conventional wisdom is that unless you need to sort "online", it's
generally more efficient to collect all values and perform a single sort
operation at the end (as I suggest) rather than maintaining a sorted
structure as items are individually added (the set approach). See Scott
Meyers "Effective STL" and do keep in mind the usual caveats about
premature optimization, but I would expect better performance with the
vector.

I see, thank you.
- J.

Jim Langston · Nov 6, 2007

Mark P said:
You could do that as well. It's fewer lines of code, perhaps, but
conventional wisdom is that unless you need to sort "online", it's
generally more efficient to collect all values and perform a single sort
operation at the end (as I suggest) rather than maintaining a sorted
structure as items are individually added (the set approach). See Scott
Meyers "Effective STL" and do keep in mind the usual caveats about
premature optimization, but I would expect better performance with the
vector.

Also, your values are not guaranteed to be unique in the map since they are
not keyed. Whether to copy to std::vector or std::set probably depends on what
you want to do with duplicate values in the same map. Do you want to enforce
uniqueness of the values in each copied set/vector? If so, use set.
If not, use vector.

However, copying the data into a vector should be faster, as a vector does not
have to build the index.

jacek.dziedzic · Nov 6, 2007

Jim Langston said:
Also, your values are not guaranteed to be unique in the map since they are
not keyed.

Note however that in the OP I wrote:

JD> In each of the maps the values do not repeat.

I know for sure that in each of the maps there are no duplicate
values.

Whether to copy to std::vector or std::set probably depends on what
you want to do with duplicate values in the same map. Do you want to enforce
uniqueness of the values in each copied set/vector? If so, use set.
If not, use vector.

However, copying the data into a vector should be faster, as a vector does not
have to build the index.

I guess I'll stick with vector.

thanks,
- J.
