Performance of the std::map during a search

M

mveygman

Hi,

I am writing code that is using std::map and having a bit of an issue
with its performance.

It appears that the std::map is significantly slower searching for an
element then a sequential search in a vector.

Has anyone run into this before?
 
I

Ian Collins

Hi,

I am writing code that is using std::map and having a bit of an issue
with its performance.

It appears that the std::map is significantly slower searching for an
element then a sequential search in a vector.
The elements in a map are sorted on insertion, what are you attempting
to do?
 
R

Ron AF Greve

Hi,


Hi,

I am writing code that is using std::map and having a bit of an issue
with its performance.

It appears that the std::map is significantly slower searching for an
element then a sequential search in a vector.

Has anyone run into this before?

Do you use map.find ? Otherwise post some code.

e.g.

#include <iostream>
#include <string>
#include <map>
using namespace std;

int main(){
map<string, string> Dict;
Dict[ "boek" ] = "book";
Dict[ "winkel" ] = "shop";

map<string,string>::iterator Found = Dict.find( "boek" );
if( Found != Dict.end() )
{

cout << Found->first << " translates to " << Found->second << " (or
vice versa) " << endl;

}
else
{
cout << "No such word" << endl;
}
return 0;
}





Regards, Ron AF Greve

http://www.InformationSuperHighway.eu
 
E

Erik Wikström

Hi,

I am writing code that is using std::map and having a bit of an issue
with its performance.

It appears that the std::map is significantly slower searching for an
element then a sequential search in a vector.

For small collections that might be true, but for larger sets map should
be faster (if used correctly), unless perhaps if the vector is sorted
and you use a binary search.
 
M

mveygman

The elements in a map are sorted on insertion, what are you attempting
to do?

I am attempting to do the following:

insert element into the map

find element in the map.

The elements for the test that I have done were inserted in sequential
order and the vector was then randomly shuffled.

In the real world code the element insertion into the map is not
necessarily ordered.
 
M

mveygman

For small collections that might be true, but for larger sets map should
be faster (if used correctly), unless perhaps if the vector is sorted
and you use a binary search.


Operating key word being "should".

The assumption that I went in with was that this was the case and
worst case scenario would be that serial search of the vector and
serial search of the map would be on par for time. They are not even
close.

Regards,

Mikhail
 
M

mveygman

Hi,


I am writing code that is using std::map and having a bit of an issue
with its performance.
It appears that the std::map is significantly slower searching for an
element then a sequential search in a vector.
Has anyone run into this before?

Do you use map.find ? Otherwise post some code.

e.g.

#include <iostream>
#include <string>
#include <map>
using namespace std;

int main(){
map<string, string> Dict;
Dict[ "boek" ] = "book";
Dict[ "winkel" ] = "shop";

map<string,string>::iterator Found = Dict.find( "boek" );
if( Found != Dict.end() )
{

cout << Found->first << " translates to " << Found->second << " (or
vice versa) " << endl;

}
else
{
cout << "No such word" << endl;
}
return 0;

}

Regards, Ron AF Greve

http://www.InformationSuperHighway.eu



Here is the test code:

#include <map>
#include <vector>
#include <iostream>
#include <ostream>
#include <sys/time.h> /* gettimeofday */

unsigned long get_current_time()
{
struct timeval tv;
long usec;
gettimeofday (&tv, NULL);
usec = tv.tv_sec * 1000000;
usec += tv.tv_usec;
return usec;
}

int main(int argc, char **argv)
{
std::map< unsigned int, unsigned int > test_map;
std::vector< unsigned int > test_vector;

std::cout << "Setting up the test" << std::endl;

for (unsigned int i=0; i < 1000000; ++i)
{
test_map=i;
test_vector.push_back(i);
}
std::random_shuffle(test_vector.begin(), test_vector.end());

time_t seed = time(NULL);

srand(seed);
unsigned long vec_start = get_current_time();
for(unsigned int i=0; i < 500000; ++i)
{
unsigned int value_to_find = rand() % 1000000;
std::vector< unsigned int >::const_iterator end_itr =
test_vector.end();
std::vector< unsigned int >::const_iterator itr =
test_vector.begin();
for( ; itr == end_itr ; ++itr )
{
if( *itr == value_to_find )
{
break;
}
}
}
unsigned long vec_end = get_current_time();

srand(seed);
unsigned long map_start = get_current_time();
for(unsigned int i=0; i < 500000; ++i)
{
unsigned int value_to_find = rand() % 1000000;
std::map< unsigned int, unsigned int >::iterator itr =
test_map.find(value_to_find);
}
unsigned long map_end = get_current_time();

std::cout << "Vec Test took " << (vec_end - vec_start) << "
microseconds. Map test took " << (map_end - map_start) << "
microseconds." << std::endl;
}
 
H

Hans Bos

Here is the test code:
....
int main(int argc, char **argv)
{
std::map< unsigned int, unsigned int > test_map;
std::vector< unsigned int > test_vector;

std::cout << "Setting up the test" << std::endl;

for (unsigned int i=0; i < 1000000; ++i)
{
test_map=i;
test_vector.push_back(i);
}
std::random_shuffle(test_vector.begin(), test_vector.end());

time_t seed = time(NULL);

srand(seed);
unsigned long vec_start = get_current_time();
for(unsigned int i=0; i < 500000; ++i)
{
unsigned int value_to_find = rand() % 1000000;
std::vector< unsigned int >::const_iterator end_itr =
test_vector.end();
std::vector< unsigned int >::const_iterator itr =
test_vector.begin();
for( ; itr == end_itr ; ++itr )


Since itr == end_itr is always false, the loop always exists immediatly, so you
don't search the vector at all.
This should be itr != end.
If I change this, the vector version takes a lot more time .
 
M

mveygman

Here is the test code:
...
int main(int argc, char **argv)
{
std::map< unsigned int, unsigned int > test_map;
std::vector< unsigned int > test_vector;
std::cout << "Setting up the test" << std::endl;
for (unsigned int i=0; i < 1000000; ++i)
{
test_map=i;
test_vector.push_back(i);
}
std::random_shuffle(test_vector.begin(), test_vector.end());

time_t seed = time(NULL);
srand(seed);
unsigned long vec_start = get_current_time();
for(unsigned int i=0; i < 500000; ++i)
{
unsigned int value_to_find = rand() % 1000000;
std::vector< unsigned int >::const_iterator end_itr =
test_vector.end();
std::vector< unsigned int >::const_iterator itr =
test_vector.begin();
for( ; itr == end_itr ; ++itr )

Since itr == end_itr is always false, the loop always exists immediatly, so you
don't search the vector at all.
This should be itr != end.
If I change this, the vector version takes a lot more time .
{
if( *itr == value_to_find )
{
break;
}
}
}
unsigned long vec_end = get_current_time();
srand(seed);
unsigned long map_start = get_current_time();
for(unsigned int i=0; i < 500000; ++i)
{
unsigned int value_to_find = rand() % 1000000;
std::map< unsigned int, unsigned int >::iterator itr =
test_map.find(value_to_find);
}
unsigned long map_end = get_current_time();
std::cout << "Vec Test took " << (vec_end - vec_start) << "
microseconds. Map test took " << (map_end - map_start) << "
microseconds." << std::endl;
}



That would explain everyting. :)))
 
I

Ian Collins

*Please* don't quote signatures.
I am attempting to do the following:

insert element into the map

find element in the map.
How?

The elements for the test that I have done were inserted in sequential
order and the vector was then randomly shuffled.

In the real world code the element insertion into the map is not
necessarily ordered.

Then your implementation of std::map is broken.
 
E

Erik Wikström

Operating key word being "should".

The assumption that I went in with was that this was the case and
worst case scenario would be that serial search of the vector and
serial search of the map would be on par for time. They are not even
close.

Searching a map is O(log n) while a sequential search of a map is O(n),
which means that when the number of elements is large a map should be
faster. Exactly how many elements you need to have before the map is
faster depends on a lot of things but I would guess that with 1000
elements the map is almost always faster on a modern PC. On the other
hand if the vector is sorted then it might be faster searching the
vector using lower_bound() which is also O(log n), but most other
operations will still be faster on the map.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,776
Messages
2,569,603
Members
45,189
Latest member
CryptoTaxSoftware

Latest Threads

Top