map.insert(key,val) vs. map[key]=val ?

P

Patrick Guio

Dear all,

I have benchmarked 2 different ways to add an element in a map container.

1 map.insert(key,val)
2 map[key]=val

I used gcc and it comes out that the second way is about twice as fast as
the first one.
Is there a reason/case why I should use the insert() method rather the
operator[]?
Are the 2 different completely equivalent?

Sincerely,

Patrick
 
V

Victor Bazarov

Patrick said:
I have benchmarked 2 different ways to add an element in a map container.

1 map.insert(key,val)
2 map[key]=val

I used gcc and it comes out that the second way is about twice as fast
as the first one.
Is there a reason/case why I should use the insert() method rather the
operator[]?
Are the 2 different completely equivalent?

No, they are not. According to the Standard, the second one creates
a default 'value' associated with 'key' in the container, and then uses
the assignment op to get the contents of 'val' into the newly created
value. The operator[] is said to be implemented in terms of 'insert'.
So, all things being standard, I'd expect 'insert' to work faster in
general.

As to what you should use, I don't know. If your map::value_type does
not have an assignment operator, you won't be able to use method 2. If
your type does have the assignment op, only testing (like you did) should
be able to tell the difference. Use what you think is better. The end
result of them is probably the same.

V
 
P

Patrick Guio

I have benchmarked 2 different ways to add an element in a map container.

1 map.insert(key,val)
2 map[key]=val

I used gcc and it comes out that the second way is about twice as fast as
the first one.
Is there a reason/case why I should use the insert() method rather the
operator[]?
Are the 2 different completely equivalent?

No, they are not. According to the Standard, the second one creates
a default 'value' associated with 'key' in the container, and then uses
the assignment op to get the contents of 'val' into the newly created
value. The operator[] is said to be implemented in terms of 'insert'.
So, all things being standard, I'd expect 'insert' to work faster in
general.

Strange, here is my very simple (maybe too simple?) benchmark

#include <map>
#include <string>
#include <iostream>
#include <ctime>

typedef std::map<int, std::string> lut;
typedef lut::value_type pair;

#define BENCH(BLOCK,NAME,N) \
{ \
lut a; \
clock_t tic=clock(); \
for (int i=0; i<N; i++) { \
BLOCK \
} \
double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
} \

int const N=100000;

int main()
{
BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)

BENCH(lut a; a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(lut a; a[i%1000]="foo1";,"map::eek:perator[]",N)
}

And the output on a linux box runnig g++ 3.2.3 compiling without any
optimisation

Time elapsed = 900 ns
Time elapsed = 500 ns
Time elapsed = 1100 ns
Time elapsed = 1200 ns

How do you explain this?

Sincerely,

P
 
V

Victor Bazarov

Patrick said:
I have benchmarked 2 different ways to add an element in a map
container.

1 map.insert(key,val)
2 map[key]=val

I used gcc and it comes out that the second way is about twice as
fast as the first one.
Is there a reason/case why I should use the insert() method rather
the operator[]?
Are the 2 different completely equivalent?


No, they are not. According to the Standard, the second one creates
a default 'value' associated with 'key' in the container, and then uses
the assignment op to get the contents of 'val' into the newly created
value. The operator[] is said to be implemented in terms of 'insert'.
So, all things being standard, I'd expect 'insert' to work faster in
general.


Strange, here is my very simple (maybe too simple?) benchmark

#include <map>
#include <string>
#include <iostream>
#include <ctime>

I had to add

#define pair mypair

here
typedef std::map<int, std::string> lut;
typedef lut::value_type pair;

#define BENCH(BLOCK,NAME,N) \
{ \
lut a; \
clock_t tic=clock(); \
for (int i=0; i<N; i++) { \
BLOCK \
} \
double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
} \

int const N=100000;

int main()
{
BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)

BENCH(lut a; a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(lut a; a[i%1000]="foo1";,"map::eek:perator[]",N)
}

And the output on a linux box runnig g++ 3.2.3 compiling without any
optimisation

Time elapsed = 900 ns
Time elapsed = 500 ns
Time elapsed = 1100 ns
Time elapsed = 1200 ns

How do you explain this?

I am not going to even try. In my run (gcc v.2.96) I get

Time elapsed = 1100 ns
Time elapsed = 1000 ns
Time elapsed = 1100 ns
Time elapsed = 1300 ns

Which is close enough to not to be concerned. BTW, what do you think the
*real* resolution of 'clock' on your system is?

I went ahead and compiled and ran it on another system several times in
a row (with less than a second between the runs, as fast as it took me to
press the up arrow and 'enter'). Here is what I got:

/home/vbazarov/temp% ./test
Time elapsed = 700 ns
Time elapsed = 600 ns
Time elapsed = 900 ns
Time elapsed = 900 ns
/home/vbazarov/temp% ./test
Time elapsed = 700 ns
Time elapsed = 700 ns
Time elapsed = 800 ns
Time elapsed = 1000 ns
/home/vbazarov/temp% ./test
Time elapsed = 600 ns
Time elapsed = 600 ns
Time elapsed = 800 ns
Time elapsed = 1200 ns
/home/vbazarov/temp% ./test
Time elapsed = 700 ns
Time elapsed = 600 ns
Time elapsed = 700 ns
Time elapsed = 1000 ns
/home/vbazarov/temp% ./test
Time elapsed = 700 ns
Time elapsed = 600 ns
Time elapsed = 900 ns
Time elapsed = 1100 ns
/home/vbazarov/temp% ./test
Time elapsed = 700 ns
Time elapsed = 600 ns
Time elapsed = 800 ns
Time elapsed = 1000 ns

After increasing the number of runs ten times, I get a bit more precise
readings, but still not really significantly different:

Time elapsed = 760 ns
Time elapsed = 630 ns
Time elapsed = 790 ns
Time elapsed = 1030 ns

Is that something indicative of anything? I am not sure. Possibly the
ability of a compiler to optimise certain things and inability to
optimise other things.

BTW, all previous results presented here are G++-compiled. If I do
the same test on VC++ 7.1, I get

Time elapsed = 262 ns
Time elapsed = 140 ns
Time elapsed = 1020 ns
Time elapsed = 959 ns

I am not going to provide any explanation here either.

Victor
 
J

Jeff Flinn

Patrick Guio said:
I have benchmarked 2 different ways to add an element in a map container.

1 map.insert(key,val)
2 map[key]=val

I used gcc and it comes out that the second way is about twice as fast as
the first one.
Is there a reason/case why I should use the insert() method rather the
operator[]?
Are the 2 different completely equivalent?

No, they are not. According to the Standard, the second one creates
a default 'value' associated with 'key' in the container, and then uses
the assignment op to get the contents of 'val' into the newly created
value. The operator[] is said to be implemented in terms of 'insert'.
So, all things being standard, I'd expect 'insert' to work faster in
general.

Strange, here is my very simple (maybe too simple?) benchmark

#include <map>
#include <string>
#include <iostream>
#include <ctime>

typedef std::map<int, std::string> lut;
typedef lut::value_type pair;

#define BENCH(BLOCK,NAME,N) \
{ \
lut a; \
clock_t tic=clock(); \
for (int i=0; i<N; i++) { \
BLOCK \
} \
double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
} \

int const N=100000;

int main()
{
BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)

BENCH(lut a; a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(lut a; a[i%1000]="foo1";,"map::eek:perator[]",N)
}

And the output on a linux box runnig g++ 3.2.3 compiling without any
optimisation

Time elapsed = 900 ns
Time elapsed = 500 ns
Time elapsed = 1100 ns
Time elapsed = 1200 ns

Try running each test case in it's own excitable, compare those results. I
bet the allocator on your implementation doesn't need to do as much work
after the first 1000 allocations have occurred.

Jeff F
 
O

Old Wolf

Patrick Guio said:
I have benchmarked 2 different ways to add an element in a map container.

1 map.insert(key,val)
2 map[key]=val

I used gcc and it comes out that the second way is about twice as fast as
the first one.

#define BENCH(BLOCK,NAME,N) \
{ \
lut a; \
clock_t tic=clock(); \
for (int i=0; i<N; i++) { \
BLOCK \
} \
double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
} \

int const N=100000;

int main()
{
BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)

Your benchmark does not do a great job of isolating the two
things you are testing. For example, here you create a std::pair
every iteration of the loop for the "insert" test but not
for the [] test. Try creating the pair beforehand:

std::pair p(1, "foo1");
BENCH(a.insert(p); .....

Also the '%' operator is wasting time, it may not waste an
equal amount of time in both cases due to CPU caching or
something (I dont know what I'm talking about, but if you
can remove it as a variable then do so)
 
C

chris

Patrick said:
I have benchmarked 2 different ways to add an element in a map
container.

1 map.insert(key,val)
2 map[key]=val

I used gcc and it comes out that the second way is about twice as
fast as the first one.
Is there a reason/case why I should use the insert() method rather
the operator[]?
Are the 2 different completely equivalent?


No, they are not. According to the Standard, the second one creates
a default 'value' associated with 'key' in the container, and then uses
the assignment op to get the contents of 'val' into the newly created
value. The operator[] is said to be implemented in terms of 'insert'.
So, all things being standard, I'd expect 'insert' to work faster in
general.


Strange, here is my very simple (maybe too simple?) benchmark

#include <map>
#include <string>
#include <iostream>
#include <ctime>

typedef std::map<int, std::string> lut;
typedef lut::value_type pair;

#define BENCH(BLOCK,NAME,N) \
{ \
lut a; \
clock_t tic=clock(); \
for (int i=0; i<N; i++) { \
BLOCK \
} \
double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
} \

int const N=100000;

int main()
{
BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)

BENCH(lut a; a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
BENCH(lut a; a[i%1000]="foo1";,"map::eek:perator[]",N)
}

And the output on a linux box runnig g++ 3.2.3 compiling without any
optimisation

Time elapsed = 900 ns
Time elapsed = 500 ns
Time elapsed = 1100 ns
Time elapsed = 1200 ns

How do you explain this?
First of all, compiling without optimisation is a very bad way of
testing things, as numurous parts of g++'s standard library are designed
with the assumption that you will do at least some optimisation and
therefore various template-related things will simply be totally
optimised away.

The reason why [] is faster is because g++ first looks to see if there
is already an element in the map with the correct key, and if so simply
assigns it the approriate val, and if there isn't then calls insert.
Only if insert is called does it construct a pair<key,val>. As in your
test almost all the insertions end up being to keys which are already
present, you save the small, but non-zero, cost of building all those pairs.

A sufficently good compiler might be able to totally remove these pair
constructons, but such a thing would require quite a lot of inlining,
which could be very expensive. An alternative would be to expand
map.insert so that instead of just taking pairs of type <key,val>, it
will also take pairs of type <key&,val&> to avoid extra copying
(although doing such a thing would require some thought)

Chris
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,733
Messages
2,569,440
Members
44,830
Latest member
ZADIva7383

Latest Threads

Top