map.insert(key,val) vs. map[key]=val ?

Discussion in 'C++' started by Patrick Guio, Oct 19, 2004.

  1. Patrick Guio

    Patrick Guio Guest

    Dear all,

    I have benchmarked 2 different ways to add an element in a map container.

    1 map.insert(key,val)
    2 map[key]=val

    I used gcc and it comes out that the second way is about twice as fast as
    the first one.
    Is there a reason/case why I should use the insert() method rather the
    operator[]?
    Are the 2 different completely equivalent?

    Sincerely,

    Patrick
     
    Patrick Guio, Oct 19, 2004
    #1
    1. Advertising

  2. Patrick Guio wrote:
    > I have benchmarked 2 different ways to add an element in a map container.
    >
    > 1 map.insert(key,val)
    > 2 map[key]=val
    >
    > I used gcc and it comes out that the second way is about twice as fast
    > as the first one.
    > Is there a reason/case why I should use the insert() method rather the
    > operator[]?
    > Are the 2 different completely equivalent?


    No, they are not. According to the Standard, the second one creates
    a default 'value' associated with 'key' in the container, and then uses
    the assignment op to get the contents of 'val' into the newly created
    value. The operator[] is said to be implemented in terms of 'insert'.
    So, all things being standard, I'd expect 'insert' to work faster in
    general.

    As to what you should use, I don't know. If your map::value_type does
    not have an assignment operator, you won't be able to use method 2. If
    your type does have the assignment op, only testing (like you did) should
    be able to tell the difference. Use what you think is better. The end
    result of them is probably the same.

    V
     
    Victor Bazarov, Oct 19, 2004
    #2
    1. Advertising

  3. Patrick Guio

    Patrick Guio Guest

    On Tue, 19 Oct 2004, Victor Bazarov wrote:

    >> I have benchmarked 2 different ways to add an element in a map container.
    >>
    >> 1 map.insert(key,val)
    >> 2 map[key]=val
    >>
    >> I used gcc and it comes out that the second way is about twice as fast as
    >> the first one.
    >> Is there a reason/case why I should use the insert() method rather the
    >> operator[]?
    >> Are the 2 different completely equivalent?

    >
    > No, they are not. According to the Standard, the second one creates
    > a default 'value' associated with 'key' in the container, and then uses
    > the assignment op to get the contents of 'val' into the newly created
    > value. The operator[] is said to be implemented in terms of 'insert'.
    > So, all things being standard, I'd expect 'insert' to work faster in
    > general.


    Strange, here is my very simple (maybe too simple?) benchmark

    #include <map>
    #include <string>
    #include <iostream>
    #include <ctime>

    typedef std::map<int, std::string> lut;
    typedef lut::value_type pair;

    #define BENCH(BLOCK,NAME,N) \
    { \
    lut a; \
    clock_t tic=clock(); \
    for (int i=0; i<N; i++) { \
    BLOCK \
    } \
    double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
    std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
    } \

    int const N=100000;

    int main()
    {
    BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)

    BENCH(lut a; a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    BENCH(lut a; a[i%1000]="foo1";,"map::eek:perator[]",N)
    }

    And the output on a linux box runnig g++ 3.2.3 compiling without any
    optimisation

    Time elapsed = 900 ns
    Time elapsed = 500 ns
    Time elapsed = 1100 ns
    Time elapsed = 1200 ns

    How do you explain this?

    Sincerely,

    P
     
    Patrick Guio, Oct 19, 2004
    #3
  4. Patrick Guio wrote:
    > On Tue, 19 Oct 2004, Victor Bazarov wrote:
    >
    >>> I have benchmarked 2 different ways to add an element in a map
    >>> container.
    >>>
    >>> 1 map.insert(key,val)
    >>> 2 map[key]=val
    >>>
    >>> I used gcc and it comes out that the second way is about twice as
    >>> fast as the first one.
    >>> Is there a reason/case why I should use the insert() method rather
    >>> the operator[]?
    >>> Are the 2 different completely equivalent?

    >>
    >>
    >> No, they are not. According to the Standard, the second one creates
    >> a default 'value' associated with 'key' in the container, and then uses
    >> the assignment op to get the contents of 'val' into the newly created
    >> value. The operator[] is said to be implemented in terms of 'insert'.
    >> So, all things being standard, I'd expect 'insert' to work faster in
    >> general.

    >
    >
    > Strange, here is my very simple (maybe too simple?) benchmark
    >
    > #include <map>
    > #include <string>
    > #include <iostream>
    > #include <ctime>


    I had to add

    #define pair mypair

    here

    >
    > typedef std::map<int, std::string> lut;
    > typedef lut::value_type pair;
    >
    > #define BENCH(BLOCK,NAME,N) \
    > { \
    > lut a; \
    > clock_t tic=clock(); \
    > for (int i=0; i<N; i++) { \
    > BLOCK \
    > } \
    > double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
    > std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
    > } \
    >
    > int const N=100000;
    >
    > int main()
    > {
    > BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    > BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)
    >
    > BENCH(lut a; a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    > BENCH(lut a; a[i%1000]="foo1";,"map::eek:perator[]",N)
    > }
    >
    > And the output on a linux box runnig g++ 3.2.3 compiling without any
    > optimisation
    >
    > Time elapsed = 900 ns
    > Time elapsed = 500 ns
    > Time elapsed = 1100 ns
    > Time elapsed = 1200 ns
    >
    > How do you explain this?


    I am not going to even try. In my run (gcc v.2.96) I get

    Time elapsed = 1100 ns
    Time elapsed = 1000 ns
    Time elapsed = 1100 ns
    Time elapsed = 1300 ns

    Which is close enough to not to be concerned. BTW, what do you think the
    *real* resolution of 'clock' on your system is?

    I went ahead and compiled and ran it on another system several times in
    a row (with less than a second between the runs, as fast as it took me to
    press the up arrow and 'enter'). Here is what I got:

    /home/vbazarov/temp% ./test
    Time elapsed = 700 ns
    Time elapsed = 600 ns
    Time elapsed = 900 ns
    Time elapsed = 900 ns
    /home/vbazarov/temp% ./test
    Time elapsed = 700 ns
    Time elapsed = 700 ns
    Time elapsed = 800 ns
    Time elapsed = 1000 ns
    /home/vbazarov/temp% ./test
    Time elapsed = 600 ns
    Time elapsed = 600 ns
    Time elapsed = 800 ns
    Time elapsed = 1200 ns
    /home/vbazarov/temp% ./test
    Time elapsed = 700 ns
    Time elapsed = 600 ns
    Time elapsed = 700 ns
    Time elapsed = 1000 ns
    /home/vbazarov/temp% ./test
    Time elapsed = 700 ns
    Time elapsed = 600 ns
    Time elapsed = 900 ns
    Time elapsed = 1100 ns
    /home/vbazarov/temp% ./test
    Time elapsed = 700 ns
    Time elapsed = 600 ns
    Time elapsed = 800 ns
    Time elapsed = 1000 ns

    After increasing the number of runs ten times, I get a bit more precise
    readings, but still not really significantly different:

    Time elapsed = 760 ns
    Time elapsed = 630 ns
    Time elapsed = 790 ns
    Time elapsed = 1030 ns

    Is that something indicative of anything? I am not sure. Possibly the
    ability of a compiler to optimise certain things and inability to
    optimise other things.

    BTW, all previous results presented here are G++-compiled. If I do
    the same test on VC++ 7.1, I get

    Time elapsed = 262 ns
    Time elapsed = 140 ns
    Time elapsed = 1020 ns
    Time elapsed = 959 ns

    I am not going to provide any explanation here either.

    Victor
     
    Victor Bazarov, Oct 19, 2004
    #4
  5. Patrick Guio

    Jeff Flinn Guest

    "Patrick Guio" <> wrote in message
    news:p...
    > On Tue, 19 Oct 2004, Victor Bazarov wrote:
    >
    > >> I have benchmarked 2 different ways to add an element in a map

    container.
    > >>
    > >> 1 map.insert(key,val)
    > >> 2 map[key]=val
    > >>
    > >> I used gcc and it comes out that the second way is about twice as fast

    as
    > >> the first one.
    > >> Is there a reason/case why I should use the insert() method rather the
    > >> operator[]?
    > >> Are the 2 different completely equivalent?

    > >
    > > No, they are not. According to the Standard, the second one creates
    > > a default 'value' associated with 'key' in the container, and then uses
    > > the assignment op to get the contents of 'val' into the newly created
    > > value. The operator[] is said to be implemented in terms of 'insert'.
    > > So, all things being standard, I'd expect 'insert' to work faster in
    > > general.

    >
    > Strange, here is my very simple (maybe too simple?) benchmark
    >
    > #include <map>
    > #include <string>
    > #include <iostream>
    > #include <ctime>
    >
    > typedef std::map<int, std::string> lut;
    > typedef lut::value_type pair;
    >
    > #define BENCH(BLOCK,NAME,N) \
    > { \
    > lut a; \
    > clock_t tic=clock(); \
    > for (int i=0; i<N; i++) { \
    > BLOCK \
    > } \
    > double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
    > std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
    > } \
    >
    > int const N=100000;
    >
    > int main()
    > {
    > BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    > BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)
    >
    > BENCH(lut a; a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    > BENCH(lut a; a[i%1000]="foo1";,"map::eek:perator[]",N)
    > }
    >
    > And the output on a linux box runnig g++ 3.2.3 compiling without any
    > optimisation
    >
    > Time elapsed = 900 ns
    > Time elapsed = 500 ns
    > Time elapsed = 1100 ns
    > Time elapsed = 1200 ns


    Try running each test case in it's own excitable, compare those results. I
    bet the allocator on your implementation doesn't need to do as much work
    after the first 1000 allocations have occurred.

    Jeff F
     
    Jeff Flinn, Oct 19, 2004
    #5
  6. Patrick Guio

    Old Wolf Guest

    Patrick Guio <> wrote:
    > On Tue, 19 Oct 2004, Victor Bazarov wrote:
    >
    > >> I have benchmarked 2 different ways to add an element in a map container.
    > >>
    > >> 1 map.insert(key,val)
    > >> 2 map[key]=val
    > >>
    > >> I used gcc and it comes out that the second way is about twice as fast as
    > >> the first one.

    >
    > #define BENCH(BLOCK,NAME,N) \
    > { \
    > lut a; \
    > clock_t tic=clock(); \
    > for (int i=0; i<N; i++) { \
    > BLOCK \
    > } \
    > double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
    > std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
    > } \
    >
    > int const N=100000;
    >
    > int main()
    > {
    > BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    > BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)


    Your benchmark does not do a great job of isolating the two
    things you are testing. For example, here you create a std::pair
    every iteration of the loop for the "insert" test but not
    for the [] test. Try creating the pair beforehand:

    std::pair p(1, "foo1");
    BENCH(a.insert(p); .....

    Also the '%' operator is wasting time, it may not waste an
    equal amount of time in both cases due to CPU caching or
    something (I dont know what I'm talking about, but if you
    can remove it as a variable then do so)
     
    Old Wolf, Oct 20, 2004
    #6
  7. Patrick Guio

    chris Guest

    Patrick Guio wrote:
    > On Tue, 19 Oct 2004, Victor Bazarov wrote:
    >
    >>> I have benchmarked 2 different ways to add an element in a map
    >>> container.
    >>>
    >>> 1 map.insert(key,val)
    >>> 2 map[key]=val
    >>>
    >>> I used gcc and it comes out that the second way is about twice as
    >>> fast as the first one.
    >>> Is there a reason/case why I should use the insert() method rather
    >>> the operator[]?
    >>> Are the 2 different completely equivalent?

    >>
    >>
    >> No, they are not. According to the Standard, the second one creates
    >> a default 'value' associated with 'key' in the container, and then uses
    >> the assignment op to get the contents of 'val' into the newly created
    >> value. The operator[] is said to be implemented in terms of 'insert'.
    >> So, all things being standard, I'd expect 'insert' to work faster in
    >> general.

    >
    >
    > Strange, here is my very simple (maybe too simple?) benchmark
    >
    > #include <map>
    > #include <string>
    > #include <iostream>
    > #include <ctime>
    >
    > typedef std::map<int, std::string> lut;
    > typedef lut::value_type pair;
    >
    > #define BENCH(BLOCK,NAME,N) \
    > { \
    > lut a; \
    > clock_t tic=clock(); \
    > for (int i=0; i<N; i++) { \
    > BLOCK \
    > } \
    > double elapsed = double(clock()-tic)/N/CLOCKS_PER_SEC; \
    > std::cout << "Time elapsed = " << 1e9*elapsed << " ns" << std::endl; \
    > } \
    >
    > int const N=100000;
    >
    > int main()
    > {
    > BENCH(a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    > BENCH(a[i%1000]="foo1";,"map::eek:perator[]",N)
    >
    > BENCH(lut a; a.insert(pair(i%1000,"foo1"));,"map::insert()",N)
    > BENCH(lut a; a[i%1000]="foo1";,"map::eek:perator[]",N)
    > }
    >
    > And the output on a linux box runnig g++ 3.2.3 compiling without any
    > optimisation
    >
    > Time elapsed = 900 ns
    > Time elapsed = 500 ns
    > Time elapsed = 1100 ns
    > Time elapsed = 1200 ns
    >
    > How do you explain this?
    >

    First of all, compiling without optimisation is a very bad way of
    testing things, as numurous parts of g++'s standard library are designed
    with the assumption that you will do at least some optimisation and
    therefore various template-related things will simply be totally
    optimised away.

    The reason why [] is faster is because g++ first looks to see if there
    is already an element in the map with the correct key, and if so simply
    assigns it the approriate val, and if there isn't then calls insert.
    Only if insert is called does it construct a pair<key,val>. As in your
    test almost all the insertions end up being to keys which are already
    present, you save the small, but non-zero, cost of building all those pairs.

    A sufficently good compiler might be able to totally remove these pair
    constructons, but such a thing would require quite a lot of inlining,
    which could be very expensive. An alternative would be to expand
    map.insert so that instead of just taking pairs of type <key,val>, it
    will also take pairs of type <key&,val&> to avoid extra copying
    (although doing such a thing would require some thought)

    Chris
     
    chris, Oct 20, 2004
    #7
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Holger
    Replies:
    11
    Views:
    552
    Gabriel Genellina
    Feb 12, 2007
  2. tomix
    Replies:
    3
    Views:
    896
    mlimber
    Nov 6, 2006
  3. Michael Neumann

    "val.dup rescue val" sloooow

    Michael Neumann, Oct 27, 2004, in forum: Ruby
    Replies:
    2
    Views:
    135
    Michael Neumann
    Oct 27, 2004
  4. Anita Anita
    Replies:
    2
    Views:
    105
    Gary Wright
    Jan 6, 2009
  5. cmic

    $hash{val} or %hash->{val} ??

    cmic, Jan 15, 2006, in forum: Perl Misc
    Replies:
    8
    Views:
    133
Loading...

Share This Page