map inserts and efficiency

Discussion in 'C++' started by Mark P, Jan 27, 2006.

  1. Mark P

    Mark P Guest

    Consider the following:

    // Contrast different ways of inserting into a map

    #include <map>
    #include <iostream>

    using namespace std;

    struct A
    {
    A () {cout << "ctor\n";}
    A (const A& a) {cout << "copy ctor\n";}
    A& operator= (const A& a) {cout << "op=\n"; return *this;}
    };

    typedef map<int,A> IAMap;
    typedef IAMap::value_type IAMVal;

    int main()
    {
    cout << "\nm1[0] = ...\n";
    IAMap m1;
    m1[0] = A();

    cout << "\nm2.insert(IAMVal...)\n";
    IAMap m2;
    m2.insert(IAMVal(0,A()));
    }

    // end of code

    The output is:

    m1[0] = ...
    ctor
    copy ctor
    copy ctor
    ctor
    op=

    m2.insert(IAMVal...)
    ctor
    copy ctor
    copy ctor

    // end of output

    First of all, are the two copy ctor calls both necessary? I gather that
    one occurs when constructing the temporary IAMVal and a second occurs
    when copying that temporary into the map. Can a smart compiler
    eliminate (elide?) the construction of the temporary? (My version of
    gcc with maximum optimiztion will not do so.)

    On a related note, in Scott Meyers's "Effective STL" he compares these
    two approaches in the contexts of inserts (new key) vs. updates
    (existing key with new data). He states about inserts, "Efficiency
    considerations thus lead us to conclude that insert is preferable to
    operator[] when adding an element to a map..." And so it would appear
    by simply counting the number of operations above.

    But now suppose my data type contains a vector which is empty for a
    default constructed object and I want to insert an object whose vector
    isn't empty. Comparing the two approaches and paying attention only to
    operations which involve the nonempty vector (since this is ostensibly
    more work), the insert approach involves two copies of the nonempty
    vector while the [] appraoch involves only one assignment of the
    nonempty vector. (Unless of course the compiler can eliminate one of
    the copies.) Am I looking at this right?

    Thanks,
    Mark
     
    Mark P, Jan 27, 2006
    #1
    1. Advertising

  2. Mark P

    Mike Wahler Guest

    "Mark P" <> wrote in message
    news:fjxCf.22520$...
    > Consider the following:
    >
    > // Contrast different ways of inserting into a map
    >
    > #include <map>
    > #include <iostream>
    >
    > using namespace std;
    >
    > struct A
    > {
    > A () {cout << "ctor\n";}
    > A (const A& a) {cout << "copy ctor\n";}
    > A& operator= (const A& a) {cout << "op=\n"; return *this;}
    > };
    >
    > typedef map<int,A> IAMap;
    > typedef IAMap::value_type IAMVal;
    >
    > int main()
    > {
    > cout << "\nm1[0] = ...\n";
    > IAMap m1;
    > m1[0] = A();
    >
    > cout << "\nm2.insert(IAMVal...)\n";
    > IAMap m2;
    > m2.insert(IAMVal(0,A()));
    > }
    >
    > // end of code
    >
    > The output is:
    >
    > m1[0] = ...
    > ctor
    > copy ctor
    > copy ctor
    > ctor
    > op=
    >
    > m2.insert(IAMVal...)
    > ctor
    > copy ctor
    > copy ctor
    >
    > // end of output
    >
    > First of all, are the two copy ctor calls both necessary? I gather that
    > one occurs when constructing the temporary IAMVal and a second occurs when
    > copying that temporary into the map. Can a smart compiler eliminate
    > (elide?) the construction of the temporary? (My version of gcc with
    > maximum optimiztion will not do so.)
    >
    > On a related note, in Scott Meyers's "Effective STL" he compares these two
    > approaches in the contexts of inserts (new key) vs. updates (existing key
    > with new data). He states about inserts, "Efficiency considerations thus
    > lead us to conclude that insert is preferable to operator[] when adding an
    > element to a map..." And so it would appear by simply counting the number
    > of operations above.
    >
    > But now suppose my data type contains a vector which is empty for a
    > default constructed object and I want to insert an object whose vector
    > isn't empty. Comparing the two approaches and paying attention only to
    > operations which involve the nonempty vector (since this is ostensibly
    > more work), the insert approach involves two copies of the nonempty vector
    > while the [] appraoch involves only one assignment of the nonempty vector.
    > (Unless of course the compiler can eliminate one of the copies.) Am I
    > looking at this right?


    I'm no expert with this stuff, but:

    Yes, I believe the compiler is allowed to elide a copy
    ctor call as long as the resulting behavior is the same
    ('efficiency' not being part of 'behavior'). However,
    since your copy ctor includes code for output, it must
    be called, since its behavior differs from a default
    copy ctor. Rather than your 'high level' analysis using
    outputs, try taking them out and watching with a debugger.
    If your compiler does elide the calls, stepping through
    the code should reflect that.

    I don't know if there are circumstances where a compiler
    could elide a call to operator=() though.

    HTH,
    -Mike
     
    Mike Wahler, Jan 27, 2006
    #2
    1. Advertising

  3. Mark P

    Jerry Coffin Guest

    Mark P wrote:

    [ ... ]

    > m1[0] = A();


    [ ... ]

    > m2.insert(IAMVal(0,A()));


    [ ... ]

    > First of all, are the two copy ctor calls both necessary?


    No. According to the standard ($12.8/15), when a temporary class object
    is copied using a copy ctor, and the object and the copy have the same
    cv-qualification, the compiler is allowed to elide the copy ctor, even
    if the copy ctor (and/or dtor) has side effects.

    [ ... ]

    > On a related note, in Scott Meyers's "Effective STL" he compares these
    > two approaches in the contexts of inserts (new key) vs. updates
    > (existing key with new data). He states about inserts, "Efficiency
    > considerations thus lead us to conclude that insert is preferable to
    > operator[] when adding an element to a map..." And so it would appear
    > by simply counting the number of operations above.


    When you use operator[], it has to return a reference to some object,
    so it has to have an object to return a reference to -- which means
    default constructing an object. Once you've created that default
    object, you immediately replace it with a copy of the other object
    being inserted. Even a compiler that can sometimes elide the use of the
    copy ctor might easily have difficulty doing so in this situation.

    --
    Later,
    Jerry.
     
    Jerry Coffin, Jan 28, 2006
    #3
  4. Mark P

    Pete C Guest

    What you say about insert vs. operator[] is correct, but I think the
    potentially surprising thing here is that even operator[] on its own
    (without the assignment) yields two calls to the copy constructors.
    That is to say,
    m1[0];
    produces:
    ctor
    copy ctor
    copy ctor

    This is because libstdc++'s implementation of operator[] does the
    following:
    1) Constructs a temporary A object (ctor call)
    2) Uses the temporary A to construct temporary IAMVal (copy ctor)
    3) Calls insert with the temporary IAMVal (copy ctor when insertion
    occurs)

    This seems inefficient but the implementation is specified by the
    standard (23.3.1.2).

    What's worse is that the compiler can't elide either of the copy
    constructors, because in both cases the temporaries have been bound to
    const references at the point the copy construction occurs. Consider
    the difference between these two functions:

    void f(const A &a) {
    // even if 'a' refers to a temporary, eliding impossible
    // here. copy ctor used
    A b = a;
    }
    void f() {
    A b = A(); // eliding possible, so ctor called just once
    }

    ...

    f((A())); // temporary bound to const ref. - calls ctor and copy ctor
    f(); // only default ctor is called


    So the problem is with the STL, not the compiler. It would be nice to
    have a map::eek:perator[] that just called the default constructor in the
    case that the entry needed to be created, but the standard doesn't
    permit it.

    Have I got that right?
     
    Pete C, Jan 28, 2006
    #4
  5. Mark P

    Pete C Guest

    I think you are looking at it right, but I doubt you'll get the
    compiler to eliminate any of the copying.
    I usually take the approach that any object in an STL container will
    get copied and assigned quite a bit, and so try to make the object
    itself as lightweight as possible.

    In your case, how about making the vector reference-counted? That way
    you could copy or assign the reference very cheaply. However you would
    need to check that the reference was unique before any operation that
    modified the vector, and if not then copy it prior to the operation
    (this is the approach that many std::string implementations take). It's
    quite a bit of work and not necessarily appropriate to your situation,
    but worth considering.
     
    Pete C, Jan 28, 2006
    #5
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. _XaToA_
    Replies:
    0
    Views:
    307
    _XaToA_
    Dec 6, 2003
  2. Jim West
    Replies:
    14
    Views:
    6,251
    Alex Vinokur
    Dec 18, 2003
  3. Kin Pang

    std::map efficiency

    Kin Pang, Feb 5, 2004, in forum: C++
    Replies:
    8
    Views:
    6,699
    Ivan Vecerina
    Feb 11, 2004
  4. puzzlecracker
    Replies:
    0
    Views:
    342
    puzzlecracker
    Sep 25, 2008
  5. Kai-Uwe Bux
    Replies:
    1
    Views:
    1,161
    Kai-Uwe Bux
    Dec 21, 2008
Loading...

Share This Page