map inserts and efficiency

M

Mark P

Consider the following:

// Contrast different ways of inserting into a map

#include <map>
#include <iostream>

using namespace std;

struct A
{
A () {cout << "ctor\n";}
A (const A& a) {cout << "copy ctor\n";}
A& operator= (const A& a) {cout << "op=\n"; return *this;}
};

typedef map<int,A> IAMap;
typedef IAMap::value_type IAMVal;

int main()
{
cout << "\nm1[0] = ...\n";
IAMap m1;
m1[0] = A();

cout << "\nm2.insert(IAMVal...)\n";
IAMap m2;
m2.insert(IAMVal(0,A()));
}

// end of code

The output is:

m1[0] = ...
ctor
copy ctor
copy ctor
ctor
op=

m2.insert(IAMVal...)
ctor
copy ctor
copy ctor

// end of output

First of all, are the two copy ctor calls both necessary? I gather that
one occurs when constructing the temporary IAMVal and a second occurs
when copying that temporary into the map. Can a smart compiler
eliminate (elide?) the construction of the temporary? (My version of
gcc with maximum optimiztion will not do so.)

On a related note, in Scott Meyers's "Effective STL" he compares these
two approaches in the contexts of inserts (new key) vs. updates
(existing key with new data). He states about inserts, "Efficiency
considerations thus lead us to conclude that insert is preferable to
operator[] when adding an element to a map..." And so it would appear
by simply counting the number of operations above.

But now suppose my data type contains a vector which is empty for a
default constructed object and I want to insert an object whose vector
isn't empty. Comparing the two approaches and paying attention only to
operations which involve the nonempty vector (since this is ostensibly
more work), the insert approach involves two copies of the nonempty
vector while the [] appraoch involves only one assignment of the
nonempty vector. (Unless of course the compiler can eliminate one of
the copies.) Am I looking at this right?

Thanks,
Mark
 
M

Mike Wahler

Mark P said:
Consider the following:

// Contrast different ways of inserting into a map

#include <map>
#include <iostream>

using namespace std;

struct A
{
A () {cout << "ctor\n";}
A (const A& a) {cout << "copy ctor\n";}
A& operator= (const A& a) {cout << "op=\n"; return *this;}
};

typedef map<int,A> IAMap;
typedef IAMap::value_type IAMVal;

int main()
{
cout << "\nm1[0] = ...\n";
IAMap m1;
m1[0] = A();

cout << "\nm2.insert(IAMVal...)\n";
IAMap m2;
m2.insert(IAMVal(0,A()));
}

// end of code

The output is:

m1[0] = ...
ctor
copy ctor
copy ctor
ctor
op=

m2.insert(IAMVal...)
ctor
copy ctor
copy ctor

// end of output

First of all, are the two copy ctor calls both necessary? I gather that
one occurs when constructing the temporary IAMVal and a second occurs when
copying that temporary into the map. Can a smart compiler eliminate
(elide?) the construction of the temporary? (My version of gcc with
maximum optimiztion will not do so.)

On a related note, in Scott Meyers's "Effective STL" he compares these two
approaches in the contexts of inserts (new key) vs. updates (existing key
with new data). He states about inserts, "Efficiency considerations thus
lead us to conclude that insert is preferable to operator[] when adding an
element to a map..." And so it would appear by simply counting the number
of operations above.

But now suppose my data type contains a vector which is empty for a
default constructed object and I want to insert an object whose vector
isn't empty. Comparing the two approaches and paying attention only to
operations which involve the nonempty vector (since this is ostensibly
more work), the insert approach involves two copies of the nonempty vector
while the [] appraoch involves only one assignment of the nonempty vector.
(Unless of course the compiler can eliminate one of the copies.) Am I
looking at this right?

I'm no expert with this stuff, but:

Yes, I believe the compiler is allowed to elide a copy
ctor call as long as the resulting behavior is the same
('efficiency' not being part of 'behavior'). However,
since your copy ctor includes code for output, it must
be called, since its behavior differs from a default
copy ctor. Rather than your 'high level' analysis using
outputs, try taking them out and watching with a debugger.
If your compiler does elide the calls, stepping through
the code should reflect that.

I don't know if there are circumstances where a compiler
could elide a call to operator=() though.

HTH,
-Mike
 
J

Jerry Coffin

Mark P wrote:

[ ... ]
m1[0] = A();

[ ... ]
m2.insert(IAMVal(0,A()));

[ ... ]
First of all, are the two copy ctor calls both necessary?

No. According to the standard ($12.8/15), when a temporary class object
is copied using a copy ctor, and the object and the copy have the same
cv-qualification, the compiler is allowed to elide the copy ctor, even
if the copy ctor (and/or dtor) has side effects.

[ ... ]
On a related note, in Scott Meyers's "Effective STL" he compares these
two approaches in the contexts of inserts (new key) vs. updates
(existing key with new data). He states about inserts, "Efficiency
considerations thus lead us to conclude that insert is preferable to
operator[] when adding an element to a map..." And so it would appear
by simply counting the number of operations above.

When you use operator[], it has to return a reference to some object,
so it has to have an object to return a reference to -- which means
default constructing an object. Once you've created that default
object, you immediately replace it with a copy of the other object
being inserted. Even a compiler that can sometimes elide the use of the
copy ctor might easily have difficulty doing so in this situation.
 
P

Pete C

What you say about insert vs. operator[] is correct, but I think the
potentially surprising thing here is that even operator[] on its own
(without the assignment) yields two calls to the copy constructors.
That is to say,
m1[0];
produces:
ctor
copy ctor
copy ctor

This is because libstdc++'s implementation of operator[] does the
following:
1) Constructs a temporary A object (ctor call)
2) Uses the temporary A to construct temporary IAMVal (copy ctor)
3) Calls insert with the temporary IAMVal (copy ctor when insertion
occurs)

This seems inefficient but the implementation is specified by the
standard (23.3.1.2).

What's worse is that the compiler can't elide either of the copy
constructors, because in both cases the temporaries have been bound to
const references at the point the copy construction occurs. Consider
the difference between these two functions:

void f(const A &a) {
// even if 'a' refers to a temporary, eliding impossible
// here. copy ctor used
A b = a;
}
void f() {
A b = A(); // eliding possible, so ctor called just once
}

...

f((A())); // temporary bound to const ref. - calls ctor and copy ctor
f(); // only default ctor is called


So the problem is with the STL, not the compiler. It would be nice to
have a map::eek:perator[] that just called the default constructor in the
case that the entry needed to be created, but the standard doesn't
permit it.

Have I got that right?
 
P

Pete C

I think you are looking at it right, but I doubt you'll get the
compiler to eliminate any of the copying.
I usually take the approach that any object in an STL container will
get copied and assigned quite a bit, and so try to make the object
itself as lightweight as possible.

In your case, how about making the vector reference-counted? That way
you could copy or assign the reference very cheaply. However you would
need to check that the reference was unique before any operation that
modified the vector, and if not then copy it prior to the operation
(this is the approach that many std::string implementations take). It's
quite a bit of work and not necessarily appropriate to your situation,
but worth considering.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,744
Messages
2,569,482
Members
44,901
Latest member
Noble71S45

Latest Threads

Top