Efficient insertion in a std::multimap

B

Barry

Hi,

Hope this doesn't get lost beneath all the spam.

I have the following container which I wish to create and store
objects of type MyObject in-

class Phase : public std::multimap<double, MyObject>
{
public:
Phase();
};

Phase::phase()
{
MyObject myObject;
std::pair<double,Note> pair(0.0,myObject);
insert (pair);
}

A lot of copy constructors are being called for MyObject which I'd
like to avoid. But first, I'm not understanding what is happening for
the line: insert (pair);. Here, the copy constructor is called twice
and the default destructor once which suggests to me that a temp
object is create, but why? Finally, is there a way to cut down on all
the copies?

Thanks for your help,

Barry
 
B

Barry

"std::pair<double,Note> pair(0.0,myObject)" is the first copy constructor..

The second copy constructor occurs in the insert() method.


Store pointers, instead of the actual objects. Of course, that creates a
whole bunch of other issues.

 application_pgp-signature_part
< 1KVisaHämta

Thanks for the reply. How might I use pointers to do this since I wish
to store many MyObject objects within a Phase Object? If I were to use
pointers, then I would do something like this -

Phase::phase()
{
MyObject* myObject = new MyObject();
std::pair<double,Note*> pair(0.0,myObject);
insert (pair);
}

But I will be storing a whole bunch of MyObject objects within a Phase
object, so I'm still going to need to keep track of them in order to
delete them in Phase's destructor. But how would I keep track of them?

Thanks again,

Barry
 
B

Barry

Thanks for the reply. How might I use pointers to do this since I wish
to store many MyObject objects within a Phase Object? If I were to use
pointers, then I would do something like this -

Phase::phase()
{
  MyObject* myObject = new MyObject();
  std::pair<double,Note*> pair(0.0,myObject);
  insert (pair);

}

But I will be storing a whole bunch of MyObject objects within a Phase
object, so I'm still going to need to keep track of them in order to
delete them in Phase's destructor. But how would I keep track of them?

Thanks again,

Barry

I could have a for loop in the destructor for Phase which loops
through all the elements, deleting them. Does this sound reasonable?
 
J

Juha Nieminen

Barry said:
Finally, is there a way to cut down on all the copies?

Since std::map and std::multimap only handle objects of type
std::pair, never your specified types directly, there's no easy way of
avoiding the copy construction of your objects when instantiating std::pair.

You could maybe get around it with low-level trickery, but this is
rather "don't do this at home" type of stuff.

For example, you could use a byte array of the size of the required
std::pair and then use placement new to construct your objects directly
onto the std::pair::first and std::pair::second members as needed, and
then use ugly reinterpret-cast trickery to get a reference of type
std::pair which points to this array, and then give that to the multimap.

Of course this is ugly, hackish, and I cannot guarantee it's
standard-conforming. But might work. You'll just have to decide whether
it's worth the minor saving.
 
B

Barry

  Since std::map and std::multimap only handle objects of type
std::pair, never your specified types directly, there's no easy way of
avoiding the copy construction of your objects when instantiating std::pair.

  You could maybe get around it with low-level trickery, but this is
rather "don't do this at home" type of stuff.

  For example, you could use a byte array of the size of the required
std::pair and then use placement new to construct your objects directly
onto the std::pair::first and std::pair::second members as needed, and
then use ugly reinterpret-cast trickery to get a reference of type
std::pair which points to this array, and then give that to the multimap.

  Of course this is ugly, hackish, and I cannot guarantee it's
standard-conforming. But might work. You'll just have to decide whether
it's worth the minor saving.

Would the following be OK? -

class Phase : public std::multimap<double, MyObject*>
{
public:
Phase();
virtual ~Phase();
void Phase::AddSomeMore();

};

Phase::phase()
{
insert (std::pair<double,MyObject*>(1.0,new MyObject (1.5,67,1.5)));
insert (std::pair<double,MyObject*>(1.0,new MyObject (1.5,65,0.5)));
insert (std::pair<double,MyObject*>(1.0,new MyObject (1.4,67,2.5)));
{

void Phase::AddSomeMore()
{
insert (std::pair<double,MyObject*>(1.0,new MyObject (2.5,67,1.5)));
insert (std::pair<double,MyObject*>(1.0,new MyObject (1.3,61,5.5)));
}

Phase::~Phase()
{
for(iterator i = begin(); i != end(); i++)
delete i->second;
}
 
P

Pavel

Barry said:
Hi,

Hope this doesn't get lost beneath all the spam.

I have the following container which I wish to create and store
objects of type MyObject in-

class Phase : public std::multimap<double, MyObject>
{
public:
Phase();
};

Phase::phase()
{
MyObject myObject;
std::pair<double,Note> pair(0.0,myObject);
insert (pair);
}

A lot of copy constructors are being called for MyObject which I'd
like to avoid. But first, I'm not understanding what is happening for
the line: insert (pair);. Here, the copy constructor is called twice
and the default destructor once which suggests to me that a temp
object is create, but why? Finally, is there a way to cut down on all
the copies?

Thanks for your help,

Barry
One way of doing what you want is to create a multiset. Then at least
you will avoid one copy-construction (when you create a pair -- you can
create the set value_type directly from the parameters you would
otherwise use to create the "second" member of a pair). Another is to
use pointers as suggested by previous answers and you can delete the
remaining objects in the destructor as suggested by Sam if all you need
is value semantics with less copying.

Just FYI: the copy constructor in your example is called three times,
not two as can be seen from compiling and running the example below. To
make it actually called twice, you have to use std::pair<const double,
Note> (or better use Phase::value_type to avoid extra thinking).

Hope this will help,
-Pavel

#include <iostream>
#include <map>

using namespace std;

struct S {
int i;
S() : i(7) {}
S(const S &s) : i(s.i) { cout << "S(const S &)\n"; }
};

typedef multimap<double, S> Map;


int main() {
Map m;
S s;
cout << "1\n";
std::pair<double, S> pair(1.0, s);
m.insert(pair);
cout << "2\n";
m.insert(std::pair<double, S>(2.0, S()));
cout << "3\n";
m.insert(std::pair<const double, S>(3.0, S()));
cout << "4\n";

return 0;
}
 
J

Juha Nieminen

Barry said:
Would the following be OK? -

No, because you have the compiler-generated copy constructor and
assignment operator in your class, which means that if you use them,
your program will probably crash because of accessing deleted memory or
because of double deletions.

You really shouldn't use dynamically allocated objects just to get rid
of one additional copy constructor call. Unless that copy constructor is
really slow, each 'new' is probably much slower, so what you are doing
is actually completely counter-productive. Additionally, each such
object will consume more memory. And this besides the memory management
nightmare you have in your hands.
 
J

Juha Nieminen

Pavel said:
Another is to
use pointers as suggested by previous answers and you can delete the
remaining objects in the destructor as suggested by Sam if all you need
is value semantics with less copying.

You do realize that allocating each object with 'new' is probably
going to be slower than that additional copy constructor call? Well,
unless his copy constructor is really slow (which it might be, but
without knowing how slow it is, it's impossible to say).

This kind of micro-optimization may end up being completely
counter-productive (in other words, it may end up making the program
*slower* rather than faster).

Besides the memory management nightmare this produces, of course.
 
J

James Kanze

Since std::map and std::multimap only handle objects of type
std::pair, never your specified types directly, there's no
easy way of avoiding the copy construction of your objects
when instantiating std::pair.

Note that the type of the objects in his multimap is std::pair<
double const, Note >. So at some point, his std::pair< double
Note > will have to be converted, which will result in a copy.
You could maybe get around it with low-level trickery, but
this is rather "don't do this at home" type of stuff.

More generally, Note is (or at least seems to be) a value class,
so it's going to be copied, and he should arrange for copying to
be "cheap enough", whatever that means in his application.
(Without knowing more about Note, or his application, it's hard
to say what the correct solution might be, but IMHO, avoiding
copies of a value type is generally doing things backwards,
although there are exceptions.)
 
P

Pavel

Pete said:
my_map.emplace(0.0, myObject), coming with C++0x. It takes its arguments
by reference, and constructs the stored object in place. No copying.
Still, myObject will have to be copied... by constructing the stored
object in place, anyway. With map of pointers this can be avoided.. at
the cost of dynamic memory allocation for myObject, of course. It is a
usual trade-off -- sometimes mapping pointers instead of objects pays
off, sometimes it does not. If myObject is big enough, it often does.

-Pavel
 
P

Pavel

Juha said:
You do realize that allocating each object with 'new' is probably
going to be slower than that additional copy constructor call? Well,
unless his copy constructor is really slow (which it might be, but
without knowing how slow it is, it's impossible to say).

This kind of micro-optimization may end up being completely
counter-productive (in other words, it may end up making the program
*slower* rather than faster).

Besides the memory management nightmare this produces, of course.
If you do not want memory management "nightmare", do not use multimap or
other STL container as they allocate/unallocate memory back and forth
all the time :). Seriously, it is certainly a trade-off but more often
than not I found that the map of pointers pays off. And, if the value to
be stored in the map is at least somewhat non-trivial (for example, has
a couple of std::string or std::vector members) it will most probably
pay off even in terms of the number of required memory allocations.

Using a new operator optimized for your object size (boost pool
allocator is one example; Alexandrescu's small chunk allocator from loki
is another) can also help to keep new/delete cost at bay -- but I found
(somewhat to my surprise) that even standard g++ "new" was not as bad as
I expected in my target environment.

DISCLAIMER: It may seem trivial but I saw too much of time and money
wasted due to the lack of the following: Any, even seemingly obvious,
assumption about software performance should be tested before it is
relied upon. While testing, I recommend turning on "production" software
configuration options including compiler optimization -- the performance
with and without optimization often differs significantly and not always
in obvious way.

-Pavel
 
J

James Kanze

Just a reminder that map<>::emplace is not a part of current
C++. It's part of the current draft for the next version of the
standard. It's highly unlikely that you can use it with your
compiler today.
Still, myObject will have to be copied... by constructing the
stored object in place, anyway.

If the final standard conforms to the current draft (and I see
no reason why it shouldn't in this regard), no copy will be
needed. You just pass whatever arguments you'd have passed to
the constructor to emplace, and it constructs the object in
place.
With map of pointers this can be avoided.. at the cost of
dynamic memory allocation for myObject, of course. It is a
usual trade-off -- sometimes mapping pointers instead of
objects pays off, sometimes it does not. If myObject is big
enough, it often does.

Most of the time, the motivation for mapping pointers is that
the objects have identity (and are often polymorphic). I've yet
to see a case where you'd map pointers with a value oriented
object, which can be copied.

(Note that if I understand correctly, objects created with
emplace in node based containers, like std::map, will not need
to be copiable.)
 
J

Juha Nieminen

Pavel said:
If you do not want memory management "nightmare", do not use multimap or
other STL container as they allocate/unallocate memory back and forth
all the time :).

That's not what I meant. With "memory management nightmare" I meant
that if you start writing raw news and deletes, you are *asking* for
memory leaks and other such problems to happen sooner or later.

The answer is *do* use multimap, but with value-based elements rather
than pointer-based.

If you really must use pointers instead, then usually use some smart
pointer to avoid the memory management issues.
 
J

Joshua Maurice

  That's not what I meant. With "memory management nightmare" I meant
that if you start writing raw news and deletes, you are *asking* for
memory leaks and other such problems to happen sooner or later.

  The answer is *do* use multimap, but with value-based elements rather
than pointer-based.

  If you really must use pointers instead, then usually use some smart
pointer to avoid the memory management issues.

I find your fears of leaks somewhat overrated. I work in a shop where
RAII is not well known nor appreciated. In that environment, I find it
relatively easy, doable, and safe to have a set of pointers as a class
member, and just have the class destructor iterate over the set
deleting the pointers.
 
P

Pavel

James said:
Just a reminder that map<>::emplace is not a part of current
C++. It's part of the current draft for the next version of the
standard. It's highly unlikely that you can use it with your
compiler today.


If the final standard conforms to the current draft (and I see
no reason why it shouldn't in this regard), no copy will be
needed. You just pass whatever arguments you'd have passed to
the constructor to emplace, and it constructs the object in
place.
I think I see what you are saying.. almost. I do not know much about the
new standard but by quick look it seems the way you gave it, myObject
still has to be copied.. not as a part of a pair, but by itself.. or
else you have to store it by pointer or a reference and then same
question of destructing it arises.. no? I mean: as long as the
value_type of the map is something like std::pair<const double,
MyObject>, the map has to store those MyObjects in pairs. You may be
able to forward arguments necessary to build MyObject but this seems to
be kind of second-level forwarding (you skip constructing both extra
pair object(s) and their data member of MyObject class). Is this also
going to be possible via emplace()?
Most of the time, the motivation for mapping pointers is that
the objects have identity (and are often polymorphic). I've yet
to see a case where you'd map pointers with a value oriented
object, which can be copied.
No, I was more towards answering the OP's question about efficiency.
It's funny though: I understand you can achieve polymorphic behavior
with the pointers, but during last 3 years I met with this technique
(storing pointers in maps) in two different companies; both times, it
was done for efficiency, not polymorphism. Recently, I had a chance to
measure the performance gain from storing in std::map rather
average-sized objects (less than 50 bytes) by pointers and the gain was
quite meaningful (some 10-30% depending on the usage pattern).
 
P

Pavel

Juha said:
That's not what I meant. With "memory management nightmare" I meant
that if you start writing raw news and deletes, you are *asking* for
memory leaks and other such problems to happen sooner or later.
This happens when the technique is applied in a haphazard manner or in a
hurry, but fully avoidable. For example, you inherit from std::map
and override erase()s, clear() and destructor to destroy pointers.
Another way (my favorite since recently as it helped me improve
performance in other, problem-specific, ways) is to use a set of custom
holder objects (a holder object contains both map key and a pointer to
the value and deletes in its destructor). Nothing too exciting, but
works just fine..
The answer is *do* use multimap, but with value-based elements rather
than pointer-based.

If you really must use pointers instead, then usually use some smart
pointer to avoid the memory management issues.
There is always a choice between using a swiss Army knife and a scalpel.
Swiss knife is more universal and readily available.. scalpel is capable
of giving more high-quality result. I consider smart pointers (I mean
shared_ptr and the likes you probably referred to) more like Swiss Army
knife -- or an attempt to create one (Swiss Army knives are much more
mature at this time). When you design for performance, however, you
often both can spend more and need to deliver a higher-quality result.
Accurately designed custom API implemented with raw pointers (you can
pass them in as auto-ptrs if you have a definitive allergy on raw
pointers but this is often more a firm-or-department-coding-style-policy
decision than something you want to deal with as you are solving a
particular problem) can be more appropriate in these situations.
Sometimes it is even beneficial to design from scratch your own
container that provides a combination of access methods that standard
containers don't provide.


Just an opinion, of course; your mileage can vary..
-Pavel
 
B

Bo Persson

Pavel said:
I think I see what you are saying.. almost. I do not know much
about the new standard but by quick look it seems the way you gave
it, myObject still has to be copied.. not as a part of a pair, but
by itself.. or else you have to store it by pointer or a reference
and then same question of destructing it arises.. no? I mean: as
long as the value_type of the map is something like std::pair<const
double, MyObject>, the map has to store those MyObjects in pairs.
You may be able to forward arguments necessary to build MyObject
but this seems to be kind of second-level forwarding (you skip
constructing both extra pair object(s) and their data member of
MyObject class). Is this also going to be possible via emplace()?

Yes, that's the idea. There will have to be some tweaks to the
standard before it works, but hopefully.


Bo Persson
 
J

Juha Nieminen

Joshua said:
I find your fears of leaks somewhat overrated.

Well, it has helped me writing leak-free code for over a decade. (I
don't even remember when was the last time I had a memory leak or an
access to freed memory. Must be over 10 years ago.)

And I have also found that when you write value-based code (not
always, of course, but by default if there is no good reason to use
pointer-based code) the code becomes simpler, cleaner and more readable.
Pointer-based code often tends to be more cryptic.

Of course I know when copying objects by value will be a heavy
operation. I know where bottlenecks are likely to happen because of
this. In those cases I might think of an alternative, more efficient
solution if necessary.
I work in a shop where
RAII is not well known nor appreciated. In that environment, I find it
relatively easy, doable, and safe to have a set of pointers as a class
member, and just have the class destructor iterate over the set
deleting the pointers.

In this particular case it was not a question of having a pointer as a
class member. It was a case of storing pointers (pointing to dynamically
allocated objects) into a multimap, and my programming instinct
immediately tells me that it's asking for trouble. The level of
carefulness and attention-to-detail has to immediately be increased by
an order of magnitude in this type of code in order to avoid problems.

I'm not saying it isn't perfectly possible to write working code like
this. I'm just saying that it will require more work and be more
hazardous (in that making mistakes will be easier).
 
N

Noah Roberts

Joshua said:
I find your fears of leaks somewhat overrated. I work in a shop where
RAII is not well known nor appreciated. In that environment, I find it
relatively easy, doable, and safe to have a set of pointers as a class
member, and just have the class destructor iterate over the set
deleting the pointers.

I've also worked in shops where good programming idioms were completely
ignored or were stubbornly fought against. Since I'm a perfect
programmer I also have no trouble writing in environments where the
standards are offended by reason.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,813
Messages
2,569,696
Members
45,483
Latest member
TedDvb6626

Latest Threads

Top