vector::push_back performance

Uwe Schnitker · Sep 30, 2004

Andrew Koenig said:
I'm skeptical.

Here's a little program:

#include <vector>

int main()
{
std::vector<int> v;
for (std::vector<int>::size_type i = 0; i != 1000000; ++i)
v.push_back(i);
return 0;
}

When I run this program on my machine (admittedly faster than 1.3G, but no
more than twice as fast), it runs in three *hundredths* of a second. And it
calls push_back a million times, not 300,000 times.

Just for the records: Are you sure it _does_ call push_back at all?

Since your program doesn't produce any observable effect, the compiler
has licence to transfer it into something like

int main()
{
}

which should probably run quite fast, shouldn't it?

Have fun,

Uwe

Peter van Merkerk · Sep 30, 2004

Uwe said:
Just for the records: Are you sure it _does_ call push_back at all?

Since your program doesn't produce any observable effect, the compiler
has licence to transfer it into something like

int main()
{
}

which should probably run quite fast, shouldn't it?

Good point, one should always be carefull with drawing conclusions from
test like this.

I compiled Andrew program on MSVC 6.0 with full optimization and checked
the assembly output. On this compiler the loop isn't optimized away.
push_back() is inlined, but the function that inserts elements into the
vector is still called one million times. Nevertheless this program
completed in 0.10 seconds on a Pentium III @ 700Mhz.

But I can imagine when the objects in the vector are expensive to copy
the story changes quite a bit.

Andrew Koenig · Sep 30, 2004

Uwe Schnitker said:
Just for the records: Are you sure it _does_ call push_back at all?

Since your program doesn't produce any observable effect, the compiler
has licence to transfer it into something like

int main()
{
}

which should probably run quite fast, shouldn't it?

If I insert the statement

std::cout << v.size() << std::endl;

in the appropriate place, it prints 1000000 and still runs in 0.03 seconds.

algorithm · Oct 1, 2004

Uwe - do you know who yoe are questioning?
I guess not...
Have fun yourself

Antonios Christofides · Oct 3, 2004

Hi again,

after several experiments, I see that, indeed, when vector:

ush_back
needs to reallocate memory, it does construct a copy of the object and
destruct the old object. When I recompiled my program without
optimization, I saw that the reallocation routines don't actually take
much time, but the constructor of the contained class does; with
optimization, it is apparently inlined and the profiler indicates that
the time is spent in the reallocation routines. If I call
vector::reserve prior to performing the 306 thousand push_backs, the
constructor is called 306 thousand times; if I don't, it is called 830
thousand times (likewise for the destructor).

Why don't just memmove the objects? I understand that in the general
case this is not possible; for example, if the constructor registers
the object's address in some global vector object. But this would not
be a problem for my class, which does not do such things. Furthermore,
my examination with the debugger indicates that when I copy an
instance of it, the copy is identical, byte by byte; even a string
member points to the same memory location as the original. So is it
just that the compiler/stl can't know, and we can't hint it, that a
memmove would suffice?

Here's the class, for your reference:

struct Record {
Record(const Date& t, bool n, double v, string f):
timestamp(t), null(n), value(v), flags(f) { }
Date timestamp;
bool null;
double value;
string flags;
};

("Date" is a user class internally represented as a struct tm.)

Uwe Schnitker · Oct 4, 2004

algorithm said:
Uwe - do you know who yoe are questioning?
I guess not...
Have fun yourself

I'm not sure when I first read something either about or by AK, but it
must have been in the previous century ...

Questioning (well-earned) authority is not a matter of refusal to
learn from it. Quite the contrary.

Victor Bazarov · Oct 4, 2004

Antonios said:
[...]
Why don't just memmove the objects? I understand that in the general
case this is not possible; for example, if the constructor registers
the object's address in some global vector object. But this would not
be a problem for my class, which does not do such things. Furthermore,
my examination with the debugger indicates that when I copy an
instance of it, the copy is identical, byte by byte; even a string
member points to the same memory location as the original. So is it
just that the compiler/stl can't know, and we can't hint it, that a
memmove would suffice?
[...]

Try implementing the copy c-tor for your class in terms of 'memmove'.
If you succeed, remember that it's not portable. If you don't succeed,
read up on "move semantics" for return values and temporaries (IIRC),
the discussion was in comp.lang.c++.moderated somewhere in the past
couple of years.

Victor

Tom Widmer · Oct 4, 2004

Except for what was already said, I believe that even in the cases
when realloc needs to allocate a new memory block rather than extend
the existing one, it uses memmove or some similar operation which will
be faster than manually copying the elements if its implementation
uses specialized mass-byte-copying CPU instructions.

std::vector will also typically use memmove or memcpy for built in
types like int. The easiest way to see this is to trace into the
push_back calls in a debugger - working out the code path can be
difficult otherwise.

For object types with copy constructors, the language currently
contains no better way of moving them that copying them and then
destroying the original. However, move constructors are a future
solution to this problem.

Tom

Tom Widmer · Oct 4, 2004

Hi again,

after several experiments, I see that, indeed, when vector:ush_back
needs to reallocate memory, it does construct a copy of the object and
destruct the old object. When I recompiled my program without
optimization, I saw that the reallocation routines don't actually take
much time, but the constructor of the contained class does; with
optimization, it is apparently inlined and the profiler indicates that
the time is spent in the reallocation routines. If I call
vector::reserve prior to performing the 306 thousand push_backs, the
constructor is called 306 thousand times; if I don't, it is called 830
thousand times (likewise for the destructor).

Right. Unfortunately the std::allocator interface doesn't include any
kind of try_realloc method, so it always has to allocate a new block
and copy, even when there is free space after the current storage.

Why don't just memmove the objects? I understand that in the general
case this is not possible; for example, if the constructor registers
the object's address in some global vector object. But this would not
be a problem for my class, which does not do such things. Furthermore,
my examination with the debugger indicates that when I copy an
instance of it, the copy is identical, byte by byte; even a string
member points to the same memory location as the original. So is it
just that the compiler/stl can't know, and we can't hint it, that a
memmove would suffice?

Yes, exactly. Strictly speaking, you can't use memmove portably on any
non-POD type, but in practice it works well with many types.

Here's the class, for your reference:

struct Record {
Record(const Date& t, bool n, double v, string f):
timestamp(t), null(n), value(v), flags(f) { }
Date timestamp;
bool null;
double value;
string flags;
};

("Date" is a user class internally represented as a struct tm.)

Well, I can think of reasonable implementations of std::string that
wouldn't be memmoveable (e.g. using a COW based lock-free pointer XOR
reference linked list implementation), so the copy-destroy cycle is
the best you can do until this comes to be:
http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2004/n1690.html#Move Semantics

Tom

David Harmon · Oct 6, 2004

On Thu, 30 Sep 2004 14:03:10 GMT in comp.lang.c++, "Andrew Koenig"

If I insert the statement

std::cout << v.size() << std::endl;

in the appropriate place, it prints 1000000 and still runs in 0.03 seconds.

But perhaps the compiler is determining statically what the size of the
vector would be at that point and encoding only that string in the
executable. I think you have to read the initial 1000000 from a file
that is unknown at compile time.

Andrew Koenig · Oct 7, 2004

David Harmon said:
On Thu, 30 Sep 2004 14:03:10 GMT in comp.lang.c++, "Andrew Koenig"

But perhaps the compiler is determining statically what the size of the
vector would be at that point and encoding only that string in the
executable. I think you have to read the initial 1000000 from a file
that is unknown at compile time.

Possible but unlikely.

How about trying whatever tests you feel are appropriate on your own machine
and reporting the results?

vector push_back question	8	Jan 8, 2006
vector iterators ...	2	Jun 3, 2006
stl vector performance	8	Feb 19, 2009
Is it better idea to do resize of vector before calling push_back?	14	Jan 16, 2007
std::vector performance	11	May 24, 2004
std::priority_queue: Abysmally bad performance versus std::set, can'tfigure out why.	5	Nov 30, 2012
vector, but without the overhead of dynamic memory allocation	5	Mar 9, 2011
'Re: std::vector performance	1	May 25, 2004

vector::push_back performance

Uwe Schnitker

Peter van Merkerk

Andrew Koenig

algorithm

Antonios Christofides

Uwe Schnitker

Victor Bazarov

Tom Widmer

Tom Widmer

David Harmon

Andrew Koenig

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads