Memory footprint of std::vector<std::vector<T> >


Rune Allnor

Hi all.

I have a data structure which I naively implemented as
std::vector<std::vector<T> >. Right now I'm testing it
with 8000 inner vectors, each of which contains on the
order of 1000 elements, ending up somewhere on the
order of 350 MB of memory. I am nowhere near exhausting
any memory limits.

The program was compiled with default 'release' settings
(VS2008) and takes forever to run.

So I am curious about the memory footprint here. I am
aware that the elements of a std::vector occupy a contiguous
block of memory. Is this memory allocated dynamically,
or is *everything* in the std::vector allocated as one
dynamic block?

That is, are the internal control variables etc. located in a
memory block 'close' to the data values of the vector,
or can the data values be located somewhere else
(e.g. on the heap)?

That is, if everything is one contiguous block of memory,
all the data might have to be copied to new areas of
memory as the vectors grow, which might explain
why the thing takes so long to run.

Rune
 

joecook

Hi all.

I have a data structure which I naively implemented as
std::vector<std::vector<T> >. Right now I'm testing it
with 8000 inner vectors, each of which contains on the
order of 1000 elements, ending up somewhere on the
order of 350 MB of memory. I am nowhere near exhausting
any memory limits.
That is, if everything is one contiguous block of memory,
all the data might have to be copied to new areas of
memory as the vectors grow, which might explain
why the thing takes so long to run.

You have 8000 contiguous blocks of memory here (one per sub-vector). If
you can pre-size them with reserve(), you can reduce the number of memory
allocations performed while your program is running.

I would bet all the tea in China that the locality of the vector class
with respect to the associated data is not your problem. Minimizing memory
reallocations is really the goal here, especially in your tightest
loops.
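
A minimal sketch of what that pre-sizing might look like, with double
standing in for T and the 8000/1000 counts from the original post used
as placeholder sizes:

#include <cstddef>
#include <vector>

int main() {
    const std::size_t outer = 8000;   // number of sub-vectors (assumed)
    const std::size_t inner = 1000;   // expected elements per sub-vector (assumed)

    std::vector<std::vector<double> > data;
    data.reserve(outer);                    // the outer table never reallocates

    for (std::size_t i = 0; i < outer; ++i) {
        data.push_back(std::vector<double>());
        data.back().reserve(inner);         // one allocation per sub-vector
        // ... fill data.back() with push_back() as before ...
    }
    return 0;
}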

Joe
 

Juha Nieminen

You have 8000 contiguous blocks of memory here (one per sub-vector). If
you can pre-size them with reserve(), you can reduce the number of memory
allocations performed while your program is running.

If you can't do that (e.g. because you don't have any idea what the
sizes of the vectors will be), then you can use std::deque instead of
std::vector. In general std::deque causes less memory fragmentation
and thus consumes less memory in cases where the size of the container is
incremented little by little.

(OTOH std::deque cannot be used if you really need the memory
blocks to be contiguous, e.g. because you are using them as raw arrays
somewhere.)
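
For comparison, a sketch of the deque variant under the same assumptions
(double standing in for T, the counts are placeholders):

#include <cstddef>
#include <deque>
#include <vector>

int main() {
    // Each inner deque grows in fixed-size chunks, so growth never
    // copies the whole container the way a growing vector does.
    std::vector<std::deque<double> > data(8000);

    for (std::size_t i = 0; i < data.size(); ++i) {
        for (int j = 0; j < 1000; ++j) {
            data[i].push_back(0.0);
        }
    }
    return 0;
}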
 

The Librarian

I have a data structure which I naively implemented as
std::vector<std::vector<T> >. Right now I'm testing it
with 8000 inner vectors, each of which contains on the
order of 1000 elements, ending up somewhere on the
order of 350 MB of memory. I am nowhere near exhausting
any memory limits.

The program was compiled with default 'release' settings
(VS2008) and takes forever to run.

Starting with VS2005, Microsoft introduced a feature called
"checked iterators" that affects several STL classes.
As a consequence, every time you access an element of std::vector<T>,
the vector bounds are checked.
The situation is even worse for std::vector<std::vector<T> > because
the checks are performed twice (I think it requires 4 comparisons for
every element access).

In order to disable this "feature", you should set the symbol
_SECURE_SCL to 0 before including the STL headers.
This is an example taken from MSDN:

// checked_iterators_4.cpp
// compile with: /EHsc
#define _SECURE_SCL 0
#include <vector>
#include <iostream>
using namespace std;

int main() {
    vector<int> v;
    v.push_back(67);
    int i = v[0];
    cout << i << endl;
}

Apart from this, I suggest you turn on all the optimizations (in
particular SSE2; it can make a big difference).

Regards,
The Librarian
 

Rune Allnor

I have a data structure which I naively implemented as
std::vector<std::vector<T> >. Right now I'm testing it
with 8000 inner vectors, each of which contains on the
order of 1000 elements, ending up somewhere on the
order of 350 MB of memory. I am nowhere near exhausting
any memory limits.
The program was compiled with default 'release' settings
(VS2008) and takes forever to run.

Starting with VS2005, Microsoft introduced a feature called
"checked iterators" that affects several STL classes.
As a consequence, every time you access an element of std::vector<T>,
the vector bounds are checked.
The situation is even worse for std::vector<std::vector<T> > because
the checks are performed twice (I think it requires 4 comparisons for
every element access).

In order to disable this "feature", you should set the symbol
_SECURE_SCL to 0 before including the STL headers.
This is an example taken from MSDN:

// checked_iterators_4.cpp
// compile with: /EHsc
#define _SECURE_SCL 0
#include <vector>
#include <iostream>
using namespace std;

int main() {
    vector<int> v;
    v.push_back(67);
    int i = v[0];
    cout << i << endl;
}

Apart from this, I suggest you turn on all the optimizations (in
particular SSE2; it can make a big difference).

The SSE2 optimization seems to have at least halved the
data load + processing time.

The #define _SECURE_SCL 0 trick caused a segmentation
fault when I used it only in the main translation unit,
presumably because the macro changes the layout of the STL objects;
it might work better if I define it consistently in every file.
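
One way to do that consistently is a small header that every translation
unit includes before any standard header (the file name here is made up;
a project-wide preprocessor definition would work just as well):

// stl_config.h (hypothetical name) -- include before any standard header,
// in every translation unit, so _SECURE_SCL has the same value everywhere.
// Mixing values across translation units changes the layout of the STL
// containers and iterators and can crash the program.
#ifndef STL_CONFIG_H
#define STL_CONFIG_H

#ifndef _SECURE_SCL
#define _SECURE_SCL 0
#endif

#endif // STL_CONFIG_H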

I'll have to try std::deque and reserve() on the
vectors, although there is probably more time to be
saved by reimplementing some of the processing routines,
even though these don't take very long (about the same time
it takes to convert from text to binary numbers when
I tested the program with a different setup).

When I run the program I watch memory consumption in
the task manager. The numbers update approximately
twice per second.

From 0 MB of data loaded to about 200 MB loaded,
the memory ticker increases by about 1.7-1.8 MB per update.
After 200 MB it increases by about 0.3-0.4 MB per update,
which is why I suspected reallocations of contiguous
blocks of memory to be the culprit.
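
A quick way to test that suspicion is to count how often capacity()
changes while a vector is filled one element at a time; a minimal sketch
with a placeholder element type and count:

#include <cstddef>
#include <iostream>
#include <vector>

int main() {
    std::vector<double> v;
    std::size_t reallocations = 0;
    std::size_t last_capacity = v.capacity();

    for (int i = 0; i < 1000000; ++i) {
        v.push_back(0.0);
        if (v.capacity() != last_capacity) {   // storage was reallocated and copied
            last_capacity = v.capacity();
            ++reallocations;
        }
    }
    std::cout << "reallocations: " << reallocations
              << ", final capacity: " << last_capacity << std::endl;
    return 0;
}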

BTW, is there a way to adjust the size of the ifstream
buffer? I know I will be working with files with
sizes on the order of hundreds of megabytes, so I might
just as well allocate, say, 10 MB for the stream buffer
up front to reduce the overhead of file I/O.
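
One candidate is std::streambuf::pubsetbuf(); whether and when a given
buffer size is honored is implementation-defined (some libraries require
the call before open(), others only honor it after open() but before the
first read), so this is just a sketch with a placeholder size and file name:

#include <fstream>
#include <vector>

int main() {
    std::vector<char> buffer(10 * 1024 * 1024);   // 10 MB buffer (placeholder size)

    std::ifstream in;
    // Hand the buffer to the underlying filebuf before opening the file.
    in.rdbuf()->pubsetbuf(&buffer[0], static_cast<std::streamsize>(buffer.size()));
    in.open("bigfile.dat", std::ios::in | std::ios::binary);   // placeholder file name

    // ... read from 'in' as usual ...
    return 0;
}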

Rune
 
