Need to understand tradeoff between array and vector

D

duli

Hi:
I have a need for fixed length vector of integers (say 3 ints) and
want to find out which is better to use: arrays or vectors.
I am noticing a big difference in performance and wrote test code to
illustrate this (written below). The time difference is huge
(about a 40 fold better performance for array).

So I am confused: I remember Stroustrup saying that one should always
use vectors instead of arrays.

What would happen if we did not know the length of the array at
compile time ?

Thanks
Duli.


Vector version:


class X {
vector<int> vec;
public:
X(const vector<int>& v) {vec = v;}
int first() { return vec[0];}
};


int main() {
vector<int> v;
v.resize(3);
v[0]=1; v[1]=2;v[2]=3;
int sum;
int starttime=time(NULL);
cout << starttime << endl;
for (int i=0;i<50000;i++)
for (int j=0;j<10000;j++) {
X x(v);
sum+=x.first();
}
int endtime=time(NULL);
cout << endtime << endl;
cout << endtime - starttime << endl;
}


Array version:

class X {
int f[3];

public:
X(int a[]) {f[0]=a[0]; f[1]=a[1];f[2]=a[2];}
int first() { return f[0];}
};


int main() {
int v[3];
v[0]=1; v[1]=2;v[2]=3;
int sum;
int starttime=time(NULL);
cout << starttime << endl;
for (int i=0;i<50000;i++)
for (int j=0;j<10000;j++) {
X x(v);
sum+=x.first();
}
int endtime=time(NULL);
cout << endtime << endl;
cout << endtime - starttime << endl;
}
 
J

Juha Nieminen

duli said:
So I am confused: I remember Stroustrup saying that one should always
use vectors instead of arrays.

I think there's a confusion of terminology here.

There are, in fact, two types of arrays: Static arrays and dynamic
arrays (or, more precisely, dynamically allocated arrays). A static
array looks like this:

int array[3];

A dynamic array looks like this:

int* array = new int[3];

These are two drastically different things. std::vector is by all
intents and purposes completely equivalent to the latter (except that
it's safer because it doesn't leak memory so easily), and completely
non-equivalent to the former.

What I believe Stroustrup was talking about was that in situations
where you would need the second type of array, you should definitely use
std::vector instead, because it's much safer.

However, if you have such a small, fixed-size array as the member of a
class, then using a static array is definitely more efficient than using
a dynamic array (which includes using std::vector).

There are at least two reasons why using a dynamic array (or
std::vector) instead of a static array in this situation is a lot less
efficient. In approximate order of importance:

1) 'new' and 'delete' are very heavy operations, and each time you are
instantiating your class with the std::vector as member, and you are
resizing it to have 3 elements, you are indirectly causing a 'new'
operation. When your class is destroyed, you are indirectly causing a
'delete' operation. These operations are not performed if you have a
static array of size 3 as a member of your class.

2) A dynamically allocated array (of 3 elements, as in this case)
consumes significantly more memory than a static array of 3 elements.
The static array will consume only 3*sizeof(int) bytes as a member of
the class, while the std::vector will most probably consume the size of
a pointer, plus 3*sizeof(int), plus any overhead the default allocator
needs for a dynamically allocated block of memory (usually at least 4
bytes).

There may also be an rather insignificant (at least compared to #1
above) slowdown each time you access the dynamic array (eg. through the
std::vector) because you are doing it through one extra level of
indirection compared to the static array (with the std::vector each time
you access the array, the compiler takes the pointer to the class
instance, reads the dynamic array pointer from a specific offset from
there, and then indexes the array through this second pointer, while
with the static array it's enough for the compiler to simply access an
offset using the first pointer).
 
J

Jorgen Grahn

Hi:
I have a need for fixed length vector of integers (say 3 ints) and
want to find out which is better to use: arrays or vectors.
I am noticing a big difference in performance and wrote test code to
illustrate this (written below). The time difference is huge
(about a 40 fold better performance for array).

So I am confused: I remember Stroustrup saying that one should always
use vectors instead of arrays.

I'm pretty sure he didn't use the word "always". Do you have a
reference?

I haven't looked closely at your code, but std::vector<int> is not
perfect for the case when your objects are always three ints. Maybe
the case with one single int is clearer: replace all ints in a program
with a std::vector<int> of size 1, and see how badly that works.

By the way, I find that it's rare to have a need for small fixed-size
array types. Are you sure you don't really want a struct with three
integers?

class X {
int x, y, z;
// ...
};

/Jorgen
 
S

Salt_Peter

Hi:
I have a need for fixed length vector of integers (say 3 ints) and
want to find out which is better to use: arrays or vectors.
I am noticing a big difference in performance and wrote test code to
illustrate this (written below). The time difference is huge
(about a 40 fold better performance for array).

Try compiling in release mode.
A vector's construction and copying will always be slower than a
primitive array.
So I am confused: I remember Stroustrup saying that one should always
use vectors instead of arrays.

vectors and arrays are 2 different beasts. When a primitive array is
required, choose it over a vector. Arrays usually mean more code, more
work. A vector has a lot to offer other than the fact that its
dynamic.

range_checked accessor using at(index), reserve() to set capacity,
iteration using begin() and end(), object construction using a non-
default ctor, a number of predefined operators and algorithms, etc
What would happen if we did not know the length of the array at
compile time ?

you would allocate a dynamic array, that means you need to manage the
allocations manually.

const int length(50000);
int* arr = new int[ length ];

with a vector you can initialize it in one line and its still dynamic.

std::vector said:
Thanks
Duli.

Vector version:

class X {
vector<int> vec;
public:
X(const vector<int>& v) {vec = v;}

// use init list
X(const vector said:
int first() { return vec[0];}

};

int main() {
vector<int> v;
v.resize(3);
v[0]=1; v[1]=2;v[2]=3;
int sum;
int starttime=time(NULL);
cout << starttime << endl;
for (int i=0;i<50000;i++)
for (int j=0;j<10000;j++) {
X x(v);
sum+=x.first();
}
int endtime=time(NULL);
cout << endtime << endl;
cout << endtime - starttime << endl;

}

Array version:

class X {
int f[3];

public:
X(int a[]) {f[0]=a[0]; f[1]=a[1];f[2]=a[2];}
int first() { return f[0];}

};

int main() {
int v[3];
v[0]=1; v[1]=2;v[2]=3;
int sum;
int starttime=time(NULL);
cout << starttime << endl;
for (int i=0;i<50000;i++)
for (int j=0;j<10000;j++) {
X x(v);
sum+=x.first();
}
int endtime=time(NULL);
cout << endtime << endl;
cout << endtime - starttime << endl;

}
 
J

joseph cook

Hi:
I have a need for fixed length vector of integers (say 3 ints) and
want to find out which is better to use: arrays or vectors.
I am noticing a big difference in performance and wrote test code to
illustrate this (written below). The time difference is huge
(about a 40 fold better performance for array).

So I am confused: I remember Stroustrup saying that one should always
use vectors instead of arrays.

What would happen if we did not know the length of the array at
compile time ?

Actually,

I would prefer boost::array<int,3>.
(soon to be std::array also)

boost::array<int,3> data = {5,5,1};

Joe Cook
 
J

James Kanze

I have a need for fixed length vector of integers (say 3 ints)
and want to find out which is better to use: arrays or
vectors.

It depends on what you're doing, but most of the time, vectors
should be preferred. The one real exception is when you need
static initialization to manage order of initialization issues.
I am noticing a big difference in performance and wrote test
code to illustrate this (written below). The time difference
is huge (about a 40 fold better performance for array).

For some things, it's probable that std::vector will be slower,
at least in most implementations. Creating a lot of small,
short lived arrays would be an example. If your program isn't
fast enough, and the profiler shows that this is due to creating
and destroying vectors, by all means, consider using an array.
(There are even a cases where a very experienced programmer
might just use arrays from the start.)
So I am confused: I remember Stroustrup saying that one should
always use vectors instead of arrays.

Until you really know what you're doing, at any rate. Generally
speaking, until the profiler says otherwise; arrays in C++ are
fundamentally broken. About the only real exception is when you
need static initialization.
What would happen if we did not know the length of the array
at compile time ?

Then you can't use arrays. It's that simple.
 
J

James Kanze

  I think there's a confusion of terminology here.
  There are, in fact, two types of arrays: Static arrays and dynamic
arrays (or, more precisely, dynamically allocated arrays). A static
array looks like this:
    int array[3];
  A dynamic array looks like this:
    int* array = new int[3];
These are two drastically different things. std::vector is by
all intents and purposes completely equivalent to the latter
(except that it's safer because it doesn't leak memory so
easily),

And that a good implementation will have bounds checking. And
that it is a vector, and not just a pointer, and that you
haven't lost the size.
and completely non-equivalent to the former.

And just how is
std::vector< int > array( 3 ) ;
different from the former? Except that it doesn't convert into
a pointer (and thus loose the size), and that it is correctly
initialized. (Both fundamental advantages.)

In general: use a C style array if you need static
initialization (in an object with static lifetime), or possible,
for very small arrays in frequently used objects, if you expect
to see lots and lots of short lived instances of those objects.
Otherwise, use std::vector.
What I believe Stroustrup was talking about was that in
situations where you would need the second type of array, you
should definitely use std::vector instead, because it's much
safer.

I suspect that Stroustrup was talking about all use. And
somehow, I doubt that he said "always"; more likely usually, or
almost always. (I'm sure, for example, that he understands the
issues of static initialization.) Or perhaps he was just saying
it in the context of a learner; someone who's just beginning to
learn C++ won't encounter the cases where a C style array makes
sense.
However, if you have such a small, fixed-size array as the
member of a class, then using a static array is definitely
more efficient than using a dynamic array (which includes
using std::vector).

I think you meant C style array, and not "static array".
(Static has a very definite meaning with regards to class
members.) As for more efficient, it depends on how the class is
used. Your scenario is one where an experienced programmer
might consider using a C style array, but it's far from
guaranteed (and is only applicable to an experienced
programmer).
There are at least two reasons why using a dynamic array (or
std::vector) instead of a static array in this situation is a
lot less efficient. In approximate order of importance:
1) 'new' and 'delete' are very heavy operations, and each time
you are instantiating your class with the std::vector as
member, and you are resizing it to have 3 elements, you are
indirectly causing a 'new' operation. When your class is
destroyed, you are indirectly causing a 'delete' operation.
These operations are not performed if you have a static array
of size 3 as a member of your class.

That, of course, depends on the implementation (and how you
define "heavy"). For this to be relevant, the arrays must also
be short-lived; who cares if it takes, say, 10 microseconds to
create one if you do a couple of million operations on it
afterwards.
2) A dynamically allocated array (of 3 elements, as in this
case) consumes significantly more memory than a static array
of 3 elements. The static array will consume only
3*sizeof(int) bytes as a member of the class, while the
std::vector will most probably consume the size of a pointer,
plus 3*sizeof(int), plus any overhead the default allocator
needs for a dynamically allocated block of memory (usually at
least 4 bytes).

This can be an important point (and I'd forgotten it). You're
right: if you expect to have literally millions of instances of
the object, you probably should go with the C style array.
There may also be an rather insignificant (at least compared
to #1 above) slowdown each time you access the dynamic array
(eg. through the std::vector) because you are doing it through
one extra level of indirection compared to the static array
(with the std::vector each time you access the array, the
compiler takes the pointer to the class instance, reads the
dynamic array pointer from a specific offset from there, and
then indexes the array through this second pointer, while with
the static array it's enough for the compiler to simply access
an offset using the first pointer).

Hmmm. The measurements I've seen seem to show the opposite,
although I'll admit that I don't know why. At any rate, it
depends on the implementation, and the underlying architecture
(available addressing modes, etc., and how efficient they
are---the only measurements I've made were on a Sparc, which is
a RISC architecture, and requires the address to be in a
register in order to access memory anyway).
 
J

James Kanze

Try compiling in release mode.
A vector's construction and copying will always be slower than
a primitive array.

At least with most current compilers. In theory, a compiler
could optimize them to the same code, but that would take a very
intelligent compiler (or one that had built-in knowledge of
std::vector).
vectors and arrays are 2 different beasts.

Yes. Vector isn't broken.
When a primitive array is required, choose it over a vector.

About the only time it's really required is when you need static
initialization in order to manage order of initialization
issues.
Arrays usually mean more code, more work. A vector has a lot
to offer other than the fact that its dynamic.
range_checked accessor using at(index), reserve() to set
capacity, iteration using begin() and end(), object
construction using a non- default ctor, a number of predefined
operators and algorithms, etc

Most importantly, the fact that it isn't broken. It can be
copied and assigned to, exactly like any other type, and it
doesn't convert to a pointer, with loss of information, at the
drop of a hat.
 
J

James Kanze

I would prefer boost::array<int,3>.
(soon to be std::array also)
boost::array<int,3> data = {5,5,1};

Good point. Boost::array has been carefully designed to work in
all of the contexts where you'd want to use a C style array: it
supports static initialization, and has the same performance.
But it doesn't have the serious drawbacks; you can copy and
assign it, and it doesn't implicitly convert to a pointer.
 
J

Juha Nieminen

James said:
And just how is
std::vector< int > array( 3 ) ;
different from the former? Except that it doesn't convert into
a pointer (and thus loose the size), and that it is correctly
initialized. (Both fundamental advantages.)

Its size is different, it doesn't store the values inside the owning
class but elsewhere, copying it is internally a rather different (and
heavier, especially for arrays of 3 integer elements) operation...
In general: use a C style array if you need static
initialization (in an object with static lifetime), or possible,
for very small arrays in frequently used objects, if you expect
to see lots and lots of short lived instances of those objects.
Otherwise, use std::vector.

If the size of the array is fixed (and known at compile time), and
each instance of the class will have its own private array (ie. it won't
be shared among objects in any way), I see little reason to not to use a
C-style array rather than std::vector. Regardless of the size of the
array, it will consume less memory and be more efficient (of course the
larger the array, the smaller the significance of this overhead, but
still...)

With the next C++ standard there might be one reason to prefer
std::vector in some cases, especially if the array is very large:
std::vector will automatically support move semantics, which C-style
arrays can't do.
I suspect that Stroustrup was talking about all use.

But having a std::vector with three elements as a member variable is
going to be significantly less efficient than having just an array of
three elements, no matter what you do and what tricks the compiler can
come up with. It would be odd to recommend using std::vector in this
kind of case as well.

And when I say "significant" I don't mean "maybe 1% slower". More like
"at least 10 times slower" with many operations (such as creation,
copying, etc).
I think you meant C style array, and not "static array".

The terminology is rather confusing. An array allocated dynamically at
the end of a pointer could also be thought as a "C style array" (in
contrast to std::vector)...

I used "static" as opposed to "dynamic" (ie. allocated dynamically
with new), rather than meaning "static member".

What *is* the opposite of "dynamically allocated", if it's not
"statically allocated" (ie. "static" for short)?
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,776
Messages
2,569,603
Members
45,186
Latest member
vinaykumar_nevatia

Latest Threads

Top