Use of std::vector slower than arrays?

Bo Persson · Nov 13, 2007

James Kanze wrote:
:
:: :
: [...]
:: I did some tests on my system and had dismal results also, but
:: in release things are a bit better, but not much.
:
:: Using your algorithm the array based was 6 seconds, the vector
:: based was 29 seconds.
:
:: Using vectors with the array based algorithm
:: arradd(&av[0], &av[0], &bv[0], 4);
:: is 16 seconds.
:
:: I tried all the optmizations I could find but couldn't get an
:: improvement over these values.
:
: Which compilers, and which options?
:
: I gave the complete version he posted a quick try on the systems
: I have at hand (a PC with VC++ 2005 and g++ 3.3.3---the CygWin
: port, but I don't think that affects CPU bound programs much---,
: an Intel PC with Linux and g++ 4.1.0, and a Sun Sparc with g++
: 4.1.0 and Sun CC 5.8---Sun CC comes with two different library
: implementations). The results are interesting, to say the
: least, and do indicate just how much the answer is
: implementation dependent. I multiplied the loop count by 10 to
: get measurable times. My compiler options were:
:
: VC++: cl /vmg /GR /EHs /D_SECURE_SCL=0 /O2
: g++: g++ -O3
: Sun CC: CC -O4 and CC -library=stlport4 -O4
:
: The results are interesting, to say the least:
:
: array vector
: Windows PC:
: VC++ 22 22
: g++ 30 38
:
: Linux PC:
: g++ 33 42
:
: Sun Sparc:
: g++ 73 77
: Sun standard 60 172
: Sun stlport 60 165
:

I just have to give another data point, showing that we *always* have
to measure to know for sure.

Again, VC++ 2005, but with an alternate standard library
implementation:

Test 1: 1000 million 128-bit additions using C arrays...
Test 1 complete.
CPU time elapsed: 23 seconds
Wall time elapsed: 24 seconds
Final sum: 23106FE7 44D0A28A 64EF219D 9803CB01

Test 2: 1000 million 128-bit additions using vector...
Test 2 complete.
CPU time elapsed: 14 seconds
Wall time elapsed: 14seconds
Final sum: 23106FE7 44D0A28A 64EF219D 9803CB01

For some reason, the compiler here decides that it is proper to fully
inline the vecadd function, saving a call and the parameter passing.
On the other hand, it decides to call arradd() out-of-line while
pushing and popping parameters.

A negative abstraction penalty?

Bo Persson

mike3 · Nov 13, 2007

On Nov 13, 3:13 am, mike3 <[email protected]> wrote:

<snip>

Well, now I got to to go nice and fast by shutting off the
checks. Thanks for the answers, everyone.

James Kanze · Nov 14, 2007

news:[email protected]...

Click to expand...

[...]
And VC++ deserves an accolade, both for the quality of its
library and the quality of its compiler. And a criticism: I had
to grunge around in the sources to find the _SECURE_SCL define;
while it didn't take me more than a couple of minutes, that's
because I'm used to this sort of thing---it really should be
documented somewhere. (Maybe it is, but I don't know how to
find it.)

Click to expand...

It is quite well documented if you know where to look (search
for checked iterators at msdn), the problem is that they do
not tell you about it on the pages describing the containers
and iterators.

The problem is that it isn't documented with the rest of the
compiler flags controling optimization. I know the reason, and
I know that Microsoft isn't alone in this; the reason I was able
to find it so quickly perusing the sources is because I've had
to do the same thing so often with other compilers. (Where does
g++ document -D_GLIBCXX_DEBUG, etc., for example.)

Such documentation belongs on the page with the compiler code
generation options. (Even better, of course, would be to add a
real option to the compiler driver, which defines the symbol one
way or another.)

MS would recommend to use intrinsics (SIMD instructions) but
of course by then you have left everything called portability
behind. And I have no idea whether they would be useful for
this purpose.

Once you use assembler, you've abandonned portability. But some
things (like the handling of carry) can't be easily expressed in
C++, and yet are simple in assembler. (I once saw 2000 lines of
C, used for BCD arithmetic, rewritten in something like 40 lines
of assembler. In this case, the assembler was actually "higher
level" than C, since the target machine had hardware
instructions which did almost exactly what we needed. Of
course, the resulting assembler only worked on that one
machine.)

nmi · Nov 14, 2007

It could be worth a trial to take the iterators out from the function. Now you have this inside:

vector<Digit>::iterator ai(a->begin());
vector<Digit>::const_iterator bi(b->begin());
vector<Digit>::const_iterator ci(c->begin());

That means

creation,
initialization and
destruction

of the iterators in every run.

If you create the iterators outside and pass them into the function as (reference) parameters, surely you spare some time.

How much? Who knows... :dontknow:

You have to try.

nmi · Nov 14, 2007

I tried your code on VC6.

Release mode:
Without optimization:

array: 10 seconds
vector: 13 seconds

With optimization:

array: 3 seconds
vector: 2 seconds (yes!!)

Debug mode:

array: 8 secs
vector: 36 secs

Can I avoid the use of arrays in this situation or do I have to usethem?	47	Nov 23, 2007
Better way to program this?	28	Nov 3, 2007
Insertion Sort : C++ implementation 100 times slower than C implementation	7	Oct 31, 2011
Questions on implementing a floating point bignum package	2	Dec 5, 2007
Which is a better way to implement this algorithm?	19	Apr 20, 2008
std::find_if in vector	3	Nov 5, 2008
Is there in memory tree structure faster than std::map?	11	Aug 12, 2012
No boundschecking?	23	Jan 30, 2008

Use of std::vector slower than arrays?

Bo Persson

mike3

James Kanze

nmi

nmi

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads