Use of std::vector slower than arrays?

B

Bo Persson

James Kanze wrote:
:
:: :
: [...]
:: I did some tests on my system and had dismal results also, but
:: in release things are a bit better, but not much.
:
:: Using your algorithm the array based was 6 seconds, the vector
:: based was 29 seconds.
:
:: Using vectors with the array based algorithm
:: arradd(&av[0], &av[0], &bv[0], 4);
:: is 16 seconds.
:
:: I tried all the optmizations I could find but couldn't get an
:: improvement over these values.
:
: Which compilers, and which options?
:
: I gave the complete version he posted a quick try on the systems
: I have at hand (a PC with VC++ 2005 and g++ 3.3.3---the CygWin
: port, but I don't think that affects CPU bound programs much---,
: an Intel PC with Linux and g++ 4.1.0, and a Sun Sparc with g++
: 4.1.0 and Sun CC 5.8---Sun CC comes with two different library
: implementations). The results are interesting, to say the
: least, and do indicate just how much the answer is
: implementation dependent. I multiplied the loop count by 10 to
: get measurable times. My compiler options were:
:
: VC++: cl /vmg /GR /EHs /D_SECURE_SCL=0 /O2
: g++: g++ -O3
: Sun CC: CC -O4 and CC -library=stlport4 -O4
:
: The results are interesting, to say the least:
:
: array vector
: Windows PC:
: VC++ 22 22
: g++ 30 38
:
: Linux PC:
: g++ 33 42
:
: Sun Sparc:
: g++ 73 77
: Sun standard 60 172
: Sun stlport 60 165
:

I just have to give another data point, showing that we *always* have
to measure to know for sure.

Again, VC++ 2005, but with an alternate standard library
implementation:

Test 1: 1000 million 128-bit additions using C arrays...
Test 1 complete.
CPU time elapsed: 23 seconds
Wall time elapsed: 24 seconds
Final sum: 23106FE7 44D0A28A 64EF219D 9803CB01

Test 2: 1000 million 128-bit additions using vector...
Test 2 complete.
CPU time elapsed: 14 seconds
Wall time elapsed: 14seconds
Final sum: 23106FE7 44D0A28A 64EF219D 9803CB01

For some reason, the compiler here decides that it is proper to fully
inline the vecadd function, saving a call and the parameter passing.
On the other hand, it decides to call arradd() out-of-line while
pushing and popping parameters.


A negative abstraction penalty? :)


Bo Persson
 
J

James Kanze

[...]
And VC++ deserves an accolade, both for the quality of its
library and the quality of its compiler. And a criticism: I had
to grunge around in the sources to find the _SECURE_SCL define;
while it didn't take me more than a couple of minutes, that's
because I'm used to this sort of thing---it really should be
documented somewhere. (Maybe it is, but I don't know how to
find it.)
It is quite well documented if you know where to look (search
for checked iterators at msdn), the problem is that they do
not tell you about it on the pages describing the containers
and iterators.

The problem is that it isn't documented with the rest of the
compiler flags controling optimization. I know the reason, and
I know that Microsoft isn't alone in this; the reason I was able
to find it so quickly perusing the sources is because I've had
to do the same thing so often with other compilers. (Where does
g++ document -D_GLIBCXX_DEBUG, etc., for example.)

Such documentation belongs on the page with the compiler code
generation options. (Even better, of course, would be to add a
real option to the compiler driver, which defines the symbol one
way or another.)
MS would recommend to use intrinsics (SIMD instructions) but
of course by then you have left everything called portability
behind. And I have no idea whether they would be useful for
this purpose.

Once you use assembler, you've abandonned portability. But some
things (like the handling of carry) can't be easily expressed in
C++, and yet are simple in assembler. (I once saw 2000 lines of
C, used for BCD arithmetic, rewritten in something like 40 lines
of assembler. In this case, the assembler was actually "higher
level" than C, since the target machine had hardware
instructions which did almost exactly what we needed. Of
course, the resulting assembler only worked on that one
machine.)
 

nmi

Joined
Jul 6, 2007
Messages
13
Reaction score
0
It could be worth a trial to take the iterators out from the function. Now you have this inside:

vector<Digit>::iterator ai(a->begin());
vector<Digit>::const_iterator bi(b->begin());
vector<Digit>::const_iterator ci(c->begin());


That means

  • creation,
  • initialization and
  • destruction
of the iterators in every run.

If you create the iterators outside and pass them into the function as (reference) parameters, surely you spare some time.

How much? Who knows... :dontknow:

You have to try.
 

nmi

Joined
Jul 6, 2007
Messages
13
Reaction score
0
I tried your code on VC6.

Release mode:
Without optimization:

array: 10 seconds
vector: 13 seconds

With optimization:

array: 3 seconds
vector: 2 seconds (yes!!)

Debug mode:

array: 8 secs
vector: 36 secs
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,777
Messages
2,569,604
Members
45,219
Latest member
KristieKoh

Latest Threads

Top