Vectors vs Arrays performance


James Kanze

volatile is not relevant to this discussion; it was simply
used to illustrate that *at least one* extra pointer
dereference is required, even if this is subsequently cached in
a register. VC++ would optimize both assignments completely
away without it.

Volatile is very relevant, because anything which affects
optimization is very relevant when you start speaking of
generated code.
You *cannot* avoid *at least one* extra pointer dereference
when using std::vector compared to a stack based container or
array.
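
The claim, in a minimal sketch (assuming nothing more than a typical
std::vector implementation that keeps its elements in heap storage behind a
pointer; the function names are only illustrative):

#include <vector>

int from_stack()
{
    int a[100] = {};            // elements live directly in the frame
    return a[42];               // address = stack/frame pointer + constant
}

int from_vector(std::vector<int> const& v)
{
    return v[42];               // typically: first load v's data pointer,
                                // then load *(ptr + 42) -- the extra
                                // dereference, unless the pointer is
                                // already cached in a register
}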

Nevertheless, most compilers do, most of the time.
BTW I am not eschewing the use of std::vector due to an
optimization gain that varies wildly depending on use-case; I
hardly use stack based arrays now: I use std::vector a lot, or
my own stack based container.

Certainly. I understand that. I still think your basic precept
is wrong. Compilers are not at all as dumb as you seem to
think.
 

James Kanze

Well, thus far you have not said anything that would explain the
last part -- how you reach the "quite rare" figure.

It's based on two things: a knowledge of how compilers (and in
particular, their optimizers) work, and the actual uses I make
of arrays in my own code. Obviously, the second is very
personal, but really, how often do you have an array and
only access a single element?
The situation is like having points A and C. Case one: go in a straight
line A->C; case two: touch point B first. B may happen to be on the line, or
be moved there by some tricks -- or the whole trip may be ignored in the full
travel context, but the difference still goes only one way, and its being zero
depends on a bunch of 'chance' factors.
And those factors are the kind IMO you can't tell.

Nope. You can't tell. That's exactly my point. Until you
actually measure, you can't really tell which solution will be
fastest.
To actually have that, the compiler must:
- inline vector's constructor or last resize() operation that sets the data
address

Not really. What is required is that all of the functions in a
given sequence be inlined, so that the compiler can tell that
the pointer doesn't change.

I've already admitted that the fact that the operators are
functions, and not built-in operators, can have a negative
effect on performance. That's a different issue (sometimes
referred to as the abstraction penalty).
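
For illustration, a hand-written sketch of what "the pointer doesn't change"
buys the optimizer once everything is inlined -- only an approximation of
what a compiler might do, not a statement about any particular one:

#include <cstddef>
#include <vector>

// What the source says:
long sum(std::vector<int> const& v)
{
    long s = 0;
    for (std::size_t i = 0; i != v.size(); ++i)
        s += v[i];              // naively: reload the data pointer and
                                // the size on every iteration
    return s;
}

// What the optimizer is free to produce once it can see that nothing in
// the loop changes the vector (hand-hoisted equivalent):
long sum_hoisted(std::vector<int> const& v)
{
    long s = 0;
    int const* p = v.data();            // data pointer loaded once
    std::size_t const n = v.size();     // size loaded once
    for (std::size_t i = 0; i != n; ++i)
        s += p[i];
    return s;
}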
- the arbitrary code that follows until the next index op shall not use all
the available data registers

Agreed. That's usually the case, however -- or else why are you
using an array? (That's also the reason why I said that the
results depend on the architecture. Most of my work until
recently has been on Sparcs, with 32 registers, and of course,
optimizers have a much easier job there than on an Intel.)
- if the code has any of vector's operations, those also must
all be inlined, or the compiler must use special knowledge that
they do not change the state

Agreed, sort of. Inlining is actually irrelevant (except that
any real function calls will slow things down); what's important
is that the compiler can "see" the code in all of the functions
involved. Since std::vector is a template, this is guaranteed
unless the compiler implements export, and the standard library
uses it. Which as far as I know doesn't leave many compilers
out of the running.
To me it looks like too many assumptions tied to arbitrary/unknown
code.

Excuse me, but I'm not assuming anything. I'm arguing against a
statement which said "Accessing a vector's contents via
operator[] will always be slightly slower[...]". All I have to
do to prove my point is show one counter example.

[...]
How can you know critical paths of unknown/arbitrary code?

If it's a critical path, it will be in a tight loop; otherwise,
any extra access won't make a measurable difference. And
compilers do know how to optimize tight loops.
Why do you assume we think that? In many posts here and around, I'm
usually the one pointing out that a healthy optimizer shall
treat different-looking expressions the same.

I know. That's why your reaction here surprises me.

[...]
I thought we were discussing an 'equivalence' issue, and being
serious about it, not a contest to cheat each other or the
public.

I'm not sure about the equivalence issue---what is supposed to
be equivalent to what? But the fact remains that the one
example posted counts as a "falsified" benchmark, in the sense
that it looks at a very atypical example. A more typical
example would involve something like:

for ( int i = 0; i != v.size(); ++ i )
// something with v

Or perhaps a binary search (so the compiler couldn't trivially
remap the index manipulations into pointer arithmetic).
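
Something like the following, for instance -- a hedged sketch of the kind of
index manipulation meant here, not code from the benchmark under discussion:

#include <cstddef>
#include <vector>

// Index-based binary search: the subscripts depend on values computed
// inside the loop, so the compiler cannot simply rewrite the accesses as
// a single pointer marching through the array.
int search(std::vector<int> const& v, int key)
{
    std::size_t lo = 0;
    std::size_t hi = v.size();
    while (lo < hi) {
        std::size_t mid = lo + (hi - lo) / 2;
        if (v[mid] < key)
            lo = mid + 1;
        else
            hi = mid;
    }
    return (lo < v.size() && v[lo] == key) ? static_cast<int>(lo) : -1;
}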
In general, possibly; for this case those were irrelevant and
did not mess with the relevant content.
(FWIW: my case would use a virtual function returning a value
from a private member array. In my experience, virtual
functions are really great optimization blockers for a lot
of compilers---including VC++.)

Huh? Of course they are -- there is a call the compiler can't
follow,

All compilers can in a few situations, and some can in almost
all situations. But most can't, most of the time:). And I've
designed my bench harness to make it particularly difficult for
the compiler to follow, even for those which can do so in most
cases. (I ensure, for example, that there are a similar number
of calls which resolve to different virtual functions in the
same loop. Each time I enter the loop, however, it's the same
virtual function called each time through the loop, and a compiler
could potentially detect that and, given that there are only a
couple of different virtual functions, test before entering the
loop and generate as many different versions of the loop as
necessary, with the corresponding virtual function inlined in
each case. I've yet to see, or even to hear of, such a compiler,
however.)
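
For what it's worth, a rough sketch of the kind of harness described above;
all of the names here are invented, and the real harness is certainly more
elaborate:

#include <vector>

class Source
{
public:
    virtual ~Source() {}
    virtual int get(int i) const = 0;
};

class ArraySource : public Source
{
    int data[100];
public:
    ArraySource() { for (int i = 0; i != 100; ++i) data[i] = i; }
    virtual int get(int i) const { return data[i]; }
};

class ZeroSource : public Source
{
public:
    virtual int get(int) const { return 0; }
};

// Each call to run() uses a single Source, but different calls pass
// different dynamic types, so the compiler cannot devirtualize the call
// inside the loop without generating a separate version of the loop per
// possible target.
long run(Source const& s, int n)
{
    long sum = 0;
    for (int i = 0; i != n; ++i)
        sum += s.get(i % 100);
    return sum;
}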
and the called function has license to change anything,
including a private member vector's state -- in most practical
cases, even if you use const like crazy.
And there are many similar 'blockers'. That is why I don't
get your claim of 'rare'; my educated guess would allow
for at least one extra fetch somewhere -- and call its
omission a case of good luck.

It would be interesting to see some statistics on real code. As
I said, in the code I've seen, arrays are used because the code
will loop over the elements. Which means that the compiler has
an easy job of optimizing.
Err, I think I lost you; as I got it originally, the function
may do anything, and it is arbitrary.

There's a difference between what a function *may* do (legally,
according to the language), and what it is likely to do in
practice. How often do you have, in practice, virtual functions
which access a single array element?
IMO we all agreed that it *can* be hoisted.

In which case, the original statement is false.
Also, that even if it isn't, the overall speed difference is
likely amortized.

In sum, you agree with me. You will not *always* see a
performance loss.

[...]
Yet, denying away the difference is IMO not fair -- the users
are better off being aware of the difference.

Presumably, the output of the profiler should indicate the
difference, if there is one and it is in a critical path.

[...]
Claiming the difference as 'bullshit' is different.

What is "bullshit" is the statement that there will always be a
difference in performance, in favor of C style arrays. In
practice, it depends.
 

James Kanze

Accessing a vector's contents via operator[] will always be
slightly slower than accessing a local (stack based) array via
[] as you have to dereference a pointer with the vector
version.
That's bullshit. You have to dereference a pointer in both
cases. In the case of indexing, you have to calculate the
address (both for stack based and for std::vector), and
because these calculations are hidden in some functions with
std::vector (and probably do bounds checking as well),
I thought operator[] didn't do bounds checking (or not
usually) whilst at() did.

It's a quality of implementation issue. On most platforms,
operator[] will lead to an assertion failure, or something
similar, if there is a bounds error. But this is not required,
and all of the implementations I know of have a means of turning
this off.
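
A minimal illustration of the difference (assuming only what the standard
requires: at() must throw on a bad index, operator[] need not check):

#include <iostream>
#include <stdexcept>
#include <vector>

int main()
{
    std::vector<int> v(10);

    try {
        v.at(20) = 1;           // required to throw std::out_of_range
    } catch (std::out_of_range const& e) {
        std::cout << "at(): " << e.what() << '\n';
    }

    // v[20] = 1;               // undefined behaviour; a checked/debug
                                // build of the library may assert or
                                // abort here, while a release build
                                // typically does no checking at all
}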
 

Bo Persson

Leigh said:
Intuitively it is correct for the optimizer to call size() each
time, as the non-const version of operator[] should be being called;
why this is only happening for vecarray and not vector I do not
know.
/Leigh

A wild guess would be alias analysis.

In vecarray we have some nifty reinterpret_casts and accessing memory
through an array of char. We know that chars are sometimes allowed to
alias other objects.

In the std::vector, we have a pointer to an int buffer and a size_t,
which cannot possibly overlap.


Or it could be something completely different. :)
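
To make the guess concrete, a rough approximation of what the two layouts
look like to the alias analysis -- the vecarray shown here is only a
stand-in for the class actually under test:

#include <cstddef>

struct VecArrayish
{
    union { char buf[100 * sizeof(int)]; double align_; } storage;
    std::size_t count;
    int& operator[](std::size_t i)
    {
        // access goes through a char buffer: char may alias anything, so
        // a store through the returned reference could, as far as the
        // compiler can prove, also modify 'count'
        return reinterpret_cast<int*>(storage.buf)[i];
    }
    std::size_t size() const { return count; }
};

struct Vectorish
{
    int* data;                  // an int* and a size_t cannot alias each
    std::size_t count;          // other, so a store through data[i]
                                // cannot change 'count'
    int& operator[](std::size_t i) { return data[i]; }
    std::size_t size() const { return count; }
};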


Bo Persson
 

Balog Pal

Leigh Johnston said:
I would expect them to perform about the same for this use-case; as I said
in my O.P., the pointer dereference of operator[] can be optimized to be
done only once (cached in a register) before a loop.

It is annoying and strange that size() is not being optimized out, though,
in both compilers, VC++ and g++.

Just to try Bo's theory, can you test what happens if you simply define the
data holder as T[N], without the aligning union, and remove the
reinterpret_casts?

And if no change, make operator[] not call begin()?
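
Something along these lines, perhaps -- a guess at the simplified variant
being suggested; the class and member names are invented, and only the
shape of the change matters:

#include <cstddef>

template <typename T, std::size_t N>
class vecarray_plain
{
    T elems[N];                 // plain T[N]: no aligning union,
    std::size_t count;          // no reinterpret_cast
public:
    vecarray_plain() : count(0) {}
    T& operator[](std::size_t i) { return elems[i]; }          // no call to begin()
    T const& operator[](std::size_t i) const { return elems[i]; }
    std::size_t size() const { return count; }
    void push_back(T const& t) { elems[count++] = t; }
};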
 
