Iterating a std::vector vs iterating a std::map?

C

carl

I have a std::vector and a std::map containing the same number of elements.
Currently I experience that iterating all elements in the map is slower that
iterating all elements in the vector. Is this true ?
 
W

White Wolf

carl said:
I have a std::vector and a std::map containing the same number of
elements. Currently I experience that iterating all elements in the map
is slower that iterating all elements in the vector. Is this true ?

Do you want us to decide if you are hallucinating or not? You have
said: you are experiencing it. So I guess then it is true. :)

If your question is: will iterating a map be slower than iterating a
vector, the answer is: most certainly. But there is another question
that must be asked: Should you care about it? And the answer is: In
most cases you don't.

See Scott Meyers: Effective STL for answers why a vector will be faster
than a map. But also see the constraints, because you cannot always use
a vector when you need a map.

A map contains ordered elements. It is a node based container, normally
implemented as a red-black tree. Because with a map you ask for
"ordering", you pay for it with the iteration being somewhat slower.
However that "slower" may not matter in your application at all, if what
you do with the elements of the map is slower. For example making
iteration faster won't make any different if you are printing out the
elements and you are waiting for the terminal to print and scroll...
 
C

carl

White Wolf said:
Do you want us to decide if you are hallucinating or not? You have said:
you are experiencing it. So I guess then it is true. :)

If your question is: will iterating a map be slower than iterating a
vector, the answer is: most certainly. But there is another question that
must be asked: Should you care about it? And the answer is: In most cases
you don't.


For each voxel in a 3D volume (that typically contains 512*512*512 voxels) I
need to iterate a number of objects (typically from 0 - 30 elements). I
have reduced the numbers of objects stored in the std::map using a kd-tree.
But still I need to iterate all the elements in some of the operations.

Before I was just storing all elements in a std::vector and iterated them
all. Now I have changed to an implementation using a std::map and a kd-tree
but even though the number of elements are reduced drastically I get a much
worse performance compared to iterating all elements in the std::vector
(typically it contained 216 elements).
 
J

James Kanze


[...]
Note that using standard containers in performance-critical
code parts is not straightforward and can kill the performance
quite easily if one does not have a clear overview what's
happening behind the scenes. I'm not advocating against using
std::vector, but one has to be aware of the potential problems
and how to work around them. Notably the swap() member
function comes often handy in performance-critical code.

As a general rule, you shouldn't use standard classes, or
classes from any general library, as part of the application
abstractions. You should always define your own classes, with
the exact interface you need (and no more). The standard
classes only appear in the implementation of these classes, so
you can swap them in or out as needed (or even replace them with
a custom implementation, if none of the standard classes meets
your needs).
If the code still seems too slow, then one should use
profiling to see if there are any unexpected bottlenecks. You
cannot rely on profiling only, because if you write uniformly
unefficient code all over the place there will be no
bottlenecks.

Sure there will. The code which is executed most often will be
the bottleneck. You can generally start by not worrying too
much about efficiency; if you've encapsulated correctly, there
will then be no problem improving the efficiency of the places
which really are bottlenecks, i.e. the places which are executed
the most often.
 
J

James Kanze

While in certain large projects it's a good idea to abstract
implementation details away as much as possible, in smaller
projects trying to abstract everything away can be more work
than it's worth.

"As a general rule" doesn't mean "absolutely with no
exceptions". In general, in all large projects and in most
small projects, it's worth defining your application specific
abstractions. If you're writing an editor, you don't use
std::string for your text buffer, you use TextBuffer, a class
that you've designed. (Of course, at least at the beginning,
TextBuffer will use std::string in its implementation.)
For example, suppose you have some function or member function which
takes, let's say, a std::vector as parameter:

class Something
{
public:
void foo(const std::vector<unsigned>& values);
};
This is very unabstract code. It hard-codes both the data
container and the element type. One would think that it's a
good idea to abstract that away a bit, for example:
class Something
{
public:
typedef std::vector<unsigned> ValueContainer;
void foo(const ValueContainer& values);
};
Now if all the outside code uses Something::ValueContainer
instead of std::vector<unsigned>, everything is well? Maybe,
except that that typedef doesn't enforce anything. You can
still see that what's being used is really a
std::vector<unsigned>, and nothing stops you from using that
type directly and passing instances of it to Something::foo().

First, it depends. Is this ValueContainer fundamentally part of
your application abstraction? If so, it should be a class, and
not a typedef. A class which exposes the interface defined by
the abstraction in your application, and not the interface
defined by std::vector. If not, if the abstraction is precisely
that of std::vector, then std::vector is fine.

Having said that, it's sometimes a bit delicate. I have more
than a few cases where the abstraction does wrap a standard
container, like std::vector, *and* includes iterators. In such
cases, it's very tempting to use a (member) typedef to define
the iterators, which does lock me into random access iterators,
even if the abstraction doesn't require them. I've thought
about creating a template to "downgrade" iterators, but I've not
gotten around to it.
Moreover, even if the outside code would strictly adhere to
always using Something::ValueContainer, exactly which member
functions of that type is it allowed to use? Since it is a
std::vector, all of the std::vector functions are free to be
used, making it more difficult to change the type of
ValueContainer later to something else. You quickly notice
that, in fact, you didn't abstract anything away at all with
that typedef. You are just using an alias.

In sum: typedef doesn't define a new abstraction. What else is
new?
So if you want to truly abstract the type away you have to do
it like:
class Something
{
public:
class ValueContainer;
void foo(const ValueContainer& values);
};
and then you define Something::ValueContainer as a class which
contains the necessary member functions for passing the
numerical values to Something::foo().
The problem? Now Something::ValueContainer will be way more
limited than std::vector is. For example, the calling code
might benefit from things like random access with that data
container, so if ValueContainer doesn't provide it, it hinders
what the code can do.

That's not the problem. That's rather exactly what we're trying
to achieve. My code guarantees a certain functionality for
ValueContainer, and only that functionality. Client code can't
use more than I guarantee (in theory, anyway), so I'm free to
change the implementation anyway I want, as long as I maintain
that functionality.

It's called encapsulation, and it's essential if you want to be
able to optimize the code later. Or evolve it in any other way.
Of course this is intentional: By not providing random access
you are more free to change the internal data structure to
something else if needed (eg. std::list). However, by being
too careful like this, you are now hindering the outside code.

You want to define a contract for the outside code. Ideally,
you'll define the contract in such a way that the outside code
can do whatever it needs to do, and you maintain all the
necessary freedom to implement the code however you want. Even
using a typedef to std::vector "hinders" outside code: it can't
hold an iterator into the container over an insertion, for
example. It's up to you, as the designer, to decide what you
want (or need) to support, and what you don't.
What you could do is to add a way to initialize a
Something::ValueContainer with a set of values (from a
container or an iterator range). However, what you have done
now is effectively move the abstraction problem from Something
to Something::ValueContainer, achieving only little advantage.
You could as well have done that with Something::foo()
directly.
It might also introduce an inefficiency because now values
need to be copied around instead of the calling code just
using the same container for both whatever it needs to do (eg.
requiring random access) and making Something read from it, so
the values don't need to be copied anywhere just for
Something::foo() to read them.

I'm not sure I fully understand what you're getting at in the
above two paragraphs, but one thing is sure: *IF* performance is
ever an issue, you'd better hide all implementation details
behind such an abstraction, and limit client code as much as
possible, or you'll be overly constrained in you optimization
possibilities.

Don't forget, if you've not provided random access, and it later
becomes evident that client code needs it, you can always add
it. Adding functionality causes no problems. Removing it, on
the other hand, breaks client code. So you don't want to
provide anything you're not sure the client needs.
 
J

James Kanze

(e-mail address removed):
Yes, *if encapsulated correctly*, there is no problem.
However, I'm afraid that the people writing unefficient code
are not very proficient in things like encapsulation, either.

Yes. People who write bad code will write bad code:). They
have to be taught. But it's easy to check for proper
encapsulation during code review. On the other hand, it's
almost impossible to see where the bottlenecks will be until
you've actually measured. (At the lowest level, at least.
Obviously, if someone's used an O(n^2) algorithm, and you know
that there will be millions of elements to deal with, you don't
need the profiler to know that it's not going to work. You
should catch things like that in code review as well.)
An example would be that somebody writes all over the place:
A* a = new A();
a->f();
delete a;
If operator new pops up in the profiler, it is not clear which
use cases are legitimate, which are not, and fixing the
problem is not so simple as it must be done in many places. If
it does not pop up in the profiler, it just makes the program
slower by few percents and nobody ever fixes it. If there are
many such things, these few percents are all added up...

The problem above is deeper. C++ supports value operations, and
a programmer who doesn't use them needs to be taught C++. It
has nothing to do with performance (at least not in the first
line); it's a question of semantics. The difference between:

A* a = new A();
a->f();
delete a;

and

A a;
a.f();

isn't just performance---it's also number of lines of code
written, what happens if an exception is thrown, and what
happens if (later) someone assigns a to another variable (of the
same type). And all of those issues are more important than
performance, since they affect program correctness and
maintainability.

(In most languages, for a variety of reasons, doing things the
idiomatically correct way will result in better performance,
over all. In C++, this is particularly true. But in all cases,
the reason for doing them in this way is because it is the
idiomatically correct way, and not for performance reasons per
se.)
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,731
Messages
2,569,432
Members
44,832
Latest member
GlennSmall

Latest Threads

Top