Iterating a std::vector vs iterating a std::map?

Discussion in 'C++' started by carl, Nov 24, 2009.

  1. carl

    carl Guest

    I have a std::vector and a std::map containing the same number of elements.
    Currently I experience that iterating all elements in the map is slower than
    iterating all elements in the vector. Is this true?
     
    carl, Nov 24, 2009
    #1

  2. carl

    White Wolf Guest

    carl wrote:
    > I have a std::vector and a std::map containing the same number of
    > elements. Currently I experience that iterating all elements in the map
    > is slower than iterating all elements in the vector. Is this true?


    Do you want us to decide if you are hallucinating or not? You have
    said: you are experiencing it. So I guess then it is true. :)

    If your question is: will iterating a map be slower than iterating a
    vector, the answer is: most certainly. But there is another question
    that must be asked: Should you care about it? And the answer is: In
    most cases you don't.

    See Scott Meyers: Effective STL for answers why a vector will be faster
    than a map. But also see the constraints, because you cannot always use
    a vector when you need a map.

    A map contains ordered elements. It is a node based container, normally
    implemented as a red-black tree. Because with a map you ask for
    "ordering", you pay for it with the iteration being somewhat slower.
    However, that "slower" may not matter in your application at all if
    what you do with the elements of the map is slower still. For example,
    making iteration faster won't make any difference if you are printing
    out the elements and waiting for the terminal to print and scroll...

    --
    BR, WW
     
    White Wolf, Nov 24, 2009
    #2

  3. carl

    carl Guest

    "White Wolf" <> wrote in message
    news:4b0b31d3$0$577$4all.se...
    > carl wrote:
    >> I have a std::vector and a std::map containing the same number of
    >> elements. Currently I experience that iterating all elements in the map
    >> is slower that iterating all elements in the vector. Is this true ?

    >
    > Do you want us to decide if you are hallucinating or not? You have said:
    > you are experiencing it. So I guess then it is true. :)
    >
    > If your question is: will iterating a map be slower than iterating a
    > vector, the answer is: most certainly. But there is another question that
    > must be asked: Should you care about it? And the answer is: In most cases
    > you don't.



    For each voxel in a 3D volume (typically containing 512*512*512 voxels) I
    need to iterate over a number of objects (typically 0 - 30 elements). I
    have reduced the number of objects stored in the std::map using a kd-tree,
    but I still need to iterate over all the elements in some of the operations.

    Before, I simply stored all elements in a std::vector and iterated over
    them all. Now I have changed to an implementation using a std::map and a
    kd-tree, but even though the number of elements is reduced drastically I
    get much worse performance compared to iterating over all elements in the
    std::vector (which typically contained 216 elements).
     
    carl, Nov 24, 2009
    #3
  4. carl

    James Kanze Guest

    On Nov 24, 7:06 am, Paavo Helde <> wrote:
    > "carl" <carl@.com> wrote in news:4b0b34ea$0$280$:


    [...]
    > Note that using standard containers in performance-critical
    > code parts is not straightforward and can kill the performance
    > quite easily if one does not have a clear overview what's
    > happening behind the scenes. I'm not advocating against using
    > std::vector, but one has to be aware of the potential problems
    > and how to work around them. Notably the swap() member
    > function comes often handy in performance-critical code.


    As a general rule, you shouldn't use standard classes, or
    classes from any general library, as part of the application
    abstractions. You should always define your own classes, with
    the exact interface you need (and no more). The standard
    classes only appear in the implementation of these classes, so
    you can swap them in or out as needed (or even replace them with
    a custom implementation, if none of the standard classes meets
    your needs).

    > If the code still seems too slow, then one should use
    > profiling to see if there are any unexpected bottlenecks. You
    > cannot rely on profiling only, because if you write uniformly
    > inefficient code all over the place there will be no
    > bottlenecks.


    Sure there will. The code which is executed most often will be
    the bottleneck. You can generally start by not worrying too
    much about efficiency; if you've encapsulated correctly, there
    will then be no problem improving the efficiency of the places
    which really are bottlenecks, i.e. the places which are executed
    the most often.

    --
    James Kanze
     
    James Kanze, Nov 24, 2009
    #4
  5. carl

    James Kanze Guest

    On Nov 24, 4:37 pm, Juha Nieminen <> wrote:
    > James Kanze wrote:
    > > As a general rule, you shouldn't use standard classes, or
    > > classes from any general library, as part of the application
    > > abstractions. You should always define your own classes,
    > > with the exact interface you need (and no more). The
    > > standard classes only appear in the implementation of these
    > > classes, so you can swap them in or out as needed (or even
    > > replace them with a custom implementation, if none of the
    > > standard classes meets your needs).


    > While in certain large projects it's a good idea to abstract
    > implementation details away as much as possible, in smaller
    > projects trying to abstract everything away can be more work
    > than it's worth.


    "As a general rule" doesn't mean "absolutely with no
    exceptions". In general, in all large projects and in most
    small projects, it's worth defining your application specific
    abstractions. If you're writing an editor, you don't use
    std::string for your text buffer, you use TextBuffer, a class
    that you've designed. (Of course, at least at the beginning,
    TextBuffer will use std::string in its implementation.)

    > For example, suppose you have some function or member function which
    > takes, let's say, a std::vector as parameter:
    >
    > class Something
    > {
    > public:
    >     void foo(const std::vector<unsigned>& values);
    > };


    > This is very unabstract code. It hard-codes both the data
    > container and the element type. One would think that it's a
    > good idea to abstract that away a bit, for example:


    > class Something
    > {
    > public:
    >     typedef std::vector<unsigned> ValueContainer;
    >     void foo(const ValueContainer& values);
    > };


    > Now if all the outside code uses Something::ValueContainer
    > instead of std::vector<unsigned>, everything is well? Maybe,
    > except that that typedef doesn't enforce anything. You can
    > still see that what's being used is really a
    > std::vector<unsigned>, and nothing stops you from using that
    > type directly and passing instances of it to Something::foo().


    First, it depends. Is this ValueContainer fundamentally part of
    your application abstraction? If so, it should be a class, and
    not a typedef. A class which exposes the interface defined by
    the abstraction in your application, and not the interface
    defined by std::vector. If not, if the abstraction is precisely
    that of std::vector, then std::vector is fine.

    Having said that, it's sometimes a bit delicate. I have more
    than a few cases where the abstraction does wrap a standard
    container, like std::vector, *and* includes iterators. In such
    cases, it's very tempting to use a (member) typedef to define
    the iterators, which does lock me into random access iterators,
    even if the abstraction doesn't require them. I've thought
    about creating a template to "downgrade" iterators, but I've not
    gotten around to it.

    > Moreover, even if the outside code would strictly adhere to
    > always using Something::ValueContainer, exactly which member
    > functions of that type is it allowed to use? Since it is a
    > std::vector, all of the std::vector functions are free to be
    > used, making it more difficult to change the type of
    > ValueContainer later to something else. You quickly notice
    > that, in fact, you didn't abstract anything away at all with
    > that typedef. You are just using an alias.


    In sum: typedef doesn't define a new abstraction. What else is
    new?

    > So if you want to truly abstract the type away you have to do
    > it like:


    > class Something
    > {
    > public:
    >     class ValueContainer;
    >     void foo(const ValueContainer& values);
    > };


    > and then you define Something::ValueContainer as a class which
    > contains the necessary member functions for passing the
    > numerical values to Something::foo().


    > The problem? Now Something::ValueContainer will be way more
    > limited than std::vector is. For example, the calling code
    > might benefit from things like random access with that data
    > container, so if ValueContainer doesn't provide it, it hinders
    > what the code can do.


    That's not the problem. That's rather exactly what we're trying
    to achieve. My code guarantees a certain functionality for
    ValueContainer, and only that functionality. Client code can't
    use more than I guarantee (in theory, anyway), so I'm free to
    change the implementation anyway I want, as long as I maintain
    that functionality.

    It's called encapsulation, and it's essential if you want to be
    able to optimize the code later. Or evolve it in any other way.

    > Of course this is intentional: By not providing random access
    > you are more free to change the internal data structure to
    > something else if needed (eg. std::list). However, by being
    > too careful like this, you are now hindering the outside code.


    You want to define a contract for the outside code. Ideally,
    you'll define the contract in such a way that the outside code
    can do whatever it needs to do, and you maintain all the
    necessary freedom to implement the code however you want. Even
    using a typedef to std::vector "hinders" outside code: it can't
    hold an iterator into the container over an insertion, for
    example. It's up to you, as the designer, to decide what you
    want (or need) to support, and what you don't.

    > What you could do is to add a way to initialize a
    > Something::ValueContainer with a set of values (from a
    > container or an iterator range). However, what you have done
    > now is effectively move the abstraction problem from Something
    > to Something::ValueContainer, achieving only little advantage.
    > You could as well have done that with Something::foo()
    > directly.


    > It might also introduce an inefficiency because now values
    > need to be copied around instead of the calling code just
    > using the same container for both whatever it needs to do (eg.
    > requiring random access) and making Something read from it, so
    > the values don't need to be copied anywhere just for
    > Something::foo() to read them.


    I'm not sure I fully understand what you're getting at in the
    above two paragraphs, but one thing is sure: *IF* performance is
    ever an issue, you'd better hide all implementation details
    behind such an abstraction, and limit client code as much as
    possible, or you'll be overly constrained in your optimization
    possibilities.

    Don't forget, if you've not provided random access, and it later
    becomes evident that client code needs it, you can always add
    it. Adding functionality causes no problems. Removing it, on
    the other hand, breaks client code. So you don't want to
    provide anything you're not sure the client needs.

    --
    James Kanze
     
    James Kanze, Nov 25, 2009
    #5
  6. carl

    James Kanze Guest

    On Nov 24, 5:57 pm, Paavo Helde <> wrote:
    > James Kanze <> wrote in news:5cdb6f3a-e2e0-48c1-
    > :


    > > On Nov 24, 7:06 am, Paavo Helde <> wrote:
    > >> "carl" <carl@.com> wrote in news:4b0b34ea$0$280$14726298


    > @news.sunsite.dk:


    > >> If the code still seems too slow, then one should use
    > >> profiling to see if there are any unexpected bottlenecks. You
    > >> cannot rely on profiling only, because if you write uniformly
    > >> inefficient code all over the place there will be no
    > >> bottlenecks.


    > > Sure there will. The code which is executed most often will be
    > > the bottleneck. You can generally start by not worrying too
    > > much about efficiency; if you've encapsulated correctly, there
    > > will then be no problem improving the efficiency of the places
    > > which really are bottlenecks, i.e. the places which are executed
    > > the most often.


    > Yes, *if encapsulated correctly*, there is no problem.
    > However, I'm afraid that the people writing inefficient code
    > are not very proficient in things like encapsulation, either.


    Yes. People who write bad code will write bad code. :) They
    have to be taught. But it's easy to check for proper
    encapsulation during code review. On the other hand, it's
    almost impossible to see where the bottlenecks will be until
    you've actually measured. (At the lowest level, at least.
    Obviously, if someone's used an O(n^2) algorithm, and you know
    that there will be millions of elements to deal with, you don't
    need the profiler to know that it's not going to work. You
    should catch things like that in code review as well.)

    > An example would be that somebody writes all over the place:


    > A* a = new A();
    > a->f();
    > delete a;


    > If operator new pops up in the profiler, it is not clear which
    > use cases are legitimate, which are not, and fixing the
    > problem is not so simple as it must be done in many places. If
    > it does not pop up in the profiler, it just makes the program
    > slower by a few percent and nobody ever fixes it. If there are
    > many such things, those few percent all add up...


    The problem above is deeper. C++ supports value operations, and
    a programmer who doesn't use them needs to be taught C++. It
    has nothing to do with performance (at least not in the first
    line); it's a question of semantics. The difference between:

    A* a = new A();
    a->f();
    delete a;

    and

    A a;
    a.f();

    isn't just performance---it's also number of lines of code
    written, what happens if an exception is thrown, and what
    happens if (later) someone assigns a to another variable (of the
    same type). And all of those issues are more important than
    performance, since they affect program correctness and
    maintainability.

    (In most languages, for a variety of reasons, doing things the
    idiomatically correct way will result in better performance
    overall. In C++, this is particularly true. But in all cases,
    the reason for doing them in this way is because it is the
    idiomatically correct way, and not for performance reasons per
    se.)

    --
    James Kanze
     
    James Kanze, Nov 25, 2009
    #6
