ordered unordered_map

Discussion in 'C++' started by Ralf Goertz, Feb 2, 2010.

  1. Ralf Goertz

    Ralf Goertz Guest

    Hi,

    I recently found out that using unordered_map<unsigned, Foo> instead of
    map<unsigned, Foo> speeds up my program substantially (it contains
    several hundreds of thousands elements). However, I really would like to
    be able to iterate over those elements in an ordered way at the end of
    the program. Is there a way to do that? If m would be that unordered map
    what about m.rehash(n) where n=m.size()? According to Pete Pecker's book
    that would "resize the container so that it has at least n buckets…".
    And aren't the keys for integral types hashed by putting them into
    residue classes mod bucket size? If so, m would be ordered, right?

    Ralf
    Ralf Goertz, Feb 2, 2010
    #1
    1. Advertising

  2. Ralf Goertz

    Ralf Goertz Guest

    Pete Becker wrote:

    > Ralf Goertz wrote:
    >> Hi,
    >>
    >> I recently found out that using unordered_map<unsigned, Foo> instead of
    >> map<unsigned, Foo> speeds up my program substantially (it contains
    >> several hundreds of thousands elements). However, I really would like to
    >> be able to iterate over those elements in an ordered way at the end of
    >> the program. Is there a way to do that? If m would be that unordered map
    >> what about m.rehash(n) where n=m.size()? According to Pete Pecker's book
    >> that would "resize the container so that it has at least n buckets…".
    >> And aren't the keys for integral types hashed by putting them into
    >> residue classes mod bucket size? If so, m would be ordered, right?
    >>

    >
    > No, that almost certainly wouldn't work. The problem is the hash values:
    > if two elements hash to the same value they end up in the same bucket,
    > regardless of how many buckets there are. And even if they hash to
    > different values, they can end up in the same bucket, because the hash
    > container has to adjust hash values to align with the number of buckets
    > (typically by taking the remainder modulo the number of buckets). You
    > might be able to do this with a custom hash function that returned a
    > value equal to the desired bucket, but that seems abusive.


    So if it would be guaranteed that m contains exactly n keys in the range
    [0,n) then rehashing could do the trick? Of course, in that case I could
    use a vector in the first place.

    > If you're willing to reorganize the container, how about going a small
    > step further and replacing it at that point with an ordered container?


    It seems I will have to do that then, thanks.
    Ralf Goertz, Feb 2, 2010
    #2
    1. Advertising

  3. Ralf Goertz

    tonydee Guest

    On Feb 2, 10:30 pm, Ralf Goertz <> wrote:
    > Pete Becker wrote:
    > > Ralf Goertz wrote:
    > >> I recently found out that using unordered_map<unsigned, Foo> instead of
    > >> map<unsigned, Foo> speeds up my program substantially (it contains
    > >> several hundreds of thousands elements). However, I really would like to
    > >> be able to iterate over those elements in an ordered way at the end of
    > >> the program. Is there a way to do that? ...

    >
    > > If you're willing to reorganize the container, how about going a small
    > > step further and replacing it at that point with an ordered container?

    >
    > It seems I will have to do that then, thanks.


    It's a good, simple solution in most cases. If the objects are large
    and keys small and your RAM limited, and/or that final step is
    performance critical too, consider
    - keeping the data in an ordered map, but adding an unordered (hash)
    map indexing from key to ordered-map iterator
    - keep the data in a vector, and having unordered and ordered maps
    from key to index
    - looking at boost's multi-indexed container library, which can manage
    the synchronisation of data and indices for you

    Cheers,
    Tony
    tonydee, Feb 3, 2010
    #3
  4. Ralf Goertz

    Ralf Goertz Guest

    tonydee wrote:

    > On Feb 2, 10:30 pm, Ralf Goertz <> wrote:
    >> Pete Becker wrote:
    >>
    >> > If you're willing to reorganize the container, how about going a small
    >> > step further and replacing it at that point with an ordered container?

    >>
    >> It seems I will have to do that then, thanks.

    >
    > It's a good, simple solution in most cases. If the objects are large
    > and keys small and your RAM limited, and/or that final step is
    > performance critical too, consider
    > - keeping the data in an ordered map, but adding an unordered (hash)
    > map indexing from key to ordered-map iterator
    > - keep the data in a vector, and having unordered and ordered maps
    > from key to index
    > - looking at boost's multi-indexed container library, which can manage
    > the synchronisation of data and indices for you


    These are worth trying, especially since I still need to find() in the
    map. The reason why I want an ordered iterator is that I'd like to
    compare all elements with each other and erase duplicate (type Foo)
    values keeping the one with the lower (type unsigned) key (it is more
    complicated than that but you get the point). So I use two nested loops:

    typedef unordered_map<unsigned, Foo> Map;
    typedef unordered_set<unsigned> Set;
    Map m;
    Map::iterator i1,i2;
    Set s;
    Set::iterator i3;

    void find_elements_in_m_that_need_to_be_compared(unsigned to_this_element,Set&);

    bool foo_equal(const Foo &,const Foo&);

    for (i1=m.begin();i1!=m.end();++i1) {
    find_elements_in_m_that_need_to_be_compared(i1->first,s);
    for (i3=s.begin();i3!=s.end();++i3) {
    i2=m.find(*i3);
    if (foo_equal(i1->second,i2->second)) {
    m.erase(i2); // (*)
    }
    }
    }

    Since m is not ordered, I cannot be sure that i1->first is less than
    i2->first. Therefore I might erase the object with the lower key at (*).
    OTOH unordered_map has an erase(iterator, iterator) member function.
    This function seems to be of little use since the key 42 might be in the
    range [8,15).
    Ralf Goertz, Feb 3, 2010
    #4
  5. Ralf Goertz

    tonydee Guest

    On Feb 3, 9:27 pm, Ralf Goertz <> wrote:
    > tonydee wrote:
    > > On Feb 2, 10:30 pm, Ralf Goertz <> wrote:
    > >> Pete Becker wrote:

    >
    > >> > If you're willing to reorganize the container, how about going a small
    > >> > step further and replacing it at that point with an ordered container?

    >
    > >> It seems I will have to do that then, thanks.

    >
    > > It's a good, simple solution in most cases.  If the objects are large
    > > and keys small and your RAM limited, and/or that final step is
    > > performance critical too, consider
    > > - keeping the data in an ordered map, but adding an unordered (hash)
    > > map indexing from key to ordered-map iterator
    > > - keep the data in a vector, and having unordered and ordered maps
    > > from key to index
    > > - looking at boost's multi-indexed container library, which can manage
    > > the synchronisation of data and indices for you

    >
    > These are worth trying, especially since I still need to find() in the
    > map. The reason why I want an ordered iterator is that I'd like to
    > compare all elements with each other and erase duplicate (type Foo)
    > values keeping the one with the lower (type unsigned) key (it is more
    > complicated than that but you get the point). So I use two nested loops:
    >
    > typedef unordered_map<unsigned, Foo> Map;
    > typedef unordered_set<unsigned> Set;
    > Map m;
    > Map::iterator i1,i2;
    > Set s;
    > Set::iterator i3;
    >
    > void find_elements_in_m_that_need_to_be_compared(unsigned to_this_element,Set&);
    >
    > bool foo_equal(const Foo &,const Foo&);
    >
    > for (i1=m.begin();i1!=m.end();++i1) {
    >         find_elements_in_m_that_need_to_be_compared(i1->first,s);
    >         for (i3=s.begin();i3!=s.end();++i3) {
    >                 i2=m.find(*i3);
    >                 if (foo_equal(i1->second,i2->second)) {
    >                         m.erase(i2); // (*)
    >                 }
    >         }
    >
    > }
    >
    > Since m is not ordered, I cannot be sure that i1->first is less than
    > i2->first. Therefore I might erase the object with the lower key at (*).
    > OTOH unordered_map has an erase(iterator, iterator) member function.
    > This function seems to be of little use since the key 42 might be in the
    > range [8,15).


    I'm not sure how often you need to "clean up" these duplicates, but if
    Foo has (or can easily be given) a operator<, you may find it better
    to have an (ordered) map (something like "std::map<Foo*, unsigned,
    Compare_Via_Ptr<Foo*> >" with template <typename T> struct
    Compare_Via_Ptr { bool operator()(const Foo* lhs, const Foo* rhs)
    const { return *lhs < *rhs; } }; )...

    That way you can do a quick lookup to see if an "incoming" Foo is
    already stored, preventing the storage of duplicates.

    Cheers,
    Tony
    tonydee, Feb 4, 2010
    #5
  6. Ralf Goertz

    Ralf Goertz Guest

    tonydee wrote:

    > I'm not sure how often you need to "clean up" these duplicates, but if
    > Foo has (or can easily be given) a operator<, you may find it better
    > to have an (ordered) map (something like "std::map<Foo*, unsigned,
    > Compare_Via_Ptr<Foo*> >" with template <typename T> struct
    > Compare_Via_Ptr { bool operator()(const Foo* lhs, const Foo* rhs)
    > const { return *lhs < *rhs; } }; )...
    >
    > That way you can do a quick lookup to see if an "incoming" Foo is
    > already stored, preventing the storage of duplicates.


    That is impossible. Only after thorough investigation of the whole set I
    know that two given objects of type Foo are the same. And I not only
    erase the duplicates but I also transfer information that is stored in
    the one to be erased.

    I have a solution for the problem (breaking the inner loop) but I
    consider it being error prone and inelegant. That's why I hoped to find
    a smarter one. Using one more level of indirection as you suggested
    might be the way to go. Thanks for your input.
    Ralf Goertz, Feb 4, 2010
    #6
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Paulo Matos

    Template problem with unordered_map

    Paulo Matos, Aug 3, 2006, in forum: C++
    Replies:
    4
    Views:
    442
    Paulo Matos
    Aug 3, 2006
  2. Rares Vernica

    error with tr1 unordered_map iterator

    Rares Vernica, Feb 24, 2007, in forum: C++
    Replies:
    6
    Views:
    1,787
  3. abir
    Replies:
    6
    Views:
    832
    W Karas
    Jun 26, 2008
  4. Ralf Goertz

    ordered unordered_map

    Ralf Goertz, Feb 2, 2010, in forum: C++
    Replies:
    0
    Views:
    337
    Ralf Goertz
    Feb 2, 2010
  5. DL

    Ordered list inside ordered list

    DL, Nov 9, 2009, in forum: Javascript
    Replies:
    6
    Views:
    304
    Dr J R Stockton
    Nov 21, 2009
Loading...

Share This Page