STL hashmap

Daniel Heiserer · Jan 13, 2004

Hello,

I used a unique associative container such as:
//---------------
map<vector<int>,double> X;
//---------------

In general I fill X with millions of entries.

Sorting plays no role for me. All I need is uniqueness
of the keys and speed when inserting them.
At the end I retrieve them using iterators from the beginning
to the end, which should lead to constant complexity.

Memory requirement is important but less than speed.

Unfortunately the used "map" is too slow.
I need a faster implementation.

I found hash_map as an alternative, but it does not work
by simply replacing "map" with "hash_map".

Can anybody help me out?
So is hash_map part of the STL?
If not, where can I get a good implementation of it?

I use g++ 3.2.2 on linux.

-- thanks, daniel

tom_usenet · Jan 13, 2004

Hello,

I used a unique associative container such as:
//---------------
map<vector<int>,double> X;
//---------------

In general I fill X with millions of entries.

What code do you use to add elements? There may be a more efficient
way.

Sorting plays no role for me. All I need is uniqueness
of the keys and speed when inserting them.

Why are you using vector<int> as your key? Is there a fixed maximum
length for the vector? How do you create the vector? You would probably
get an improvement from a custom key class.

At the end I retrieve them using iterators from the beginning
to the end, which should lead to constant complexity.

Memory requirement is important but less than speed.

Unfortunately the used "map" is too slow.
I need a faster implementation.

I found hash_map as an alternative, but it does not work
by simply replacing "map" with "hash_map".

No, you need to provide a hashing function at least.

Can anybody help me out?
So is hash_map part of the STL?

No. unordered_map is part of the draft standard library technical
report (TR1). It is basically hash_map by another name, with a few
differences from the various different hash_map versions in use.

If not, where can I get a good implementation of it?

I use g++ 3.2.2 on linux.

GCC comes with an implementation <ext/hash_map>. I don't know how good
it is. Docs are here:

http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#1

You might be able to make std::map fast enough if you optimize your
use of it...

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html

Daniel Heiserer · Jan 13, 2004

What code do you use to add elements? There may be a more efficient
way.

vector<int> a;
double b;
X[a]=b;
// or
X[a]+=b;

Why are you using vector<int> as your key? Is there a fixed maximum
length for the vector? How do you create the vector? You would probably
get an improvement from a custom key class.

For each of my maps the vectors have the same length, but I want to
use different maps, e.g. one with vectors of size 3 and others with
longer or shorter ones. Before I "add" my element I call
"resize(length)" on the vector.
map<vector<int>,double> X; // vector length 3
map<vector<int>,double> Y; // vector length 5

Of course this distinction has to be made during runtime.
How much would a

hash_map<int[3],double> X;

speed that up? And how much memory would I save?

No, you need to provide a hashing function at least.

How do I define one for vectors?

No. unordered_map is part of the draft standard library technical
report (TR1). It is basically hash_map by another name, with a few
differences from the various different hash_map versions in use.

GCC comes with an implementation <ext/hash_map>. I don't know how good
it is. Docs are here:

http://gcc.gnu.org/onlinedocs/libstdc++/ext/howto.html#1

You might be able to make std::map fast enough if you optimize your
use of it...

Well a suggestion might be useful ;-)
Again: The time critical part is insertion and therefore checking
for duplicate ones. In maybe 50% of the cases the next insertion
"might" be "close" to the previous one so this might help to speed it
up.
After that I only retrieve all pairs from the beginning to the end
which should be of complexity O(1) using the iterators.

I would appreciate if I could use as much of the standard container
functions as possible and avoid writing new templates and classes.

-- thanks, daniel

tom_usenet · Jan 13, 2004

What code do you use to add elements? There may be a more efficient
way.

vector<int> a;
double b;
X[a]=b;
// or
X[a]+=b;

Ok, first improvement is to insert using:

X.insert(mymaptype::value_type(a, b));
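
One thing to watch with the two styles: insert() leaves an existing entry untouched, while X[a] += b updates it. The sketch below is one way to get the accumulate behaviour through insert() with a single lookup. It assumes X[a] += b is the semantics wanted; the names MapType and add_or_accumulate are illustrative and do not come from the thread.

```cpp
#include <cassert>
#include <map>
#include <utility>
#include <vector>

typedef std::map<std::vector<int>, double> MapType;

// Inserts (key, value) if the key is new; otherwise adds value to the
// stored entry. Only one tree lookup is performed in either case.
// Returns true if a new element was created.
inline bool add_or_accumulate(MapType& m, const std::vector<int>& key, double value)
{
    std::pair<MapType::iterator, bool> result =
        m.insert(MapType::value_type(key, value));
    if (!result.second) {
        // Key already existed: insert() did not modify it, so update here.
        result.first->second += value;
    }
    return result.second;
}
```

The bool in the returned pair also gives the duplicate check for free, which is the part described as time critical.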

Why are you using vector<int> as your key? Is there a fixed maximum
length for the vector? How do you create the vector? You would probably
get an improvement from a custom key class.

For each of my maps the vectors have the same length, but I want to
use different maps, e.g. one with vectors of size 3 and others with
longer or shorter ones. Before I "add" my element I call
"resize(length)" on the vector.
map<vector<int>,double> X; // vector length 3
map<vector<int>,double> Y; // vector length 5

Of course this distinction has to be made during runtime.
How much would a

hash_map<int[3],double> X;

speed that up? And how much memory would I save?

Well, the above is illegal (arrays aren't copyable), but you might try
this:

#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>

template <std::size_t Length>
class Key
{
    int m_data[Length];
public:
    static std::size_t const SIZE = Length;

    Key()
    {
        // zero-initialize all elements
        std::fill(m_data, m_data + SIZE, 0);
    }

    int& operator[](std::size_t i)
    {
        assert(i < SIZE);
        return m_data[i];
    }

    int const& operator[](std::size_t i) const
    {
        assert(i < SIZE);
        return m_data[i];
    }

    // strict weak ordering, required by std::map
    friend bool operator<(Key const& lhs, Key const& rhs)
    {
        return std::lexicographical_compare(
            lhs.m_data, lhs.m_data + SIZE,
            rhs.m_data, rhs.m_data + SIZE);
    }

    // equality, needed later if the key goes into a hash_map
    friend bool operator==(Key const& lhs, Key const& rhs)
    {
        return std::equal(
            lhs.m_data, lhs.m_data + SIZE,
            rhs.m_data);
    }

    //add anything required for convenience.
};

std::map<Key<3>, double> m;

That would add a pretty big speedup at insert time (in terms of
constant factor), and an even bigger memory saving. Key<3> takes up
12-16 bytes whereas std::vector<int> takes up 16 bytes before you even
consider the contents of the vector (another 16 at least, if not more
with allocation overhead). Using the above should roughly halve the
memory requirements and provide a major speed increase.
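
The size argument can be checked on any given platform with sizeof. This is a small sketch only: FixedKey3 is a hypothetical stand-in for Key<3>, and the size of std::vector<int> depends on the implementation, so no particular byte count is claimed.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for Key<3>: three ints stored inline, no heap use.
struct FixedKey3
{
    int data[3];
};

// Bytes occupied by the fixed-size key object itself.
inline std::size_t fixed_key_bytes()
{
    return sizeof(FixedKey3);
}

// Bytes of the vector object plus the heap block holding n ints.
// Allocator bookkeeping and spare capacity are not counted, so the
// real cost is at least this much.
inline std::size_t vector_key_bytes_lower_bound(std::size_t n)
{
    return sizeof(std::vector<int>) + n * sizeof(int);
}
```

Printing both values for n = 3 on the target machine shows how much of the per-entry cost comes from the key alone.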

How do I define one for vectors?

For your vectors you could have something like:

#include <numeric>

std::size_t hash(std::vector<int> const& v)
{
    return std::accumulate(v.begin(), v.end(), std::size_t(0));
}

That's a bad hashing algorithm (any two vectors whose elements have the
same sum collide), so you may want to read up on hashing algorithms.
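
As an illustration of something less collision-prone, here is a minimal sketch of an order-sensitive hash functor. It assumes a C++11 compiler and uses std::unordered_map, which is not available in g++ 3.2.2; with that compiler the same functor shape would presumably be passed as the hash argument of hash_map instead. The names VectorIntHash and HashedMap are made up for this example, and the mixing step is borrowed from the style of boost::hash_combine rather than taken from the thread.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_map>
#include <vector>

// Order-sensitive hash for a vector of ints. Unlike a plain sum,
// {1, 2, 3} and {3, 2, 1} generally produce different hash values.
struct VectorIntHash
{
    std::size_t operator()(const std::vector<int>& v) const
    {
        std::size_t seed = v.size();
        for (std::size_t i = 0; i < v.size(); ++i) {
            // Mixing step in the style of boost::hash_combine.
            seed ^= std::hash<int>()(v[i]) + 0x9e3779b9u
                    + (seed << 6) + (seed >> 2);
        }
        return seed;
    }
};

// Equality comes from std::vector's own operator==.
typedef std::unordered_map<std::vector<int>, double, VectorIntHash> HashedMap;
```

With this typedef, X[a] += b works as it does with std::map, but the elements are no longer visited in sorted order when iterating.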

Well a suggestion might be useful ;-)
Again: The time critical part is insertion and therefore checking
for duplicate ones. In maybe 50% of the cases the next insertion
"might" be "close" to the previous one so this might help to speed it
up.
After that I only retrieve all pairs from the beginning to the end
which should be of complexity O(1) using the iterators.

Iteration over the whole container is O(n). Inserting n elements is
O(n log n).

I would appreciate if I could use as much of the standard container
functions as possible and avoid writing new templates and classes.

The standard containers are building blocks and in many cases aren't
the optimal solution. If your problem definitely requires these
multi-int keys and performance isn't good enough using the standard
approach, then first trying the Key class and then trying the Key
class in a hash_map (writing a hash function for Key) is going to work
best.
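
The second step (a fixed-length key in a hash container) might look roughly like this. It is a sketch under assumptions: IntKey is a cut-down, hypothetical variant of the Key class above, IntKeyHash and Key3Map are invented names, the multiply-by-31 hash is only a simple placeholder, and std::unordered_map (C++11) stands in for hash_map.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <unordered_map>

// Hypothetical fixed-length key holding N ints inline.
template <std::size_t N>
struct IntKey
{
    int data[N];

    IntKey() { std::fill(data, data + N, 0); }

    friend bool operator==(const IntKey& lhs, const IntKey& rhs)
    {
        return std::equal(lhs.data, lhs.data + N, rhs.data);
    }
};

// Hash functor: a hash container needs this plus operator==,
// but not operator<.
template <std::size_t N>
struct IntKeyHash
{
    std::size_t operator()(const IntKey<N>& k) const
    {
        std::size_t h = 0;
        for (std::size_t i = 0; i < N; ++i) {
            // Multiply-and-add keeps the result dependent on element order.
            h = h * 31u + static_cast<std::size_t>(k.data[i]);
        }
        return h;
    }
};

typedef std::unordered_map<IntKey<3>, double, IntKeyHash<3> > Key3Map;
```

Since the length is a template parameter, the runtime choice between lengths 3, 5 and so on still has to be made by the calling code, for example by dispatching to a function template instantiated for each length.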

Tom

C++ FAQ: http://www.parashift.com/c++-faq-lite/
C FAQ: http://www.eskimo.com/~scs/C-faq/top.html

Paul Dubuc · Jan 13, 2004

Daniel said:
Hello,

I used a unique associative container such as:
//---------------
map<vector<int>,double> X;
//--------------- ....
Unfortunately the used "map" is too slow.
I need a faster implementation.

I found hash_map as an alternative, but it does not work
by simply replacing "map" with "hash_map".

Can anybody help me out?
So is hash_map part of the STL?
If not, where can I get a good implementation of it?

I use g++ 3.2.2 on linux.

-- thanks, daniel

With this compiler hash_map is in the __gnu_cxx namespace, not in the std
namespace. hash_map isn't part of the C++ standard. Use

__gnu_cxx::hash_map<vector<int>,double> X;

Of course, you'll need to define a hash function for your key. See
http://www.sgi.com/tech/stl/hash_map.html for details.

Paul Dubuc
