V
Vinodh
I am reading about hashing techniques. The map data structure
available in C++ STL uses hashing techniques?
available in C++ STL uses hashing techniques?
Vinodh said:I am reading about hashing techniques. The map data structure
available in C++ STL uses hashing techniques?
I am reading about hashing techniques. The map data structure
available in C++ STL uses hashing techniques?
[email protected] says... said:For information, this is because of requirements:
A STL map is required to have a worse case find() of O log(N).
Balanced trees give you that. Although hash maps are typically faster
than balanced tree on large amount of random data, they do not
guarantee a worse case scenario of O log(N).
TR1 (and C++ 0x) add unordered_[multi]set and unordered_[multi]map,
which are intended to be based on hashing.
I.e. TR1 did accept that hash maps are also useful and added them.
Now all we need is for the developpers to be able to make an informed
choice on the correct one to use for a particular application.![]()
For information, this is because of requirements:A STL map is required to have a worse case find() of O log(N).
Balanced trees give you that. Although hash maps are
typically faster than balanced tree on large amount of random
data, they do not guarantee a worse case scenario of O log(N).
I don't know about the "typically faster". Their average access
time is faster IF the hash function is good. Typically, I find
that it's not always that good (although the situation is
improving).
James said:I don't know about the "typically faster". Their average access
time is faster IF the hash function is good. Typically, I find
that it's not always that good (although the situation is
improving).
Question: Will the new C++ standard require for the libraries to
provide high-quality default hashing functions for internal types (and
perhaps some other standard types such as std::string), or will the user
be forced to always provide one himself?
Creating good-quality hashing functions are not a trivial task at all
(and the subject of continuing extensive research). One cannot expect
the average user to invent a good-quality function without extensive
knowledge and experience on the subject.
Pete said:Indeed. That's why I've never thought that standardizing hashed
containers was a good idea. There's just too much flexibility, making
them hard to specify well, with the result that naive users can get
horrible performance without knowing why.
But, to answer your question, no, there is no requirement for "high
quality" hashing functions. Implementations will be required to provide
hashing functions for the builtin types, pointers, std::string,
std::u16string, std::u32string, std::wstring, std::error_code, and
std::thread::id.
Question: Will the new C++ standard require for the libraries to
provide high-quality default hashing functions for internal types (and
perhaps some other standard types such as std::string), or will the user
be forced to always provide one himself?
On 2008-05-31 11:50:57 -0400, Juha Nieminen <[email protected]> said:
But, to answer your question, no, there is no requirement for
"high quality" hashing functions. Implementations will be
required to provide hashing functions for the builtin types,
pointers, std::string, std::u16string, std::u32string,
std::wstring, std::error_code, and std::thread::id.
Ok, sorry about my lack of definition for "typically":
Assuming a non-perfect hash function that return you a bucket
identifier (almost necessary since a perfect hash function
would be too expensive for it to be worthwhile), this hash
function makes assumptions on the data it receives (e.g.
random). The hash table making use of this hash function will
be very fast when presented with "typical" data (i.e. data
that generally meet the assumptions used for creating the hash
function). Unfortunately, there may be some worse case
scenario under which the performance of the hash table will be
abyssimal.
I suspect that part of the reason why the original STL had a
balanced tree, but not a hash table, is that one can sort of
expect an average programmer to get the ordering relationship
right, but you can't expect him to define a good hash function.
(There's also the point that if he gets the ordering function
wrong, there's a good chance of the program not working at all,
so he'll notice. Where as in the case of a hash function, the
program will only be a little bit slower than it should be.)
Want to reply to this thread or ask your own question?
You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.