J
Johannes Schaub (litb)
I want to learn how to create perfect hash functions for switch statements
over strings. So given
switch(str) {
case str1...
case str2...
...
case strN...
}
I want to create a perfect hash function that gives me an unique integer for
each of those strI, in the range of [0 ... M] (M should preferably be
small).
However, I can't for the life of me find any tutorial or documentation on
how to build up such a function. All I find are statements that I can use a
two-stage hashing system, and use a certain function for the first stage and
then for the second stage finding the function from a set of "universal
hashing functions" by randomly testing whether a given function will be
perfect. This sounds like "let's try randomly, and if there is no function
that works, we are just shit-outta-luck"
I have no idea what an universal hashing function really is about and
whether that is the only way of creating a perfect hash function, too: I
read about CHD, BDZ, BMZ, CHM, BRZ and FCH on the page of cmph, but I've no
clue how they relate to the big picture.
Can anyone shed some light on this, please? Is it really difficult to do a
perfect hashing?
over strings. So given
switch(str) {
case str1...
case str2...
...
case strN...
}
I want to create a perfect hash function that gives me an unique integer for
each of those strI, in the range of [0 ... M] (M should preferably be
small).
However, I can't for the life of me find any tutorial or documentation on
how to build up such a function. All I find are statements that I can use a
two-stage hashing system, and use a certain function for the first stage and
then for the second stage finding the function from a set of "universal
hashing functions" by randomly testing whether a given function will be
perfect. This sounds like "let's try randomly, and if there is no function
that works, we are just shit-outta-luck"
I have no idea what an universal hashing function really is about and
whether that is the only way of creating a perfect hash function, too: I
read about CHD, BDZ, BMZ, CHM, BRZ and FCH on the page of cmph, but I've no
clue how they relate to the big picture.
Can anyone shed some light on this, please? Is it really difficult to do a
perfect hashing?