Angus Comber said:
I want to create a lookup table where I can store string keys:
For example:
192.168.0.1 -> Purple
192.168.0.2 -> Blue
192.168.0.3 -> Red
The first field are IP addresses - but basically a text string. The
colour - eg Purple, Blue or Red is the value associated with the IP address
(sort of the key).
Ordinarily, this is the wrong newsgroup for this kind of a question.
C is too low-level of a language to do this in a simple or natural
way, and discussions of algorithms is not the focus of this group.
Only topics like casting malloc, are really covered here.
Anyhow, the solution is to build a hash-table with a target on the end
of each map. So the idea is that you would store a (ip,attribute)
tuple but use only the first tuple for the key. I'm going to assume
that what you've got here is really just a string -> string mapping.
So each tuple would be roughly:
struct aaStr2StrTuple {
struct aaStr2StrTuple * next;
unsigned int keyHash32;
char * key, * result;
};
#define HASH_KEY_SZ (65536)
struct aaStr2StrTuple * hashTable[HASH_KEY_SZ];
Then the core API would include:
int setEntryInHashTable (struct aaStr2StrTuple * h,
const char * key,
const char * result);
char * getEntryInHashTable (struct aaStr2StrTuple * h,
const char * key);
The idea would be that you would copy, then canonicallize the "key"
string. Then you would compute some kind of 48-bit hash (using a
CRC-48 or something), from which you could extract a 16-bit hash
index, and additional 32-bit verification bits (so that you can by
virtue of index seperation an one fast int comparison, perform
essentially a fairly strong 48-bit hash test, before having to drop
down to a strcmp.) Then copy the result string, and you should be
able to find/insert into the hash table fairly easily. To wipe an
entry you can so something like calling setEntryInHashTable with
result set to NULL. If you don't find an entry, then
getEntryInHashTable should return NULL.
You can do more complicated things like having a variable sized hash
index, but then you have to generate two hashes, or have redundant
bits or something like that (or suffer worse collision issues.)
Whatever -- these are just hash design issues, which you can research
from any good data structures text.