How to implement a Hash Table in C

R

Richard Heathfield

CBFalconer said:
What question?

It seems that he wishes to know what primitive data structures would be
most suitable for implementing a hash table. Elsethread, I have
answered this question already, so I won't repeat the answer here.
What's a 'ds'?

The consensus appears to be that he means "data structure".
What's "C/C++"? In c.l.c I think
we can assume the language is intended to be C.

You have answered your own question. :)
 
U

user923005

Richard Heathfield wrote:

... snip ...





That's a nice, compact, and clear explanation. I tried explaining
this to someone some time ago (1 year or more) without success. He
continued to insist that non-prime sizes were suitable.

This assumes that quadratic probing is the best overflow handler.

I have found that hash tables which are an integral power of 2 in size
have definite advantages in some situations.
To calculate the appropriate address you can simply do this
(simplified example):

struct hashtab hash_table[1u << 24];
unsigned long long mask = (1u << 24)-1;

....

struct hashtab *entry = &hash_table[current_hash & mask];

It comes up a lot in game programming because we don't worry about
overflow. If a new entry looks better than an old one, we simply
overwrite it.

And when we need to store collision entries we can either roll the egg
out of the nest (cuckoo), or use hash tables of skiplists.

Of course, every different scheme has some advantages and
disadvantages. I did not read the book in question, but it sounds
like it could have used better editing.
 
E

Eric Sosman

user923005 wrote On 08/14/07 15:01,:
[...]

I have found that hash tables which are an integral power of 2 in size
have definite advantages in some situations.
To calculate the appropriate address you can simply do this
(simplified example):

struct hashtab hash_table[1u << 24];
unsigned long long mask = (1u << 24)-1;

Just a couple side-issues, not really related to hash
tables as such:

1) On a system whose ints have 16,17,...,24 bits, the
expression `1u << 24' yields undefined behavior,
not the intended 16777216. Suggestion: Change
`1u' to `1uL' in both places, or write 0x1000000
instead.

2) The range of `unsigned long long' vastly exceeds
the desired mask value of 16777215, and its use
may impose performance penalties. (Even on systems
with 64-bit hardware, you may wind up using two
CPU registers instead of just one.) Suggestion:
Change `unsigned long long' to `unsigned long', or
even to `unsigned int' if UINT_MAX suffices.

(I'm sure user923005 is aware of these matters, but
less-experienced readers of his "simplified example" may
not be.)
 
M

Malcolm McLean

CBFalconer said:
Malcolm McLean wrote:
I haven't looked at your code, but it appears to be bug infested
from the comments here. Take a look at my hashlib package, which
has been out for some years, and I have had zero bug reports since
the first revision (and that was for a minor memory loss on
closing). Very stable, and purely standard C content. See:
There are a few glitches which will come out in edition 2.

There are also some design issues, like the avoidance of size_t.

The code is meant to illustrate the algorithm, so that people who don't know
what a hash table is can see how one works. It is not meant to be an
industrial-strength application. If you want a drop-in hashtable, I'd
recommend your package over mine. They are not meant to do the same job.
 
M

Malcolm McLean

Mark McIntyre said:
<OT>
What, you mean as opposed to the irrationality of the book itself?
</ot>
As you might expect a lot of atheists are very hostile to my religious
books. What is interesting is how similar the response is to Basic
Algorithms. They are quite different subjects, and there is no reason why
someone who agrees with me on Christiamity should see eye to eye on
programming matters. But the things said are almost identical - I regularly
get demands to withdraw the book because it doesn't contain a definitive
proof of God's existence, for instance (it doesn't claim to, it refutes 12
Common Atheist Arguments, not the same thing as proving Christianity to be
true). In case of Basic Algorithms the pretext is technical, of course, but
I think the basic motive is the same. People see a book as something
socially unacceptable.
 
U

user923005

user923005 wrote On 08/14/07 15:01,:
I have found that hash tables which are an integral power of 2 in size
have definite advantages in some situations.
To calculate the appropriate address you can simply do this
(simplified example):
struct hashtab hash_table[1u << 24];
unsigned long long mask = (1u << 24)-1;

Just a couple side-issues, not really related to hash
tables as such:

1) On a system whose ints have 16,17,...,24 bits, the
expression `1u << 24' yields undefined behavior,
not the intended 16777216. Suggestion: Change
`1u' to `1uL' in both places, or write 0x1000000
instead.

2) The range of `unsigned long long' vastly exceeds
the desired mask value of 16777215, and its use
may impose performance penalties. (Even on systems
with 64-bit hardware, you may wind up using two
CPU registers instead of just one.) Suggestion:
Change `unsigned long long' to `unsigned long', or
even to `unsigned int' if UINT_MAX suffices.

(I'm sure user923005 is aware of these matters, but
less-experienced readers of his "simplified example" may
not be.)

Excellent observations.
Aside:
In the case of chess programs, 32 bit hash values are not large
enough.
If we compute 1 million nodes per second (not at all unusual) and
analyze for correspondence chess (e.g. 24 hour search = 86400 seconds)
there will be a huge number of collisions. With a 64 bit hash, we can
easily detect them.
We use the bottom n-bits to compute the hash, and then the top n-bits
are stored in the table as a key-check. Only if both the address and
key-check bits match do we conclude that we have found a match.
Because of the 64 squares on a chessboard, bitboard representations
are quite common and so most of the math is using 64 bit operations
anyway. Hence many modern chess program benefit mightily from 64 bit
operating systems and compilers.
 
K

Keith Thompson

Malcolm McLean said:
As you might expect a lot of atheists are very hostile to my religious
books. What is interesting is how similar the response is to Basic
Algorithms. They are quite different subjects, and there is no reason
why someone who agrees with me on Christiamity should see eye to eye
on programming matters. But the things said are almost identical - I
regularly get demands to withdraw the book because it doesn't contain
a definitive proof of God's existence, for instance (it doesn't claim
to, it refutes 12 Common Atheist Arguments, not the same thing as
proving Christianity to be true). In case of Basic Algorithms the
pretext is technical, of course, but I think the basic motive is the
same. People see a book as something socially unacceptable.

I think that's an extreme case of wishful thinking on your part.

I won't discuss your "12 Common Atheist Arguments" book here, both
because I haven't read it and because it's about as far off-topic as
anything I can imagine.

I don't believe that anybody has any objection to the idea of a book
on basic algorithms. There is absolutely nothing "socially
unacceptable" about such a book. People are objecting to the errors
in your book. If you had written a *good* book on basic algorithms,
nobody would complain.
 
M

Malcolm McLean

Keith Thompson said:
I don't believe that anybody has any objection to the idea of a book
on basic algorithms. There is absolutely nothing "socially
unacceptable" about such a book. People are objecting to the errors
in your book. If you had written a *good* book on basic algorithms,
nobody would complain.
The objection is to a regular publishing such a book.
I have written a good book. Not a single objection has been non-trivial - a
few bugs, but dropped stitches rather than deep-seated problems, and
stylistic issues, plus a detail on deterministic algorithms for prime
numbers. However they have been magnified into an assertion that I am
"unqualified to write" a book on algorithms. The technical objections,
though real issues, are obviously a pretext for something a bit deeper.

Guys, I've done it.
 
K

Kelsey Bjarnason

[snips]

As you might expect a lot of atheists are very hostile to my religious
books.

If they're on a par with your programming books, no wonder.
What is interesting is how similar the response is to Basic
Algorithms. They are quite different subjects, and there is no reason why
someone who agrees with me on Christiamity should see eye to eye on
programming matters.

Er... the two have bugger all to do with each other. I don't care if the
coder next to me is Christian, Wiccan, Atheist or what; I care whether I
can work with him and whether his code is any good.
Christianity to be true). In case of Basic Algorithms the pretext is
technical, of course, but I think the basic motive is the same.

Rejecting bogus tripe? You're probably right.
 
U

user923005

The objection is to a regular publishing such a book.

I know of regulars posting books before, and there was not a lot of
flack generated.
I have written a good book. Not a single objection has been non-trivial - a
few bugs, but dropped stitches rather than deep-seated problems, and
stylistic issues, plus a detail on deterministic algorithms for prime
numbers. However they have been magnified into an assertion that I am
"unqualified to write" a book on algorithms. The technical objections,
though real issues, are obviously a pretext for something a bit deeper.

Guys, I've done it.

I suggest Knuth's method:
Offer a reward of (say) $1 for every mistake found (but only the first
time the mistake is found).
That way, people will know that you are sincere in your desire for
correctness.

Alternatively, incorporate the feedback you receive to improve the
second edition.

I suggest that your editor was not technical enough to aid you in the
publishing process.
If the mistake list that I have seen is correct, then I guess you need
an editor with more technical ability.

Suggestion:
On your next pass {if you do another one} don't publish anything that
won't stand up to lint.
Also, where possible, do an exhaustive test of all possible inputs.
After all, if you are publishing a book, people are likely to use it
as a reference.

I am thinking about writing a book about hybrid algorithms.
If I can ever find the time.
 
B

Ben Bacarisse

Malcolm McLean said:
The objection is to a regular publishing such a book.
I have written a good book. Not a single objection has been
non-trivial

I think a useless queue[1] is rather more than trivial. It is quite
misleading about what one can do with a queue, but even at that level
of detail it is off-topic here.
- a few bugs, but dropped stitches rather than deep-seated
problems, and stylistic issues, plus a detail on deterministic
algorithms for prime numbers.

You continue to side-step the fact that your posted in c.l.c. Ask for
reviews in groups where either algorithms or teaching CS texts are
topical to see if anyone has deeper objections.

[1] Maybe "virtually useless" is better because I can think of one
rather specialised use for your non-queue.
 
B

Ben Pfaff

user923005 said:
I am thinking about writing a book about hybrid algorithms.
If I can ever find the time.

I think that I could write an entire book about implementing
linked lists in C, along the lines of GNU libavl. Not sure that
anyone would read it though.
 
U

user923005

I think that I could write an entire book about implementing
linked lists in C, along the lines of GNU libavl. Not sure that
anyone would read it though.

I promise to buy a copy.
 
K

Keith Thompson

Malcolm McLean said:
The objection is to a regular publishing such a book.

That's complete nonsense. Present one example of a book other than
yours being criticized because it was published by a regular.

Richard Heathfield is a co-author of a book on C. Why hasn't he
received the kind of abuse you've gotten here? Because his book
doesn't have the kind of blatant errors that yours does.

P.J. Plauger is a semi-regular. Even Dennis Ritchie posts here
occasionally. People have pointed out errors in their books, but
there just aren't all that many to point out.
I have written a good book. Not a single objection has been
non-trivial - a few bugs, but dropped stitches rather than deep-seated
problems, and stylistic issues, plus a detail on deterministic
algorithms for prime numbers. However they have been magnified into an
assertion that I am "unqualified to write" a book on algorithms. The
technical objections, though real issues, are obviously a pretext for
something a bit deeper.

Guys, I've done it.

People are criticizing your book because of its content. It has
nothing to do with the fact that you wrote it.

Well, that may not be entirely true. You're probably held to a higher
standard because, as a regular, we expect you to know C. But again,
the criticism is based on the actual content of the book. And if your
book had been published by an outsider, we probably wouldn't be
discussing it in the first place.
 
E

Eric Sosman

Malcolm said:
The objection is to a regular publishing such a book.
I have written a good book.

You have done no such thing. I have read good books,
and I know.
Not a single objection has been non-trivial
- a few bugs, but dropped stitches rather than deep-seated problems, and
stylistic issues, plus a detail on deterministic algorithms for prime
numbers. However they have been magnified into an assertion that I am
"unqualified to write" a book on algorithms. The technical objections,
though real issues, are obviously a pretext for something a bit deeper.
Guys, I've done it.

Code that won't compile -- "trivial?"

Code that doesn't work -- "a detail?"

Code that's inefficient -- "stylistic?"

An author who can't even tell whether the objections are
wrong or right -- "qualified?"

When you "drop stitches" on a space suit, you deserve
to fly in it.

I admit to a revulsion towards your frightful book. But it's
no pretext for anything, deeper or shallower: your frightful book
merits opprobrium. The notion that anyone would accept money from
unsuspecting dupes in exchange for this piece of rot is repugnant.
And when the author moves from "I'll accept money" to "Buy my good
book!" as in this thread...! My revulsion is redoubled, my gorge
runneth over.
 
C

CBFalconer

Eric said:
Malcolm McLean wrote:
.... snip ...

You have done no such thing. I have read good books, and I know.
... snip all further elucidation ...

I suspect you need go no further to make an enemy.
 
R

Richard Heathfield

Malcolm McLean said:
As you might expect a lot of atheists are very hostile to my religious
books.

Someone who claims that your Refutations book is irrational need not
necessarily be an atheist. I haven't read it myself, so I am commenting
only generally, but it seems to me that there are many possibilities:

Your book is rational, your critic is an atheist, your critic is wrong;
your book is irrational, your critic is an atheist, your critic is
right; your book is rational, your critic is of another faith, your
critic is wrong; your book is irrational, your critic is of another
faith, your critic is right; your book is rational, your critic is a
Christian, your critic is wrong; or your book is irrational, your
critic is a Christian, and your critic is right.

But these boil down to just two truly distinct possibilities - either
the book is irrational (in which case the criticism is correct) or it
is not (in which case the criticism is incorrect), and the faith or
otherwise of the critic is of no particular relevance.
What is interesting is how similar the response is to Basic
Algorithms. They are quite different subjects, and there is no reason
why someone who agrees with me on Christiamity should see eye to eye
on programming matters.

Indeed - and in fact other Christians might not even agree with you on
Christianity. Or they might. It's a broad church (if I may use that
expression in this context!).
But the things said are almost identical - I
regularly get demands to withdraw the book because it doesn't contain
a definitive proof of God's existence, for instance

Neither does the Bible, but I don't see anyone clamouring for it to be
withdrawn.
(it doesn't claim
to, it refutes 12 Common Atheist Arguments, not the same thing as
proving Christianity to be true).

Right. The book should be judged on its merits. If the refutations are
of poor quality, or are easily refuted themselves, or attack the wrong
arguments (e.g. arguments that are not commonly used by atheists), then
the book is a poor book. That doesn't mean it should be withdrawn,
however. The world needs horrible warnings just as much as it needs
good examples.
In case of Basic Algorithms the
pretext is technical, of course, but I think the basic motive is the
same.

The motive is technical. If your purpose is to illustrate algorithms, I
suggest that you use either pseudocode or a language you know far
better than you know C.
People see a book as something socially unacceptable.

Not in comp.lang.c they don't.
 
V

Very.Little.Gravitas.Indeed

However they have been magnified into an assertion that I am
"unqualified to write" a book on algorithms. The technical objections,
though real issues, are obviously a pretext for something a bit deeper.

That was not what the assertion was, it was you are unqualified to use
C as a language to descibe the algorithms you are describing in your
books.

C is a strictly defined language which has been around for years, not
only has it standards it has conventions. yet you ignore all those and
choose to do it your own way.

If you are using C you should do so rigorously, anyone who finds a
book with C code in it is going to try to use that code when they find
it doesn't work, it's not going to teach them anything, except they
just wasted money on a poor book.

If you choose to use your own ideas and conventions on how C should be
written then the your algorithms are lost in your personal ideas about
how C is or isn't a good language. Instead of looking at your
algorithms and thinking why they work or how they work, you look at
the code and you wonder why the standards and conventions were not
followed and why the code doesn't work.

If you don't follow any standards in C or conventions, then why use C
at all, use a pseudo code, because any advantages from using C go out
the window when you ignore the stanards and conventions that are used.
 
C

Christopher Benson-Manica

[comp.lang.c] Richard Heathfield said:
Malcolm McLean said:
(a bunch of stuff about atheism, all OT)
(a bunch of stuff about atheism, all OT)

Can we possibly take discussion of Malcolm's anti-atheism book (or
whatever it's about) someplace where it's on topic? Thanks.

I'm trying to imagine a context where this statement is reasonable,
without success.
 
R

Richard Heathfield

Christopher Benson-Manica said:
[comp.lang.c] Richard Heathfield said:
Malcolm McLean said:
(a bunch of stuff about atheism, all OT)
(a bunch of stuff about atheism, all OT)

Er, I didn't actually say anything about atheism. But yes, I know what
you mean.
Can we possibly take discussion of Malcolm's anti-atheism book (or
whatever it's about) someplace where it's on topic? Thanks.

It was more of a meta-discussion really - the intent was to encourage
Malcolm to fold my points back onto the discussion of his algorithms
book. But your point is nevertheless well-taken.
I'm trying to imagine a context where this statement is reasonable,
without success.

Likewise. Especially here in comp.lang.c, which is one of the more
literate groups on Usenet.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,432
Messages
2,571,682
Members
48,796
Latest member
Greg L.

Latest Threads

Top