Be Honest: Do you implement hashCode(), equals(), and toString() for every class you write?

C

Chris Uppal

Patricia Shanahan wrote:

[me:]
It is a functionally correct hashCode.

Only in the minimal sense that it abides by its contract. It not, however,
useable for anything; therefore it only exists (or should only exist) as a
placeholder for a method which the author assumes will not be needed. But
that's a major risk -- sooner or later someone else is likely[*] to start using
these things in HashTables, and then they get poor performance without knowing
why (or perhaps even without realising that they are getting unecessarily poor
performance). If the assumption is that the method is only a placeholder, then
it would be much safer (as I said) to throw an unchecked exception.

If any programmer /I/ worked with was in the habit of leaving these little
timebombs ticking away in the codebase, then I'd be inclined to throw a Very
Serious Wobbly.

What you think of a programmer who habitually wrote:

void toString() { return ""; }

? Yet that's a damn sight safer (and no less "reasonable") than always
returning zero from hashCode().

I assume that if Mark found poor hash table performance during
profiling,

As an aside, profiling is not a good tool for recognising or diagnosing poor
hashing functions. And, although it is a reasonable tool for confirming that a
non-too-special hash function is "good enough", even that is somewhat unsafe
unless you know that your test data is representative of real world data /in
the way it interacts with the hash/.

-- chris

[*] "likely" assuming normal operation of Sod's Law.
 
P

Patricia Shanahan

Thomas said:
That doesn't work well if the hash codes of a, b and c are often equal,
a, b and c are often permutations, or whatever.

You could take you lead from String and do something like:

@Override
public int hashCode() {
int h = a.hashCode();
h = h*31 + b.hashCode();
h = h*31 + c.hashCode();
return h;
}

That is what I usually do.

Patricia
 
C

Chris Uppal

Matt said:
My question is, what is a good first implementation of compound hashCode?
For objects that have simple fields a, b, c I tend to use

a.hashCode ^ b.hashCode ^ c.hashCode

And for compositions, accumulating hashCodes of elements over ^.

Multiplication isn't such a good idea if any of the elements has a tendency to
be zero. Xor is also not such a good idea unless it is combined with
bit-shifting to mix the bits up (and also to make the hash sensitive to the
order of elements in the collection -- which is normally, but not always, what
you want).

A couple of approaches:

(a * P + b) * P + c (where P is some prime)
((a << S) ^ b) << S) ^ c (where S is some small integer)

Either can be generalised to iterating over collections. But the second should
be modified to catch the bits as they are shifted off the high end and xor-ing
then back in at the low end, otherwise only the last few elements in the
collection will contribute to the hash. For example:

int
hash(int[] data)
{
int hash = 0
for (int i : data)
{
int wrapBit = hash >>> 31;
hash = (hash << 1) ^ i ^ wrapBit;
}
}

There are lots of possible modifications.

-- chris
 
T

Tor Iver Wilhelmsen

Danno said:
Just curious

toString(): Yes, for evey case when a bean is used in a context where
a textual representation is needed.

hashCode()/equals(): Yes, for every bean used in e.g. Hibernate
mappings, esp. composite key classes which are mandated by Hibernate
to override them.
 
R

Richard Wheeldon

Danno said:
Just curious

toString() : frequently, usually for debug information and for objects
which will be used directly in a GUI (e.g. nodes in trees, entries in
lists, comboboxes, etc.)

equals() : Usually only for business data objects, which count for a
fairly small number of classes compared to those related to the gui,
reports, database access, etc. I always implement equals when
implementing Comparable though.

hashCode() : Most of the times when I implement equals().

Fwiw, in my (rather old) copy of Sun's JDK source code I found 455
toString()s, 377 equals()s and 265 hashCode()s out of around 4000 source
files.

Richard
 
M

Mark Jeffcoat

Chris Uppal said:
As an aside, profiling is not a good tool for recognising or diagnosing poor
hashing functions. And, although it is a reasonable tool for confirming that a
non-too-special hash function is "good enough", even that is somewhat unsafe
unless you know that your test data is representative of real world data /in
the way it interacts with the hash/.


Since the constant hashing function always shows up in
the same way in the profiler (an unusually large number
of equals() comparisons, as your HashMap degrades into
a list), I think the profiler is a perfectly good tool.

It's also perfectly safe; you'll never get the wrong
answer this way.

The real win comes from two places: first, that you haven't
wasted any time or mental effort solving a problem that
wouldn't have done you any good anyway. Second, when you
do have a need for a real hash function, the performance
will be so miserable that you can't miss it, and you'll
exactly where to concentrate your effort. If you'd started
with something that's pertty good, that bright red flag
won't ever be there, and you may have a harder time spotting
your opportunity.

In my recent experience, I need a real hashCode about 5% of
the time. For the other 95%, I'd be more productive just
setting $20 bills on fire.

(Oooh: just thought of a third way the constant hash function
wins: it's really, really hard to screw it up. God help you
if you write a hashCode() that's inconsistent with equals().)
 
R

richardsosborn

Not only that but test harness, junit class, class diagram and javadocs
before ANY methods are created.

;) (not really)
 
A

AndrewMcDonagh

Mark said:
You'll know when you need equals()... most things
don't get compared, so there's usually no reason to
bother.

Once you've put in your own equals() method, you really
need to override hashCode(), too. But pre-mature optimization
is evil, so here's the one I always use pre-profiling:

public int hashCode() {
return 0;
}


I'm not kidding.

If thats your over ridden version - your application would be better
served not doing anything. At least that way you may get some
distribution of the objects within a Hash. With your current approach
your Hashes would always only contain one bucket full of every instance.

You are in effect causing a performance problem - not optimizing a slow one.
 
P

Patricia Shanahan

AndrewMcDonagh said:
If thats your over ridden version - your application would be better
served not doing anything. At least that way you may get some
distribution of the objects within a Hash. With your current approach
your Hashes would always only contain one bucket full of every instance.

Distributing the objects within the hash is very, very bad unless you
ensure that any pair of equal objects have the same hash code.
You are in effect causing a performance problem - not optimizing a slow
one.

He is not trying to optimize. He is trying to maintain program
correctness, and do optimization later, if it is needed.

Patricia
 
J

Jeffrey Schwab

AndrewMcDonagh said:
If thats your over ridden version - your application would be better
served not doing anything.

If you override equals(), you're supposed to override hashCode() as
well, to ensure that equal objects have equal hash codes.
At least that way you may get some
distribution of the objects within a Hash. With your current approach
your Hashes would always only contain one bucket full of every instance.

He knows.
You are in effect causing a performance problem - not optimizing a slow
one.

This is pre-profiling. He's effectively letting his hash use just one
bucket, trading performance (from near-constant to O(n) complexity) for
simplicity. I'm not sure I like it, but I appreciate the logic.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,070
Latest member
BiogenixGummies

Latest Threads

Top