how to define hashcode for this class?

K

Kevin

Hello!

For example, if I want to use this class as key to a hashtable, how do
we define its hashcode efficiently? I want to hash based on valueA,
ValueB, ValueC.

public class abc
{
int ValueA;
int ValueB;
int ValueC;
// other functions ...
//.....
}

A typical usage of this kind of hash is, for example, we want to hash a
3D point in a 3D space, where we need all these 3 int in order to
decide a position in 3D space. (or we can extend this to 4D, 5/5/6...)

Thanks a lot and have a nice day!
 
R

Roedy Green

For example, if I want to use this class as key to a hashtable, how do
we define its hashcode efficiently? I want to hash based on valueA,
ValueB, ValueC.

public class abc
{
int ValueA;
int ValueB;
int ValueC;
// other functions ...
//.....

see http://mindprod.com/jgloss/hashcode.html


how about valueA ^ valueB ^ valueC


--
Bush crime family lost/embezzled $3 trillion from Pentagon.
Complicit Bush-friendly media keeps mum. Rumsfeld confesses on video.
http://www.infowars.com/articles/us/mckinney_grills_rumsfeld.htm

Canadian Mind Products, Roedy Green.
See http://mindprod.com/iraq.html photos of Bush's war crimes
 
J

Jean-Baptiste Nizet

Not really an answer to your question, but if you intend to use this
class as a HashMap key, you should also implement equals, and make sure
that two equal keys has the same hashCode.
You should also make your class immutable, else someone might change
the values of a, b or c, which would change the hashCode, and thus
break the HashMap.
A typiccal immutable class is made like this:

public final class Immutable {
private int a;
private int b;
// no public field

public Immutable(int a, int b) {
this.a = a;
this.b = b;
}

public int getA() {
return a;
}

public int getB() {
return b;
}

// no setter
}

JB.
 
H

Hemal Pandya

Not really an answer to your question, but if you intend to use this
class as a HashMap key, you should also implement equals, and make sure
that two equal keys has the same hashCode.
You should also make your class immutable, else someone might change
the values of a, b or c, which would change the hashCode, and thus
break the HashMap.

Thank you for pointing this out.

I have seen many code fragments that use a HashMap as a convenient
label-value pairs -- small collections where being able to get to the
object by its key is more important then the search efficiency. In that
regard this aspect it more important then coming up with a nice hashing
function.
A typiccal immutable class is made like this:
[....]

It is not so much that the class but the fields used for computing the
hashCode have to be immutable but that are immutable. They have to be
protected from the class itself too; so I prefer to make final all
fields that participate in computing the hashCode.

Is it correct to say that to avoid errors all calls to hashCode an
object should return the same vale? Once functions anyone?
 
C

cbongior

As everyone else has pointed out, but in more generic terms, any
commutative math op on the hashable objs in question will do. To
prevent overflow and possible sideffects therein, use the '^' (xor) op.
Dont' forget the equals method

Christian
http://christian.bongiorno.org/resume.pdf
 
P

Patricia Shanahan

As everyone else has pointed out, but in more generic terms, any
commutative math op on the hashable objs in question will do. To
prevent overflow and possible sideffects therein, use the '^' (xor) op.
Dont' forget the equals method

Christian
http://christian.bongiorno.org/resume.pdf

Why commutative? I had assumed that commutivity is undesirable in a hash
code function, rather than neutral or desirable.

Symmetry can lead to an object whose fields are a permutation of another
object's fields. For example, consider a unit cube with vertices:

A=(0,0,0) B=(0,0,1) C=(0,1,0) D=(1,0,0) E=(1,1,0) F=(1,0,1) G=(0,1,1)
H=(1,1,1)

If a hash code based on the coordinates uses any commutative operation,
points B, C, and D have the same hash code, as do points E, F, and G.
(For the specific case of xor, because it is a parity count for each bit
position, there are only two hash code values for the eight points.)

Patricia
 
G

googmeister

As everyone else has pointed out, but in more generic terms, any
commutative math op on the hashable objs in question will do.

I echo Patricia's criticism. Commutativity doesn't seem desirable.
To prevent overflow and possible sideffects therein, use the '^' (xor) op.

What's wrong with overflow of ints? What side effects are you
worried about?
 
P

Patricia Shanahan

I echo Patricia's criticism. Commutativity doesn't seem desirable.




What's wrong with overflow of ints? What side effects are you
worried about?

For example, the String hashCode is a weighted sum using powers of 31 as
weights. It will overflow for most Strings. Hash tables with String keys
are rather common, and don't seem to cause any problems.

Patricia
 
C

cbongior

That's what the equals() method is for. The hascode built into object
is not well suited to uniform random probabilty. "Effective Java" by
Bloch gives a simple test to prove this out. You are right though about
symetry -- which is why the EJ book even gives you and improved hashing
algorithm (I believe it involves some simple primes). And now that you
mention it, I suppose commutivity is a problem with hashing (increase
chance of collision). I will look at the bloch method again tonight

Clearly though, I stirred up a hornets nest with the above post.
Avoiding overflow is merely precautionary wisdom. If someone can show
me that overflow in a hasing algorithm is NEVER a problem (or give a
very convincing argument) I will let my overflow worries ebb (get it:
Overflow, EBB ?).
 
T

Thomas Weidenfeller

Clearly though, I stirred up a hornets nest with the above post.
Avoiding overflow is merely precautionary wisdom. If someone can show
me that overflow in a hasing algorithm is NEVER a problem (or give a
very convincing argument) I will let my overflow worries ebb (get it:
Overflow, EBB ?).

Let's start with the behavior of the data type: Overflow behavior of
integers is well defined in the JLS. You don't get any exception, and
the result resembles the lower bits as if the operation had been
executed with a sufficiently large data type. So that shouldn't create a
new problem.

But, overflow in a hashing algorithm can of course be a problem, if the
algorithm is ill chosen, depending on the value range. E.g.

// brain-dead hash function
hash = 1;
for(int i = 0; i < value; i++) {
hash *= 2;
}

The above will give a hash code of 0 for almost all values. This is a
property of the particular algorithm (and the input). Other algorithms
have other properties, and might or might not degenerate when overflow
happens.

/Thomas
 
P

Patricia Shanahan

That's what the equals() method is for. The hascode built into object
is not well suited to uniform random probabilty...

The sole objective of hashing is to reduce the number of key equality
tests for a table look-up, so I don't understand the first quoted sentence.

The relevant parts of the hashCode contract are "However, the programmer
should be aware that producing distinct integer results for unequal
objects may improve the performance of hashtables." and "As much as is
reasonably practical, the hashCode method defined by class Object does
return distinct integers for distinct objects."

A hash table implementation can do its own manipulations to distribute
entries with different codes among its buckets, but there is nothing it
can do to avoid excessive equals() calls given a hash function that
assigns the same code to clusters of keys that are likely to exist in
the same table.
Clearly though, I stirred up a hornets nest with the above post.
Avoiding overflow is merely precautionary wisdom. If someone can show
me that overflow in a hasing algorithm is NEVER a problem (or give a
very convincing argument) I will let my overflow worries ebb (get it:
Overflow, EBB ?).

Integer overflow is not a problem, because it never causes abrupt
completion of an operation and it has no attractors, no values that
results tend to concentrate on. Any int can the the result of an
overflowed int calculation.

On the other hand, I would not like to see double overflow in a
hash code function, because the double infinities function as
attractors. Any overflow results in one of them, and an infinite
intermediate result in a chained calculation tends to lead to an
infinite final result.

[Note that I nobly resisted any puns involving "floating", "overflow",
and "ebb".]

Patricia
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,764
Messages
2,569,567
Members
45,041
Latest member
RomeoFarnh

Latest Threads

Top