equals and hascode

mark jason · Dec 18, 2010

hi
I am using colt matrix package from cern for some matrix manipulations
in my application.To provide a more convenient interface for client ,I
decided to override a matrix class (let's say BaseMatrix) as below

class MyMatrix extends cern.somecoltpackage.BaseMatrix {
public SomeValue easierInterface1(){
//return something...
}
public SomeOtherValue easierInterface2(){
//return somethingelse...
}
....
}

However,while running FindBugs over the code,I am getting this warning

MyMatrix.java:1 MyMatrix inherits equals and uses Object.hashCode()

I went through the source code of the superclass in cern package and
couldn't find any implementation of equals() or hashCode().What should
I do about this warning?Should I ignore it?Or should I implement these
methods?Please advise.
thanks,
mark.

Andreas Leitgeb · Dec 18, 2010

mark jason said:
class MyMatrix extends cern.somecoltpackage.BaseMatrix { [...] }
However,while running FindBugs over the code,I am getting this warning
MyMatrix.java:1 MyMatrix inherits equals and uses Object.hashCode()
I went through the source code of the superclass in cern package and
couldn't find any implementation of equals() or hashCode().

Could you try to run FindBugs over cern.somecoltpackage.BaseMatrix ?

I don't know "FindBugs", but the wording appears to imply a bug
already in the super class (or even higher up).

mark jason · Dec 18, 2010

Could you try to run FindBugs over cern.somecoltpackage.BaseMatrix ?

I don't know "FindBugs", but the wording appears to imply a bug
already in the super class (or even higher up).

same warning there,
/src/cern/colt/matrix/DoubleMatrix2D.java:284
cern.colt.matrix.DoubleMatrix2D defines equals and uses
Object.hashCode()

Aeris · Dec 18, 2010

mark said:
MyMatrix.java:1 MyMatrix inherits equals and uses Object.hashCode()

One of the (transitive) superclass of your class redefined Â«void
equals(Object)Â» but not Â«int hashCode()Â»
This is potentially wrong because Java doc say Â«o1.equals(o2) =>
o1.hashCode() == o2.hashCode()Â» and can lead to strange behavior (bad sort
on generic algorithm, wrong result of equals, breaked HashMap or HashSetâ€¦)

A Â«goodÂ» implementation of equals is

public boolean equals(Object obj) {
if(this == obj) {
return true;
}

if (obj == null || !(obj instanceof MyClass)) {
return false;
}

return this.hashCode() == obj.hashCode();
}

mark jason · Dec 18, 2010

One of the (transitive) superclass of your class redefined «void
equals(Object)» but not «int hashCode()»

The inheritance hierachy is

MyMatrix extends DenseDoubleMatrix2D extends DoubleMatrix2D .......

DenseDoubleMatrix2D does not define equals()
DoubleMatrix2D redefines equals() as below but not hashCode()

public boolean equals(Object obj) {
if (this == obj) return true;
if (obj == null) return false;
if (!(obj instanceof DoubleMatrix2D)) return false;
return cern.colt.matrix.linalg.Property.DEFAULT.equals(this,
(DoubleMatrix2D) obj);
}

Aeris · Dec 18, 2010

mark said:
DoubleMatrix2D redefines equals() as below but not hashCode()

Here is the mistake
And by inheritance, your custom class is also Â«buggedÂ»

Aeris · Dec 18, 2010

Chris said:
except in the extremely
rare cases where some class's instances' states are entirely reflected in
their hashCode()s.

Does it shouldn't be the case?

A good practice is to write immutable object and in case of mutable object,
I agree with you that hashcode could throw a exception to avoid collection
or map problem.

And Java doc say:

Whenever it is invoked on the same object more than once during an

execution of a Java application, the hashCode method must consistently
return the same integer

If two objects are equal according to the equals(Object) method, then

calling the hashCode method on each of the two objects must produce the same
integer result.

However, the programmer should be aware that producing distinct integer

results for unequal objects may improve the performance of hashtables.

So, in the most of the cases (immutable object), hashcode and equals MAY be
equivalent and hashCode could be computed Â«staticallyÂ» in constuctors and my
equals code is good.
In minor cases (immutable object), hashCode + equals mix is a non-sense
concept, which could lead to very very serious problems with all generic
classes (collection, map, set, generic algoâ€¦) and MUST be avoided and
disable by an exception.

Lew · Dec 18, 2010

mark said:
The inheritance hierachy is

MyMatrix extends DenseDoubleMatrix2D extends DoubleMatrix2D .......

DenseDoubleMatrix2D does not define equals()
DoubleMatrix2D redefines equals() as below but not hashCode()

public boolean equals(Object obj) {
if (this == obj) return true;
if (obj == null) return false;
if (!(obj instanceof DoubleMatrix2D)) return false;

Note that this code effectively checks twice whether 'obj == null'.

return cern.colt.matrix.linalg.Property.DEFAULT.equals(this,
(DoubleMatrix2D) obj);
}

Others have pointed out that the "colt" library itself has the bug. This
means that its matrix classes will behave strangely in hashed collections such
as 'HashSet'.

Read and study /Effective Java/ (2nd ed.) by Joshua Bloch for the rationale
and best practices around 'hashCode' and 'equals'. This book is /de rigueur/
for anyone wishing to describe themselves as a Java programmer.

Lew · Dec 18, 2010

A ï¿½goodï¿½ implementation of equals is

public boolean equals(Object obj) {
if(this == obj) {
return true;
}

if (obj == null || !(obj instanceof MyClass)) {
return false;
}

return this.hashCode() == obj.hashCode();
}

[/QUOTE]

Chris said:
That is a very, very, very, very, very, very
/bad/ implementation of equals(), except in the extremely

no exceptions

rare cases where some class's instances' states are entirely reflected in their
hashCode()s. I.e. that each unique (up to whatever notion of equality is
appropriate for that class) object has a unique hash code. That is hardly ever
going to be true.

And you should never count on it.

Plus it checks twice whether 'obj == null'.

mark jason · Dec 18, 2010

Others have pointed out that the "colt" library itself has the bug. This
means that its matrix classes will behave strangely in hashed collections such
as 'HashSet'.

thanks everyone for the replies,so if I still want to subclass the
class from colt package,should I implement the method as given below?
regards,
mark
@Override
public int hashCode() {
throw new UnsupportedOperationException();
}

Eric Sosman · Dec 18, 2010

One of the (transitive) superclass of your class redefined Â«void
equals(Object)Â» but not Â«int hashCode()Â»
This is potentially wrong because Java doc say Â«o1.equals(o2) =>
o1.hashCode() == o2.hashCode()Â» and can lead to strange behavior (bad sort
on generic algorithm, wrong result of equals, breaked HashMap or HashSetâ€¦)

A Â«goodÂ» implementation of equals is

public boolean equals(Object obj) {
if(this == obj) {
return true;
}

if (obj == null || !(obj instanceof MyClass)) {
return false;
}

return this.hashCode() == obj.hashCode();
}

This is "Â«goodÂ»" in the sense that it satisfies the requirement
that equals() agree with hashCode(), but it does so only in a trivial
sense. If this implementation were "Â«good enoughÂ»", then it could
have been a `final' method of the Object class -- but it's not, meaning
that other classes are expected to have their own ideas about equals(),
notions that take into account more information than this one does.

For example, imagine that Long used this implementation for
equals(). Then for any Long instance `val', there would be
4294967295 other Long instances `valx' such that

val1.equals(valx) && val1.longValue() != valx.longValue()

.... leading to interesting consequences like

HashMap<Long,String> map = new HashMap<Long,String>();
map.put(1L, "one");
map.put(0x100000000L, "four gig");
assert (map.size() == 1);

.... which many people would find surprising -- and disappointing.

Aeris · Dec 18, 2010

Chris said:
The reason I say that it could never have been a requirement that equality
of hash codes be enough for certainty that the objects are equal(), is
that there are only a few possible values for hashCode() -- about 4
billion (2 ** 32).

This is where I fail
I have not considered that hashcode restrict the Â«equalsÂ» possibility to
2**32 casesâ€¦

Eric Sosman · Dec 18, 2010

I'm not really sure what you are asking here ? If you are asking whether it's
a good idea for an object's state to be entirely reflected in its hash code,
then my answer is that you wouldn't ever normally have the option (see below).

I didn't recommend that /every/ mutable object should throw from hashCode(); my
point is only that it's a useful design /option/, and possibly a good one in
the case of the Colt library Matrix classes.

The idea is that mutability is only bad /if/ it affects the object's idea of
what it is equal to (and therefore also its hashCode()). Most objects are
either immutable or don't have that "changing-equality" property. In either
case they can go into hashed collections without problems.

The problems only come when the object has "changing-equality" AND it actually
changes (or is remotely likely to do so) while it is in the collection. It may
be that the collections the code uses are -- by design -- only used to hold
objects which have "settled down" and will never change again. Depending on
circumstances, it might, or might not, be worth the effort of adding code to
make it /impossible/ for the object to change once it had been added to the
collection. (For instance, making an immutable copy of the object and putting
that in the collection instead, or putting a boolean
neverChangeAgainEverEverEver
flag in the object.)

However, if the object has "changing-equality" and is normally mutable over its
whole lifecycle, then it /may/ make sense to add safety by throwing from
hashCode().

Note that this won't help with objects in non-hashed keyed
collections like TreeSet, which are based on compareTo() (or on a
Comparator) and probably don't use hashCode() at all.

Also, although the neverChangeAgainEverEverEver flag may sound
plausible, I'm not at all sure it's practical. The question arises:
When does it get set? At the first call to hashCode() or equals() or
compareTo() or some other instance method that *might* be called when
some collection gets a reference to the object? What if you put the
object in an ordinary List and then call Collections.sort() on the
List; should the object thereby lose its ability to mutate? If so,
why; if not, why not? I fear the management of the flag might turn
out to be more complicated than it's worth, giving rise to extra
methods like testEqualityWithoutFreezing(). And then you need to tell
the programmers when to use testEqualityWithoutFreezing() instead of
equals() and vice versa ... and isn't this at least as brittle as
telling them not to mutate an object while it's in a Map?

I do not foresee a day when programming languages that are powerful
enough to be useful will also be so safe they require no discipline on
the part of the programmers. See also "With Folded Hands" by Jack
Williamson.

Martin Gregorie · Dec 18, 2010

Also, although the neverChangeAgainEverEverEver flag may sound
plausible, I'm not at all sure it's practical. The question arises:
When does it get set?

As an alternative approach, how about building the 'immutable' class to
contain the all the 'immutable' attributes. These are declared as
protected and the only public methods are getters and equals(). Now
extend it with a subclass that contains all the methods and constructors
that alter the protected attributes and a freeze() method that calculates
the hash attribute and returns a reference to the 'immutable' class.

I think this would work: you'd put the object returned by freeze() into
hashed collections, etc. where it can't be changed because it has no
methods that apply changes and its attributes are protected. If you
retain a reference to the subclass you can continue to modify this
instance and even update the collection by deleting the superclass
instance and replacing it with a new instance returned by the subclass's
freeze() method.

This idea is purely a thought experiment. Is it simply too arcane to be
worth the implementation and maintenance effort and/or has it got some
other flaw I didn't spot?

Andreas Leitgeb · Dec 18, 2010

Martin Gregorie said:
As an alternative approach, how about building the 'immutable' class to
contain the all the 'immutable' attributes. These are declared as
protected and the only public methods are getters and equals(). Now
extend it with a subclass that contains all the methods and constructors
that alter the protected attributes and a freeze() method that calculates
the hash attribute and returns a reference to the 'immutable' class.

I wonder, whether you meant returning an ImmutableBaseClass typed
reference to the mutable instance, or if you meant to create a new
ImmutableBaseClass-typed instance initialized to the same relevant
member values.

Either way would be unsafe. One could still add a mutable instance
directly into the HashMap (bypassing freeze()) and then modify it.

You could do with an abstract base class, from which you derive a
mutable and an immutable subclass and have your mutable subclass's
freeze() create an immutable-subclass-typed snapshot instance.

markspace · Dec 18, 2010

thanks everyone for the replies,so if I still want to subclass the
class from colt package,should I implement the method as given below?

I think a good fix would be to contact the Cern-Colt people and get them
to fix their bug.

Failing that, grab the source code and fix it yourself. The problem
with that is they might assume that hashcode() provides identity
semantics and you'll break their code if you do change hashcode().

We'd have know exactly what
cern.colt.matrix.linalg.Property.DEFAULT.equals does to figure out
what's the right way to go here.

Mike Schilling · Dec 18, 2010

Patricia Shanahan said:
Your equals method would work if, and only if, hashCode implements a
"perfect hash". Even for types with fewer than 2**32 cases, depending on
having a perfect hash is often not practical.

Or not optimal. A class consisting of a four character string would
probably not make a perfect hash of them, even though that's easy enough,
because the normal String hash tends to have better bucket-distribution
characteristics.

Martin Gregorie · Dec 18, 2010

I wonder, whether you meant returning an ImmutableBaseClass typed
reference to the mutable instance,

That's what I meant.

Either way would be unsafe. One could still add a mutable instance
directly into the HashMap (bypassing freeze()) and then modify it.

Good point. As implied, I thought I must have missed something, so thanks
for pointing it out.

You could do with an abstract base class, from which you derive a
mutable and an immutable subclass and have your mutable subclass's
freeze() create an immutable-subclass-typed snapshot instance.

Neat.

Am I right to suppose the immutable subclass's getHash() would return a
hash value while the mutable subclass's getHash() would throw an
exception to prevent it being added to a hash collection?

Andreas Leitgeb · Dec 18, 2010

Martin Gregorie said:
Am I right to suppose the immutable subclass's getHash() would return a
hash value while the mutable subclass's getHash() would throw an
exception to prevent it being added to a hash collection?

Throwing an exception is one way.

Defining hashCode() to return some constant like that
public int hashCode() { return 42; }
would be ok, too. (actual value doesn't matter)

In the former case, you get a runtime exception on trying to add
it to a HashMap, in the latter case it will work, but always place
all the mutable instances into the same bucket. Bad performance,
but consistent/correct (regardless of how equals() is defined).

Lew · Dec 19, 2010

Aeris said:
So, in the most of the cases (immutable object), hashcode and equals MAY be
equivalent and hashCode could be computed Â«staticallyÂ» in constuctors and my
equals code is good.

Actually, 'hashCode()' and 'equals()' will not be equivalent in general,
because unequal objects always risk having the same hash code. So if by "my
equals code" you mean

'return this.hashCode() == obj.hashCode();'

this is never good. Ever. You cannot and should not assume that distinct by
'equals()' implies distinct hash codes. Bad, very bad. Very, very bad.
Very, very, very bad. Very, very, very, very, ...

In minor cases (immutable object), hashCode + equals mix is a non-sense
concept, which could lead to very very serious problems with all generic
classes (collection, map, set, generic algoâ€¦) and MUST be avoided and
disable by an exception.

It is possible to use mutable objects in collections as long as they don't
mutate whilst in the collection. It's just risky. But it is done all the
time, e.g., in JPA entities.

Subclasses and the equals method	15	Aug 10, 2008
hashCode and equals (again)	19	Apr 9, 2008
override equals and hashcode	9	Aug 9, 2007
HashMaps, hashcodes, equals, and Serialization	4	Dec 30, 2006
equals() and hashcode() not working...	6	Apr 9, 2006
TreeSet.contains and an OVERLOADED equals...works?	6	Aug 28, 2007
equals() and persitence	8	Jan 27, 2004
Improving hashCode() to match equals()	10	Dec 18, 2003

equals and hascode

mark jason

Andreas Leitgeb

mark jason

Aeris

mark jason

Aeris

Aeris

Lew

Lew

mark jason

Eric Sosman

Aeris

Eric Sosman

Martin Gregorie

Andreas Leitgeb

markspace

Mike Schilling

Martin Gregorie

Andreas Leitgeb

Lew

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads