hashCode

A

Arne Vajhøj

No, that's not true. Value-equality maps, for example, would not work if
you didn't override 'hashCode()' in the key type to match value equality
on the keys.

That was almost exactly what I answered in my first answer to
the original poster.

Then I read it as "would it be better if my class used Object's
implementation".

But after reading the clarification then I think it should be
read as "would it be better if Object used the implementation
shown".

Very different question!

Arne
 
A

Arne Vajhøj

Well, I would go as far as to say that it will perform poorly in all
cases where hashCode() is actually needed

More or less.
- and that makes it broken.

This thread started about whether it is correct. The term correct
has a very specific meaning in programming and that always return 1
code is correct.

Now you talk about broken. If broken means not good, then you are right.
If broken means not correct, then you are wrong. I suspect
that broken is not very well defined.
The whole idea of hashing is based on the fact that the hash code
_somehow_ represents the item to be hashed. If all items have the same
constant hash code there is no point in hashing at all. So while it
does work, it does not work as intended.

It disable the entire hashing functionality and a HashMap get
characteristics of ArrayList.

But the code should actually work.

Arne
 
A

Arne Vajhøj

Overriding 'hashCode()' is done for functional reasons, not performance
reasons. If you fail to override the method, you'll get incorrect
behavior, for example failing to find a collection member that is
actually present.

Correct.

But the return constant is a special case. It functions as it
should but performs very poorly.

Arne
 
A

Arne Vajhøj

Keeping hashCode as an Object method but making it useless for bucket
selection unless overridden would not be a good alternative.

A more reasonable alternative would be to have hashCode as the only
member of a HashKey interface that would be implemented by every class
whose objects are intended to be suitable for use as has keys. Those
objects that have a hashCode would still have to have a usable one, but
some classes would not implement HashKey and not have a hashCode at all.

That would avoid using the hash for classes where it does not make
much sense.

I like it.

Given that there is a gazillion lines of code out there that
stores Object, then it can not be changed.

Arne
 
A

Arne Vajhøj

Even for cases where one predicts the use will not require one or
another of the consistency practices, it is harmless to enforce them.

There are four methods a type might use to represent equality or
identity and deviations therefrom: 'equals()' of course, and
'hashCode()', 'compareTo()', and 'toString()'. There may be external
'Comparator's on that type.

All these methods on or of a type, where they exist, should be
consistent, absent an overwhelming and sound reason not to be.
Should you design anything that violates the consistency rule, then
please do both Javadoc and code-comment it properly.

I agree.

(well toString is not high on my priority list for what needs to
behave certain ways, but ...)

To make it easier to get them consistent, then I think it would
be nice with syntactic sugar.

Like:

public class Data {
@ValueId
private int iv;
@ValueId
private String sv;
// the rest of the class
}

which would cause the compiler to emit equals and hashCode methods
that are test of the @ValueId fields are equal and creates a "decent"
hash.

public class Data implements Comparable<Data> {
@ValueId
private int iv;
@ValueId
private String sv;
// the rest of the class
}

could cause the compiler to output a compareTo as well.

It is invisible code, but we already got that with the
default constructor.

And it will help make those methods consistent.

Of course explicit override should still be possible.

Arne
 
R

Robert Klemme

More or less.


This thread started about whether it is correct. The term correct
has a very specific meaning in programming and that always return 1
code is correct.

Now you talk about broken.

Actually it wasn't me who brought up the term. :)
If broken means not good, then you are right.
If broken means not correct, then you are wrong. I suspect
that broken is not very well defined.

Right. :)
It disable the entire hashing functionality and a HashMap get
characteristics of ArrayList.

An ArrayList allows multiple occurrences of the same instance - and does
not store key value pairs. That won't be the case with HashMap as
equals() (if properly implemented) will prevent that (albeit slowly, or
more correct: slower than with a proper implementation of hashCode()).
Also, a HashMap will degenerate more to a LinkedList via the chaining of
a bucket's entries.
But the code should actually work.

Yes, sort of - depending on whether O(1) is a requirement or not.
Still, the idea to implement hashCode() in Object in a way to return a
constant to avoid necessity of overriding it is crazy - especially since
you then cannot efficiently use Object as hash key - which you can today.

Kind regards

robert
 
A

Arne Vajhøj

An ArrayList allows multiple occurrences of the same instance - and does
not store key value pairs. That won't be the case with HashMap as
equals() (if properly implemented) will prevent that (albeit slowly, or
more correct: slower than with a proper implementation of hashCode()).
Also, a HashMap will degenerate more to a LinkedList via the chaining of
a bucket's entries.

I guess my statement was a bit misleading.

.... get O(1) characteristics for getting data similar to
various List implementation.

Arne
 
A

Arne Vajhøj

Yes, sort of - depending on whether O(1) is a requirement or not.

That is a rare functional requirement.

(common non-functional)
Still,
the idea to implement hashCode() in Object in a way to return a constant
to avoid necessity of overriding it is crazy - especially since you then
cannot efficiently use Object as hash key - which you can today.

Actually the more I think about the question the more logical
it sounds!

We are looking at two alternatives:
A) default functions properly but good performance requires an override
B) default gives good performance but may need an override to function
properly

#A actually sounds more the Java way to me.

Arne
 
A

Andreas Leitgeb

Arne Vajhøj said:
I guess my statement was a bit misleading.
... get O(1) characteristics for getting data similar to
various List implementation.

Still is ;-)


O(n), not O(1)
 
A

Andreas Leitgeb

Arne Vajhøj said:
We are looking at two alternatives:
A) default functions properly but good performance requires an override
B) default gives good performance but may need an override to function
properly

C) default hashCode() works perfectly well with default equals(), and only
those with a specific requirement for equality, who thus need to override
.equals(), anyway, also need to override hashCode() appropriately for
their specific equality-relation.

Oh, wait, I think, Java already does it this way...
 
A

Arne Vajhøj

C) default hashCode() works perfectly well with default equals(), and only
those with a specific requirement for equality, who thus need to override
.equals(), anyway, also need to override hashCode() appropriately for
their specific equality-relation.

That is just B in another wording.

Arne
 
D

Daniel Pitts

That is just B in another wording.

However you word it.

The truth is that equals/hashCode should probably be overridden
together, or together remain default.

I find it somewhat disappointing that there are Comparable/Comparator
interfaces, but no Hashable/Hasher interfaces.

As it turns out, usually equals doesn't work well with subclasses
anyway, so to have an external "equals" predicate makes for a cleaner
abstraction.
 
M

markspace

I find it somewhat disappointing that there are Comparable/Comparator
interfaces, but no Hashable/Hasher interfaces.


I think -- I'm not sure, but I believe -- the the ability to hash
something into a hash table (a symbol table, in some terminologies) was
considered so important and so fundamental to so many algorithms that it
was deemed intrinsic to the system. And therefore mandated for all objects.

For example, in C, one can always hash based on memory address. In Java
we don't have that option, so hashcode() takes the place of the
intrinsic property of an address.

Whereas the ability to sort or order objects was recognized as not being
intrinsic to all objects, so it was separated out. Sorting a list, or
ordering a tree, isn't something you can always do by default.

Just my two nickels here, and of course I'm guessing as to why
hashcode() is included in Object and not an interface.
 
L

Lew

I think -- I'm not sure, but I believe -- the the ability to hash
something into a hash table (a symbol table, in some terminologies) was
considered so important and so fundamental to so many algorithms that it
was deemed intrinsic to the system. And therefore mandated for all objects.

Right or wrong, that's plausible.

And a separate Hasher interface would be kludgey.
For example, in C, one can always hash based on memory address. In Java
we don't have that option, so hashCode() takes the place of the
intrinsic property of an address.

That really makes sense. We have a nearly unique code for every object, and
in practical terms one is vanishingly unlikely to get duplicate identity hashes
within any given run.
Whereas the ability to sort or order objects was recognized as not being
intrinsic to all objects, so it was separated out. Sorting a list, or
ordering a tree, isn't something you can always do by default.

Just my two nickels here, and of course I'm guessing as to why
hashcode() is included in Object and not an interface.

So to cap this topic, we have a default identity hash in the 'Object#hashCode()'
implementation that does a good job of distributing things in 'HashMap' and
the like, and matches the semantics of the default identity 'equals()'. The OP's
question as to whether one should substitute the degenerate hash gets a
resounding "NO!" One overrides 'hashCode()' for consistency with 'equals()'
exactly when the latter is overridden. If the type in question implements
'Comparable' then the 'compareTo()' method likewise should match, as should
'toString()' (in a somewhat looser sense of "match", perhaps, but not too loose).

All four methods speak to the semantics of sameness for the type in question,
and should be consistent with each other.
 
A

Arne Vajhøj

However you word it.

The truth is that equals/hashCode should probably be overridden
together, or together remain default.

It should.

It must with the hashCode of today.

Arne
 
A

Arne Vajhøj

I think -- I'm not sure, but I believe -- the the ability to hash
something into a hash table (a symbol table, in some terminologies) was
considered so important and so fundamental to so many algorithms that it
was deemed intrinsic to the system. And therefore mandated for all
objects.

Using Object as key is in my opinion rarely a good design.

And relevant more specific types defined in the Java library
could implement Hashable.
Just my two nickels here, and of course I'm guessing as to why
hashcode() is included in Object and not an interface.

It is a plausible explanation.

I think the Hashable interface would have been
good.

But the decision was made many years ago.

And .NET made the same decision!

Arne
 
M

markspace

On 8/27/2012 5:03 PM, markspace wrote:
...
...

In Java we have System.identityHashCode() which provides an address-like
hash code for any object.


I think System.identityHashCode() is the (same as the) default
implementation for Object::hashCode(), yes?

So there's a small bit of evidence in support of the idea that
Object::hashCode() is meant to mimic the idea of just hashing on address.
 
M

markspace

Yes, but it could have been written, with a slightly different
explanation, without putting hashCode() in Object.


And then you get into situations like this:

if( object instanceof Hasher )
hash = ((Hasher) o).hashCode();
else
hash = System.identityHashCode( object );

And I think we all know to use polymorphism instead here. This is kind
of what I'm saying. The usefulness of a hash code was consider
intrinsic to any object, and they wanted to avoid the code above.
Therefore, Object::hashCode() becomes the design spec.

I don't think you need such indirect evidence:

"As much as is reasonably practical, the hashCode method defined by
class Object does return distinct integers for distinct objects. (This
is typically implemented by converting the internal address of the
object into an integer, but this implementation technique is not
required by the JavaTM programming language.) "


It doesn't really inform the reasons why hashCode() is part of Object
though. On that front, I'm speculating (i.e., making stuff up).
 
L

Lew

Patricia said:
markspace wrote:

...


In Java we have System.identityHashCode() which provides an address-like
hash code for any object.

Which is simply a wrapper method for the 'Object#hashCode()' method.

"Returns the same hash code for the given object as would be returned by the
default method hashCode(), whether or not the given object's class overrides
hashCode()."
 
L

Lew

markspace said:
I think System.identityHashCode() is the (same as the) default
implementation for Object::hashCode(), yes?

Why wonder when it's in the Javadocs?

Hm?

It is, in fact, required to be.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,431
Messages
2,571,677
Members
48,796
Latest member
Greg L.

Latest Threads

Top