hashCode

Arne Vajhøj

package deal;
public class ToBeHashed {
    private int noGettersForMeThanks;
    ...
}

package express;
Hasher<ToBeHashed> hasher = new Hasher<ToBeHashed>() {
    public int hashCode(ToBeHashed obj) {
        // How do you obtain the value of
        // obj.noGettersForMeThanks
        // if that would be useful?
    }
    ...
}

As an example of why a hasher might want access to a strictly-private
field, I offered String: How could a Hasher<String> (outside String
itself) use String's private `hash' element? (And before you say
"Give String a getHash() method," ponder what hashCode() does.)

That is a very good point.

We could argue that equals should depend only on retrievable
information, and so should hash.

But as you point out then there is the issue of caching
of hash values.

Arne
 
Daniele Futtorovic

package deal;
public class ToBeHashed {
    private int noGettersForMeThanks;
    ...
}

package express;
Hasher<ToBeHashed> hasher = new Hasher<ToBeHashed>() {
    public int hashCode(ToBeHashed obj) {
        // How do you obtain the value of
        // obj.noGettersForMeThanks
        // if that would be useful?
    }
    ...
}

As an example of why a hasher might want access to a strictly-private
field, I offered String: How could a Hasher<String> (outside String
itself) use String's private `hash' element? (And before you say
"Give String a getHash() method," ponder what hashCode() does.)

Sorry for not being clear, Eric. I meant to emphasize how IMO that one
single point you mention is the decisive argument that settles the issue.

Imposing that classes should expose all information "relevant" (says
who??) to hashing is utter rubbish IMNSHO.
 
Jim Janney

Eric Sosman said:
package deal;
public class ToBeHashed {
    private int noGettersForMeThanks;
    ...
}

package express;
Hasher<ToBeHashed> hasher = new Hasher<ToBeHashed>() {
    public int hashCode(ToBeHashed obj) {
        // How do you obtain the value of
        // obj.noGettersForMeThanks
        // if that would be useful?
    }
    ...
}

As an example of why a hasher might want access to a strictly-private
field, I offered String: How could a Hasher<String> (outside String
itself) use String's private `hash' element? (And before you say
"Give String a getHash() method," ponder what hashCode() does.)

I have, and I don't see a problem. String can define hashCode(), and
hashCodeIgnoreCase(), and hashCode(Locale), if those turn out to be
useful things to do for that particular class (String is an exceptional
class in several ways). The objection is to building hashCode() into
the Object hierarchy and hard-coding its use in HashMap. A separate
interface would allow more flexibility.
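
A minimal sketch of what such a separate interface might look like, assuming a hypothetical Hasher<T> that carries both the hash and the matching equality test. Nothing here exists in the JDK; the names StringHashers, CASE_SENSITIVE and IGNORE_CASE are made up for illustration.

```java
/** Hypothetical strategy interface discussed in the thread; not part of the JDK. */
interface Hasher<T> {
    int hashCode(T obj);
    boolean equals(T a, T b);
}

/** Illustrative holder for two String hashers (names are assumptions). */
final class StringHashers {
    private StringHashers() {}

    /** Delegates to String itself, so String's own cached hash is still used. */
    static final Hasher<String> CASE_SENSITIVE = new Hasher<String>() {
        @Override public int hashCode(String s) { return s.hashCode(); }
        @Override public boolean equals(String a, String b) { return a.equals(b); }
    };

    /** Hash that is consistent with String.equalsIgnoreCase. */
    static final Hasher<String> IGNORE_CASE = new Hasher<String>() {
        @Override public int hashCode(String s) {
            int h = 0;
            for (int i = 0; i < s.length(); i++) {
                // Fold each char the way equalsIgnoreCase compares them:
                // upper-case first, then lower-case.
                char folded = Character.toLowerCase(Character.toUpperCase(s.charAt(i)));
                h = 31 * h + folded;
            }
            return h;
        }
        @Override public boolean equals(String a, String b) { return a.equalsIgnoreCase(b); }
    };
}
```

With this shape, the same key type can be hashed in more than one way, and each hasher keeps its hash and its notion of equality side by side.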
 
Jim Janney

Daniele Futtorovic said:
Sorry for not being clear, Eric. I meant to emphasize how IMO that one
single point you mention is the decisive argument that settles the issue.

Imposing that classes should expose all information "relevant" (says
who??) to hashing is utter rubbish IMNSHO.

Objects that compare equal must hash to the same value. It follows that
if the hash function uses a value, so must the comparison method.
Relying on non-public information in the comparison method means
allowing two objects to compare non-equal even though there is no way,
using public information, to distinguish between them. I can honestly
say I've never done that or felt a need to do that.
 
Eric Sosman

Daniele Futtorovic said:
On 8/30/2012 6:52 PM, Daniele Futtorovic wrote:
[...]
As an example of why a hasher might want access to a strictly-private
field, I offered String: [...]

Imposing that classes should expose all information "relevant" (says
who??) to hashing is utter rubbish IMNSHO.

Objects that compare equal must hash to the same value. It follows that
if the hash function uses a value, so must the comparison method.

Since java.lang.String had already been mentioned, it's sort
of too bad you didn't look at it before posting. Had you done so,
you'd have found that [1] hashCode() uses the private field `hash'
and [2] equals() does not.
 
Robert Klemme

Daniele Futtorovic said:
On 31/08/2012 03:43, Eric Sosman allegedly wrote:
On 8/30/2012 6:52 PM, Daniele Futtorovic wrote:
[...]
As an example of why a hasher might want access to a strictly-private
field, I offered String: [...]

Imposing that classes should expose all information "relevant" (says
who??) to hashing is utter rubbish IMNSHO.

Objects that compare equal must hash to the same value. It follows that
if the hash function uses a value, so must the comparison method.

Since java.lang.String had already been mentioned, it's sort
of too bad you didn't look at it before posting. Had you done so,
you'd have found that [1] hashCode() uses the private field `hash'
and [2] equals() does not.

Well, Jim's wording may not be correct to the last bit but the message
is still true: String's hash member just caches the hash code derived
from the characters of the String. And obviously equals() must compare
the characters. So basically both rely on the same underlying data.
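
As a rough illustration of that point, here is a sketch of the caching idiom on a hypothetical immutable class named Label. It is not String's actual source, only the same general shape: equals() looks at the characters, while hashCode() looks at the characters once and afterwards at a private cache.

```java
import java.util.Arrays;

/** Hypothetical immutable class; a sketch of the idiom, not String's real code. */
final class Label {
    private final char[] chars;
    private int hash; // cache; 0 means "not computed yet"

    Label(String text) {
        this.chars = text.toCharArray();
    }

    /** Equality depends only on the characters, never on the cache field. */
    @Override public boolean equals(Object o) {
        if (this == o) return true;
        if (!(o instanceof Label)) return false;
        return Arrays.equals(chars, ((Label) o).chars);
    }

    /** The hash is derived from the same characters, then remembered. */
    @Override public int hashCode() {
        int h = hash;
        if (h == 0 && chars.length > 0) {
            for (char c : chars) {
                h = 31 * h + c;
            }
            hash = h; // a computed value of 0 is simply recomputed next time
        }
        return h;
    }
}
```

The cache field changes how fast the answer arrives, not what the answer is, so equal objects still hash alike.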

Cheers

robert
 
Jim Janney

Eric Sosman said:
Daniele Futtorovic said:
On 31/08/2012 03:43, Eric Sosman allegedly wrote:
On 8/30/2012 6:52 PM, Daniele Futtorovic wrote:
[...]
As an example of why a hasher might want access to a strictly-private
field, I offered String: [...]

Imposing that classes should expose all information "relevant" (says
who??) to hashing is utter rubbish IMNSHO.

Objects that compare equal must hash to the same value. It follows that
if the hash function uses a value, so must the comparison method.

Since java.lang.String had already been mentioned, it's sort
of too bad you didn't look at it before posting. Had you done so,
you'd have found that [1] hashCode() uses the private field `hash'
and [2] equals() does not.

If you want to play gotcha, it's sort of too bad you don't read this
newsgroup more often. I pointed out the use of the private field some
time ago, when we were discussing immutable classes.

If you'd rather argue on the actual issues, a cached result is not extra
information. The hashCode method in String doesn't return anything that
can't be computed from publicly available information.
 
Eric Sosman

Eric Sosman said:
Objects that compare equal must hash to the same value. It follows that
if the hash function uses a value, so must the comparison method.

Since java.lang.String had already been mentioned, it's sort
of too bad you didn't look at it before posting. Had you done so,
you'd have found that [1] hashCode() uses the private field `hash'
and [2] equals() does not.

If you want to play gotcha, it's sort of too bad you don't read this
newsgroup more often. I pointed out the use of the private field some
time ago, when we were discussing immutable classes.

If you'd rather argue on the actual issues, a cached result is not extra
information. The hashCode method in String doesn't return anything that
can't be computed from publicly available information.

Right. Except hashCode() can arrange to compute the value
just once, while an external Hasher without access to the private
field would need to recompute it every single time. I never said
a Hasher could not compute a perfectly good hash code (for a sane
class, at any rate), just that it would have to forgo benefits
that are available to an internal hashCode() method.
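
A small sketch of the difference, under the assumption of a hypothetical class Name that exposes its characters publicly but keeps its hash cache private. The counters internalScans and scans are demo-only instrumentation, and ExternalNameHasher is an invented name.

```java
/** Hypothetical value class with public read access to its characters only. */
final class Name {
    private final String text;
    private int hash;             // cached hash, private to Name
    private boolean hashComputed; // explicit flag, so a hash of 0 is cached too
    int internalScans;            // demo-only: counts full character scans

    Name(String text) { this.text = text; }

    int length() { return text.length(); }
    char charAt(int i) { return text.charAt(i); }

    /** Internal hash: scans the characters on the first call only. */
    @Override public int hashCode() {
        if (!hashComputed) {
            internalScans++;
            hash = ExternalNameHasher.scan(this);
            hashComputed = true;
        }
        return hash;
    }

    @Override public boolean equals(Object o) {
        return o instanceof Name && text.equals(((Name) o).text);
    }
}

/** External hasher: sees only Name's public accessors, so it cannot reach the cache. */
final class ExternalNameHasher {
    int scans; // demo-only: counts full character scans

    int hashCode(Name n) {
        scans++;
        return scan(n);
    }

    /** 31-based polynomial over the publicly visible characters. */
    static int scan(Name n) {
        int h = 0;
        for (int i = 0; i < n.length(); i++) {
            h = 31 * h + n.charAt(i);
        }
        return h;
    }
}
```

Hashing the same Name three times both ways yields identical values, but the internal method scans the characters once and the external hasher scans them three times.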

Are those benefits worth while? Sometimes yes, sometimes no.
Sun apparently believed that they were in fact worth while in the
case of java.lang.String (indeed, Bloch says String's hashCode()
received a lot of attention and went through multiple generations).
If you think caching String's hash is not worth the effort, I
encourage you to experiment with a variant rt.jar that omits the
cache, run some timings, and report the results.

To the other issue, about whether HashMap et al. should
have a constructor taking an externally-supplied Hasher as an
alternative to using the key's own hashCode() -- well, HashMap
is not a final class. Have at it!
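
One way such a map could be sketched, assuming composition rather than subclassing: each key is wrapped in an object whose equals() and hashCode() defer to the supplied hasher, and an ordinary HashMap does the rest. HasherMap, its nested Hasher interface and the Key wrapper are all hypothetical names, and only put, get and size are shown.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical map that hashes keys with an externally supplied Hasher. */
final class HasherMap<K, V> {

    /** Hypothetical strategy interface; not part of the JDK. */
    interface Hasher<T> {
        int hashCode(T obj);
        boolean equals(T a, T b);
    }

    /** Wrapper that routes equals/hashCode through the hasher. */
    private static final class Key<T> {
        final T value;
        final Hasher<T> hasher;

        Key(T value, Hasher<T> hasher) {
            this.value = value;
            this.hasher = hasher;
        }

        @Override public int hashCode() {
            return hasher.hashCode(value);
        }

        @Override public boolean equals(Object o) {
            if (!(o instanceof Key)) return false;
            // Safe within one map: every Key in it wraps a K with the same hasher.
            @SuppressWarnings("unchecked")
            Key<T> other = (Key<T>) o;
            return hasher.equals(value, other.value);
        }
    }

    private final Hasher<K> hasher;
    private final Map<Key<K>, V> delegate = new HashMap<Key<K>, V>();

    HasherMap(Hasher<K> hasher) {
        this.hasher = hasher;
    }

    V put(K key, V value) {
        return delegate.put(new Key<K>(key, hasher), value);
    }

    V get(K key) {
        return delegate.get(new Key<K>(key, hasher));
    }

    int size() {
        return delegate.size();
    }
}
```

The cost of this approach is one wrapper allocation per lookup; a map written from scratch around the hasher could avoid that.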

(Nor, by the way, do I deny the potential utility of such
a hashed map. It could, for example, have been used to get
around the inadequacies of String.hashCode() in early Java ...)
 
Jim Janney

Eric Sosman said:
Eric Sosman said:
Objects that compare equal must hash to the same value. It follows that
if the hash function uses a value, so must the comparison method.

Since java.lang.String had already been mentioned, it's sort
of too bad you didn't look at it before posting. Had you done so,
you'd have found that [1] hashCode() uses the private field `hash'
and [2] equals() does not.

If you want to play gotcha, it's sort of too bad you don't read this
newsgroup more often. I pointed out the use of the private field some
time ago, when we were discussing immutable classes.

If you'd rather argue on the actual issues, a cached result is not extra
information. The hashCode method in String doesn't return anything that
can't be computed from publicly available information.

Right. Except hashCode() can arrange to compute the value
just once, while an external Hasher without access to the private
field would need to recompute it every single time. I never said
a Hasher could not compute a perfectly good hash code (for a sane
class, at any rate), just that it would have to forgo benefits
that are available to an internal hashCode() method.

Are those benefits worth while? Sometimes yes, sometimes no.
Sun apparently believed that they were in fact worth while in the
case of java.lang.String (indeed, Bloch says String's hashCode()
received a lot of attention and went through multiple generations).
If you think caching String's hash is not worth the effort, I
encourage you to experiment with a variant rt.jar that omits the
cache, run some timings, and report the results.

I certainly think that the performance gained from caching matters. But
your objection is without substance: once again, there is no reason why
String can't define a hashCode method for instances of Hasher to call.
No need to forgo anything.

And String is very much an unusual case: it's final, immutable, and
frequently used as a hash key. Not many other classes will fall into
that category.
 
Lew

On 8/31/2012 11:08 AM, Jim Janney wrote:
On 31/08/2012 03:43, Eric Sosman allegedly wrote:
On 8/30/2012 6:52 PM, Daniele Futtorovic wrote:
[...]
As an example of why a hasher might want access to a strictly-private
field, I offered String: [...]

Imposing that classes should expose all information "relevant" (says
who??) to hashing is utter rubbish IMNSHO.

Objects that compare equal must hash to the same value. It follows that
if the hash function uses a value, so must the comparison method.

Since java.lang.String had already been mentioned, it's sort
of too bad you didn't look at it before posting. Had you done so,
you'd have found that [1] hashCode() uses the private field `hash'
and [2] equals() does not.

If you want to play gotcha, it's sort of too bad you don't read this
newsgroup more often. I pointed out the use of the private field some
time ago, when we were discussing immutable classes.

If you'd rather argue on the actual issues, a cached result is not extra
information. The hashCode method in String doesn't return anything that
can't be computed from publicly available information.

But it does impose a need for the hash function to access a private member
so that it can return that value faster. Denying access to the private member
would kill that optimization, one that any type of immutable object might want
to use.

That is the motivation for requiring access to private members.

Your anger is misplaced, as your recent post rejects that scenario, implying
that the only reason to use private members in 'hashCode'-alikes is for
calculation of the value.
 
Jim Janney

Lew said:
On 8/31/2012 11:08 AM, Jim Janney wrote:
On 31/08/2012 03:43, Eric Sosman allegedly wrote:
On 8/30/2012 6:52 PM, Daniele Futtorovic wrote:

As an example of why a hasher might want access to a strictly-private
field, I offered String: [...]

Imposing that classes should expose all information "relevant" (says
who??) to hashing is utter rubbish IMNSHO.

Objects that compare equal must hash to the same value. It follows that
if the hash function uses a value, so must the comparison method.

Since java.lang.String had already been mentioned, it's sort
of too bad you didn't look at it before posting. Had you done so,
you'd have found that [1] hashCode() uses the private field `hash'
and [2] equals() does not.

If you want to play gotcha, it's sort of too bad you don't read this
newsgroup more often. I pointed out the use of the private field some
time ago, when we were discussing immutable classes.

If you'd rather argue on the actual issues, a cached result is not extra
information. The hashCode method in String doesn't return anything that
can't be computed from publicly available information.

But it does impose a need for the hash function to access a private member
so that it can return that value faster. Denying access to the private member
would kill that optimization, one that any type of immutable object might want
to use.

That is the motivation for requiring access to private members.

Your anger is misplaced, as your recent post rejects that scenario, implying
that the only reason to use private members in 'hashCode'-alikes is for
calculation of the value.

Once again: once again, there is no reason why String can't define a
hashCode method for instances of Hasher to call. No need to forgo
anything.
 
Lew

Jim said:
Once again: once again, there is no reason why String can't define a
hashCode method for instances of Hasher to call. No need to forgo
anything.

And, indeed, once again once again, 'String' has already provided such a method.

I don't see what the big to-do is about 'Object' having 'hashCode()' defined. It's
just fine as an address proxy in the default implementation, never mind its use
for hashing, and the override to provide value equality is exactly the effort needed
for classes that need a "real" hash, only at least it's in the very class whose logic
it is, rather than artificially separated into a separate 'Hasher' type.

So once again once again, the existing mechanism shows itself to be at worst
not much worse than the proposal.

Also, once again once again, the hash must match the idea of equality for the
type. I don't notice anyone arguing (yet) that 'equals()' should not be present
in 'Object'. Best practices once again once again promote keeping 'equals()',
'hashCode()', 'compareTo()' (where present) and 'toString()' consistent with
each other.

So once again once again once again once again we see the status quo
being pretty much equivalent once again once again to the proposal
once again.
 
Jim Janney

Lew said:
And, indeed, once again once again, 'String' has already provided such a method.

I don't see what the big to-do is about 'Object' having 'hashCode()' defined. It's
just fine as an address proxy in the default implementation, never mind its use
for hashing, and the override to provide value equality is exactly the effort needed
for classes that need a "real" hash, only at least it's in the very class whose logic
it is, rather than artificially separated into a separate 'Hasher' type.

So once again once again, the existing mechanism shows itself to be at worst
not much worse than the proposal.

Also, once again once again, the hash must match the idea of equality for the
type. I don't notice anyone arguing (yet) that 'equals()' should not be present
in 'Object'. Best practices once again once again promote keeping 'equals()',
'hashCode()', 'compareTo()' (where present) and 'toString()' consistent with
each other.

So once again once again once again once again we see the status quo
being pretty much equivalent once again once again to the proposal
once again.

Nothing in the hypothetical design prevents String, or any other class,
from defining a hashCode method, or multiple methods, with access to
private data or native code. Objections on this basis are therefore
without basis. I'm sorry if you find this confusing, but it's really
very simple.

It's also true that the majority of hash functions should depend only
on publicly available data. This is not "rubbish" but a direct
consequence of how hashing works: objects that compare equal must hash
to the same value. In a small number of cases, using a private member
to cache the result of the hash calculation turns out, on average, to
be better than not caching it. This takes us back to the previous
point.

As you point out, the current design is not all that bad, but it's also
limiting and error prone. Limiting because a class can only have one
hash function when several might be useful: for instance, String has
equalsIgnoreCase but no hashCodeIgnoreCase. If you want to hash by
object identity for a class that overrides equals() and hashCode(), you
have to use a completely different implementation of the standard hash
map. Error prone because it takes concerns that should be kept together
(hashing is a concern of the map, not the object being hashed) and
divides them into otherwise unrelated classes. If you want to override
equals() for a class, how do you know where to look for hash maps that
might be affected?
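
The identity case can be seen with the standard library as it stands. A short sketch, using the JDK's HashMap and IdentityHashMap; the class name IdentityVersusValue and the method sizes() are illustrative only.

```java
import java.util.HashMap;
import java.util.IdentityHashMap;
import java.util.Map;

/** Illustrative demo: same keys, two different notions of "same key". */
final class IdentityVersusValue {
    private IdentityVersusValue() {}

    /** Returns { size of the HashMap, size of the IdentityHashMap }. */
    static int[] sizes() {
        String a = new String("key");
        String b = new String("key"); // equal to a, but a distinct object

        // HashMap uses the key's own equals() and hashCode().
        Map<String, String> byValue = new HashMap<String, String>();
        byValue.put(a, "first");
        byValue.put(b, "second"); // replaces the entry for a

        // IdentityHashMap compares keys with == and hashes by identity.
        Map<String, String> byIdentity = new IdentityHashMap<String, String>();
        byIdentity.put(a, "first");
        byIdentity.put(b, "second"); // a separate entry

        return new int[] { byValue.size(), byIdentity.size() };
    }
}
```

Switching the hashing policy here means switching to a different map class altogether, which is the limitation being described.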
 
Daniele Futtorovic

It's also true that the majority of hash functions should depend only
on publicly available data. This is not "rubbish"

Indeed, *that* is not. It's merely questionable.

Mandating that *every* class conform to this scheme, however, is rubbish.

All this discussion is essentially an invitation to throw access control
to the four winds.
 
Jim Janney

Daniele Futtorovic said:
Indeed, *that* is not. It's merely questionable.

Mandating that *every* class conform to this scheme, however, is rubbish.

All this discussion is essentially an invitation to throw access control
to the four winds.

I was going to comment on this, but when I went to pick up another
"once again" they were fresh out of them... told me some crazy story
about a guy coming through in a big hurry and cleaning out the entire
stock. So you'll just have to imagine what I might have said :)
 
