hashCode and equals (again)

Todd

Hello,

I have spent a great deal of time reading through the postings in this
group as well as tutorials/explanations on sites elsewhere (e.g.,
Roedy's), but have not been able to get a good grasp of hashCode
and equals. I understand that most of the rules for hashCode are defined
for use of objects in maps and other comparable collections, so it is
from that POV that I am trying to get a good grasp of the concepts.

Please help if you can - especially with the SSCCE later.

1. Originally, I thought that it made sense to make an equals method
that uses hashCode as its criteria for equality. However, as I now
understand hashCode, the code _must_ be the same for equal objects,
BUT it is _possible_ to be the same for non-equal objects. Am I
stating this correctly?

2. When would one use a set of criteria to determine equality that is
different from the criteria used to generate a hashCode?

3. Why aren't the hashCode_s in the following code the same?

package hashcode;

public class Main
{
    public static void main(String[] args)
    {
        int[] a = { 1, 7, 0, 0 };
        int[] b = { 1, 7, 0, 0 };

        System.out.println( "equals: " + a.equals( b ) );
        System.out.println( "hash a: " + a.hashCode() );
        System.out.println( "hash b: " + b.hashCode() );

        Integer[] c = { 1, 7, 0, 0 };
        Integer[] d = { 1, 7, 0, 0 };

        System.out.println( "equals: " + c.equals( d ) );
        System.out.println( "hash c: " + c.hashCode() );
        System.out.println( "hash d: " + d.hashCode() );

        int[] e = a.clone();
        Integer[] f = c.clone();

        System.out.println( "equals: " + a.equals( e ) );
        System.out.println( "hash a: " + a.hashCode() );
        System.out.println( "hash e: " + e.hashCode() );
        System.out.println( "equals: " + c.equals( f ) );
        System.out.println( "hash c: " + c.hashCode() );
        System.out.println( "hash f: " + f.hashCode() );
    }
}

Thanks,
Todd
 
Todd

Question I forgot:
Should the hashCode be used as an equality criteria ~ as long as it is
combined with some other criteria?
 
Thomas Fritsch

Todd said:
Hello,

I have spent a great deal of time reading through the postings in this
group as well as tutorials/explanations on sites elsewhere (i.e.,
Roedy's, etc.), but have not been able to get a good grasp of hashCode
and equals. I understand most of the rules for hashCode are defined
for use of objects in maps and other comparable collections, so it is
from that POV that I am trying to get a good grasp of the concepts.

Please help if you can - especially the SCCE later.

1. Originally, I thought that it made sense to make an equals method
that uses hashCode as its criteria for equality. However, as I now
understand hashCode, the code _must_ be the same for equal objects,
BUT it is _possible_ to be the same for non-equal objects. Am I
stating this correctly?
The hashCode method is designed to execute *quickly*. The benefit is that
HashMaps can then quickly decide whether two objects are definitely not
equal, or with some small probability might be equal.
2. When would one use a set of criteria to determine equality that is
different from the criteria used to generate a hashCode?
(1) When an exact check of equality would take much more time than a quick
calculation of a hashCode.
(2) When the class can in principle have more than 0xFFFFFFFF different
objects. (For example the String class: there is an infinite number of
different Strings, but there are only 0xFFFFFFFF different hashCodes.)
3. Why aren't the hashCode_s in the following code the same?
It is because of the concrete hashCode implementation in class Object
(arrays always use Object's hashCode implementation). You might check this
in the JavaDoc of Object#hashCode now.
 
Hendrik Maryns

Todd wrote:
Yes. There is generally no need to make your equals() method so
complex that it depends on hashCode().

Perhaps if you are lazy and implement an easy hashCode(), but I can't think of a
good reason off the top of my head.

Arrays are objects. hashCode() for an array is the hashCode of Object,
which is simply based on its place in memory; it has nothing to do with
its contents. If you want the hashCode to depend on the objects in it,
use ArrayList or any other Collections class.

Similarly, equals() for arrays uses Object’s equals(), which is the same
as ==.
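
To illustrate the ArrayList suggestion, here is a minimal sketch. The class name ListContentDemo and the helper listOf are made up for this example; List.equals and List.hashCode are specified to depend on the elements, so two separately built lists with the same contents compare equal and hash alike.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ListContentDemo {
    // Hypothetical helper: copies the given values into a fresh ArrayList.
    static List<Integer> listOf(Integer... values) {
        return new ArrayList<>(Arrays.asList(values));
    }

    public static void main(String[] args) {
        List<Integer> c = listOf(1, 7, 0, 0);
        List<Integer> d = listOf(1, 7, 0, 0);

        // Two distinct list objects with the same elements in the same order.
        System.out.println("same object: " + (c == d));                        // false
        System.out.println("equals:      " + c.equals(d));                     // true
        System.out.println("same hash:   " + (c.hashCode() == d.hashCode()));  // true
    }
}
```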
Question I forgot:
Should the hashCode be used as an equality criteria ~ as long as it is
combined with some other criteria?

The singular is ‘criterion’: http://en.wiktionary.org/wiki/criteria
(luckily I looked that up, since I was wrong as well).

Isn’t that covered by your first question? I’d say ‘no’.

public class Whatever {
    int i, j;

    @Override public boolean equals(Object other) {
        // '!' binds tighter than instanceof, so the parentheses are required;
        // this test also rejects null.
        if (!(other instanceof Whatever)) {
            return false;
        }
        Whatever otherWE = (Whatever) other;
        return otherWE.i == i && otherWE.j == j;
    }

    @Override public int hashCode() {
        return i;
    }
}

Would be perfectly valid, although I would suggest changing hashCode to
something like

    @Override public int hashCode() {
        return i ^ j;
    }
Experts will be able to tell you more about how to combine i and j,
there are some reasons for not using ‘+’.
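
One way to see why the choice of combining operator matters: both '+' and '^' are symmetric, so swapping the two field values gives the same hash. The sketch below is only an illustration, not a recommendation from this thread; the class and method names are invented, and the multiplier 31 is the conventional one used by String.hashCode and List.hashCode.

```java
class HashCombineDemo {
    // Symmetric combination: (i, j) and (j, i) always collide.
    static int xorHash(int i, int j) {
        return i ^ j;
    }

    // Order-sensitive combination: the first field is scaled before adding.
    static int weightedHash(int i, int j) {
        return 31 * i + j;
    }

    public static void main(String[] args) {
        System.out.println(xorHash(1, 2) + " " + xorHash(2, 1));           // 3 3
        System.out.println(xorHash(5, 5) + " " + xorHash(9, 9));           // 0 0
        System.out.println(weightedHash(1, 2) + " " + weightedHash(2, 1)); // 33 63
    }
}
```

Both versions satisfy the hashCode contract; the difference is only in how often unequal objects end up with the same code.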

H.
--
Hendrik Maryns
http://tcl.sfs.uni-tuebingen.de/~hendrik/
==================
http://aouw.org
Ask smart questions, get good answers:
http://www.catb.org/~esr/faqs/smart-questions.html


 
Patricia Shanahan

Thomas Fritsch wrote:
....
(2) When the class can in principle have more than 0xFFFFFFFF different
objects. (For example the String class: there is an infinite number of
different Strings, but there are only 0xFFFFFFFF different hashCodes.)
....

@enable_pedantry

There are 0x100000000 distinct hash code values, Integer.MIN_VALUE through
Integer.MAX_VALUE. This means, for example, that Integer can use its own
int value as hash code.

The finite number of char values, combined with the use of an int to
represent the length of a String, sets a very large but finite limit on
the number of different Strings.

@disable_pedantry

Patricia
 
Patricia Shanahan

Todd wrote:
....
Should the hashCode be used as an equality criteria ~ as long as it is
combined with some other criteria?
....

I don't think so. You are going to have to do all the equals tests
anyway, so why bother calculating the hashCode as well?

However, they should be related. I generally begin by selecting a set of
fields that I consider to be the essence of the value of the object.
Those fields are used both in the equals test and in the hashCode
calculation.
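
A minimal sketch of that approach, assuming a made-up Account class whose "essence" is its id and owner, while lastAccessed is bookkeeping that takes no part in equality. All of these names are hypothetical; Objects.equals and Objects.hash are the standard java.util helpers.

```java
import java.util.Objects;

final class Account {
    private final long id;        // essential field
    private final String owner;   // essential field
    private long lastAccessed;    // not part of the object's value

    Account(long id, String owner, long lastAccessed) {
        this.id = id;
        this.owner = owner;
        this.lastAccessed = lastAccessed;
    }

    @Override public boolean equals(Object other) {
        if (this == other) return true;
        if (!(other instanceof Account)) return false;
        Account that = (Account) other;
        // Same fields as in hashCode() below.
        return id == that.id && Objects.equals(owner, that.owner);
    }

    @Override public int hashCode() {
        // Same fields as in equals() above, so equal objects get equal codes.
        return Objects.hash(id, owner);
    }

    public static void main(String[] args) {
        Account x = new Account(42L, "Todd", 1000L);
        Account y = new Account(42L, "Todd", 2000L);
        System.out.println(x.equals(y));                   // true
        System.out.println(x.hashCode() == y.hashCode());  // true
    }
}
```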

Patricia
 
Arved Sandstrom

Todd said:
Hello,

I have spent a great deal of time reading through the postings in this
group as well as tutorials/explanations on sites elsewhere (i.e.,
Roedy's, etc.), but have not been able to get a good grasp of hashCode
and equals. I understand most of the rules for hashCode are defined
for use of objects in maps and other comparable collections, so it is
from that POV that I am trying to get a good grasp of the concepts.

Please help if you can - especially the SCCE later.

1. Originally, I thought that it made sense to make an equals method
that uses hashCode as its criteria for equality. However, as I now
understand hashCode, the code _must_ be the same for equal objects,
BUT it is _possible_ to be the same for non-equal objects. Am I
stating this correctly?

2. When would one use a set of criteria to determine equality that is
different from the criteria used to generate a hashCode?

Usually you wouldn't. The significant fields that you pick to define
equality are the ones that semantically describe the identity of an object.
Once you've picked those fields for equals(), in order to satisfy the
contracts you'll pick from a subset of those fields to implement hashCode(),
bearing in mind that a set is a subset of itself. Because with a hashCode
you'd like to maximize the variability, it's in your interests to use as
many of the significant fields (used in equals()) as you can.
3. Why aren't the hashCode_s in the following code the same?
[ SNIP ]

Use the Arrays.equals() and Arrays.hashCode() methods.
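
For example, applied to the arrays from the original post (a small sketch; the class name ArrayContentDemo and the helper sameContents are made up here, while Arrays.equals and Arrays.hashCode are the standard java.util.Arrays methods):

```java
import java.util.Arrays;

class ArrayContentDemo {
    // Content-based comparison of two int arrays.
    static boolean sameContents(int[] x, int[] y) {
        return Arrays.equals(x, y);
    }

    public static void main(String[] args) {
        int[] a = { 1, 7, 0, 0 };
        int[] b = { 1, 7, 0, 0 };

        // Inherited from Object: identity comparison, so false for two arrays.
        System.out.println("a.equals(b):         " + a.equals(b));
        // Compares element by element: true.
        System.out.println("Arrays.equals(a, b): " + sameContents(a, b));
        // Content-based hash codes agree when the contents agree: true.
        System.out.println("same content hash:   "
                + (Arrays.hashCode(a) == Arrays.hashCode(b)));
    }
}
```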

AHS
 
 
Roedy Green

1. Originally, I thought that it made sense to make an equals method
that uses hashCode as its criteria for equality. However, as I now
understand hashCode, the code _must_ be the same for equal objects,
BUT it is _possible_ to be the same for non-equal objects. Am I
stating this correctly?

Equal hashCodes in general are not sufficient to ensure Object
equality. However, if the hashCodes are not equal, you know the
Objects can’t possibly be equal. Consider how many 50-character
Strings there are (65536^50) and how many possible hashCodes there are
(2^32). It should be obvious there are WAY more Strings than
hashCodes. So the same hashCode HAS to be reused over and over for
different Strings.
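
A concrete collision, as a small sketch (the class name is made up): String.hashCode is specified as s[0]*31^(n-1) + ... + s[n-1], so "Aa" gives 65*31 + 97 = 2112 and "BB" gives 66*31 + 66 = 2112.

```java
class StringCollisionDemo {
    public static void main(String[] args) {
        String s = "Aa";
        String t = "BB";

        System.out.println(s.hashCode());  // 2112
        System.out.println(t.hashCode());  // 2112
        // Equal hash codes, yet the Strings are not equal.
        System.out.println(s.equals(t));   // false
    }
}
```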
 
Roedy Green

Originally, I thought that it made sense to make an equals method
that uses hashCode as its criteria for equality. However, as I now
understand hashCode, the code _must_ be the same for equal objects,
BUT it is _possible_ to be the same for non-equal objects. Am I
stating this correctly?

Yes. Mathematically this is a many-to-one mapping from the set of
Strings to the smaller set of hashCodes. Most of the time you can't
avoid reuse of the hashcodes.

Perhaps thinking of hashcodes as a generalisation of the % operator
might help. Equal numbers %149 must give the same result, but it is also
possible for unequal numbers %149 to give the same result.
 
Roedy Green

2. When would one use a set of criteria to determine equality that is
different from the criteria used to generate a hashCode?

A fast compare might first check the hashcodes for equality. If they are not
equal, you know right away the two can't be equal. You then compare the
addresses. If equal, you know the entire objects must be equal (the
same).

Failing that, you check size. If they are not the same you know the
two can't be equal.

Failing that, you do your byte by byte comparison.
 
Roedy Green

2. When would one use a set of criteria to determine equality that is
different from the criteria used to generate a hashCode?

Here is a strategy for writing an equals method for some Object:

1. Check for equality of the cached hashcodes. If they are not equal
you know right away the two Objects can’t be equal.

2. You then compare the two addresses/references. If equal, you know
the entire Objects must be equal (the same).

3. Failing that, you check the Object sizes. If they are not the same
you know the two can’t be equal.

4. Failing that, you do your byte by byte comparison.
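
The four steps above could be sketched roughly as follows. This is an illustration under assumptions, not code from the thread: the Blob class is made up, the object's "size" is taken to be the length of its byte array, and the hash can be cached only because the data never changes after construction.

```java
import java.util.Arrays;

final class Blob {
    private final byte[] data;
    private final int cachedHash;  // computed once; data is never modified afterwards

    Blob(byte[] data) {
        this.data = data.clone();
        this.cachedHash = Arrays.hashCode(this.data);
    }

    @Override public boolean equals(Object other) {
        if (!(other instanceof Blob)) return false;
        Blob that = (Blob) other;
        // 1. Different cached hash codes: cannot be equal.
        if (cachedHash != that.cachedHash) return false;
        // 2. Same reference: trivially equal.
        if (this == that) return true;
        // 3. Different sizes: cannot be equal.
        if (data.length != that.data.length) return false;
        // 4. Byte-by-byte comparison.
        for (int k = 0; k < data.length; k++) {
            if (data[k] != that.data[k]) return false;
        }
        return true;
    }

    @Override public int hashCode() {
        return cachedHash;
    }

    public static void main(String[] args) {
        Blob p = new Blob(new byte[] { 1, 7, 0, 0 });
        Blob q = new Blob(new byte[] { 1, 7, 0, 0 });
        Blob r = new Blob(new byte[] { 1, 7, 0 });
        System.out.println(p.equals(q));  // true
        System.out.println(p.equals(r));  // false
    }
}
```

Many equals methods do the reference check first since it is the cheapest test; the order here simply follows the list above.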
 
Roedy Green

int[] a = { 1, 7, 0, 0 };
int[] b = { 1, 7, 0, 0 };

System.out.println( "equals: " + a.equals( b ) );
System.out.println( "hash a: " + a.hashCode() );
System.out.println( "hash b: " + b.hashCode() );

See http://mindprod.com/jgloss/hashcode.html

int [] is using the lame default hashCode algorithm for Objects based
on equals defined as ==. It uses something effectively the same as
the Object's address as the hashCode.
 
Hal Rosser

Todd said:
Hello,

I have spent a great deal of time reading through the postings in this
group as well as tutorials/explanations on sites elsewhere (i.e.,
Roedy's, etc.), but have not been able to get a good grasp of hashCode
and equals. I understand most of the rules for hashCode are defined
for use of objects in maps and other comparable collections, so it is
from that POV that I am trying to get a good grasp of the concepts.

Please help if you can - especially the SCCE later.

1. Originally, I thought that it made sense to make an equals method
that uses hashCode as its criteria for equality. However, as I now
understand hashCode, the code _must_ be the same for equal objects,
BUT it is _possible_ to be the same for non-equal objects. Am I
stating this correctly?

2. When would one use a set of criteria to determine equality that is
different from the criteria used to generate a hashCode?

3. Why aren't the hashCode_s in the following code the same?

When comparing objects, most of the time you don't worry about the hashCode.
It is up to you - the creator of the class - to
determine how objects of the same class are compared.
If you have a class "Customer" and you create an array of Customer
objects, then let's assume you want to sort that array in ascending order.
Would you expect the array to be sorted in alphabetical order by last name,
or by zip code, or by account balance? How do you decide which is larger?
If you implement the Comparable interface and its compareTo
method, YOU decide in what order the customers should be sorted. You can
use the single-argument sort method of the Arrays class to sort that array
of objects, but only if the Customer class implements the Comparable
interface. So take a look at the Comparable interface in the API. The answer
to your questions may be simpler than you had anticipated.
If you want info about hashCode, then Roedy and the others have you
covered. I may have interpreted your post differently.
You can have your equals method use the compareTo method.
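
A rough sketch of that idea, assuming a Customer that is ordered and compared by last name only (the field and the sample names are invented for the example, and a null last name is not handled). If equals is built on compareTo, hashCode has to be derived from the same field so that equal customers still hash alike.

```java
import java.util.Arrays;

final class Customer implements Comparable<Customer> {
    private final String lastName;

    Customer(String lastName) {
        this.lastName = lastName;
    }

    String lastName() {
        return lastName;
    }

    // Natural ordering: alphabetical by last name.
    @Override public int compareTo(Customer other) {
        return lastName.compareTo(other.lastName);
    }

    // equals delegates to compareTo, so the two stay consistent.
    @Override public boolean equals(Object o) {
        return o instanceof Customer && compareTo((Customer) o) == 0;
    }

    // Uses the same field as compareTo/equals.
    @Override public int hashCode() {
        return lastName.hashCode();
    }

    public static void main(String[] args) {
        Customer[] customers = {
            new Customer("Smith"), new Customer("Jones"), new Customer("Adams")
        };
        Arrays.sort(customers);  // uses compareTo
        for (Customer c : customers) {
            System.out.println(c.lastName());  // Adams, Jones, Smith
        }
    }
}
```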
 
Roedy Green

How do you quickly check the size of two objects? And why can't
the sizes be different for equal objects, e.g., some String
in the object not used in equality testing?

You compare the length of the parts of the object that participate in
the equality.
 
Roedy Green

You compare the length of the parts of the object that participate in
the equality

There is no method to give you the aggregate length of the object. You
would have to serialise it and measure that -- hardly an efficient
process.
 
Roedy Green

It can't be. Addresses shift as GC moves items around, but the hashCode()
cannot change during the life of the application.

It could be the initial address, then cached. It could be the
address of the handle slot in handle-based JVMs.

The JVM implementor has quite a bit of freedom how to implement
Object.hashCode. It could even be a sequential counter incremented on
every new.
 
Todd

Thanks for all of the responses. I have learned quite a bit, as well
as gained several more references. I was not aware of the
Arrays.hashCode methods and have used them successfully in my project.
 
Todd

Thanks for all of the answers. I have learned a lot and added several
references to my library. I have been able to implement what I
believe are proper equals and hashCode methods in my project.
 
