equals(), Sets, Maps, and degrees of equality

S

Sean Mitchell

Anyone ever run into the case where you wish an Object could have more than one equals(), or that Set and Map implementations would let you pass in something like a closure to determine key equality?

It seems to me that objects can be equal in varying degrees. Let's consider a class Dog:

public class Dog {
String breed;
String name;
String age;
}

I may want to have a Set<Dog>, which holds only one Dog of each breed, irrespective of name of age. In this case my equals()/hashcode() would only consider breed.

But I may also want a Mag<Dog, Owner> in which each Dog is made unique by name.

And of course, there is the most intuitive case where I want to use equals() to see if the instances map on all three fields.

I suppose I could create a wrapper class for each purpose which only overrides equals() and hashcode(), but that seems very unsatisfying and inefficient.

I'm interested in how other people have dealt with this. Surprisingly, I have not been able to Google up much on this subject.

Thoughts?
 
S

Stefan Ram

Sean Mitchell said:
And of course, there is the most intuitive case where I want
to use equals() to see if the instances map on all three
fields.

If you know which concept the objects of a class do model,
then you know when two of the are equal. Otherwise, it's
not OOP anymore.

There is nothing wrong with that, but it just does not
seem to fit Java well, which is an OOPL (in a certain sense).
 
O

Owen Jacobson

Anyone ever run into the case where you wish an Object could have more
than one equals(), or that Set and Map implementations would let you
pass in something like a closure to determine key equality?

Given the lack of anything sufficiently "like a" closure, until Java 8
if not later, you'll have to live with the options you have.
It seems to me that objects can be equal in varying degrees. Let's
consider a class Dog:

public class Dog {
String breed;
String name;
String age;
}

I may want to have a Set<Dog>, which holds only one Dog of each breed,
irrespective of name of age. In this case my equals()/hashcode() would
only consider breed.

But I may also want a Mag<Dog, Owner> in which each Dog is made unique by name.

And of course, there is the most intuitive case where I want to use
equals() to see if the instances map on all three fields.

I suppose I could create a wrapper class for each purpose which only
overrides equals() and hashcode(), but that seems very unsatisfying and
inefficient.

That's one option, and it's not as inefficient (at least in terms of
run time) as you might expect, unless you have several million wrappers
kicking around. It's clunky to write, true.

Another option is the TreeSet and TreeMap classes, which have slightly
worse lookup and insertion complexity guarantees (amortized O(log n) vs
amortized O(1) on lookup, for example) but allow passing a Comparator
that also controls equality tests for that set or map.

-o
 
E

Eric Sosman

Anyone ever run into the case where you wish an Object could have more than one equals(), or that Set and Map implementations would let you pass in something like a closure to determine key equality?

IIRC, Common Lisp supports four or five different notions of
"equality." So, yes: The idea of equivalences beyond Java's two has
in fact been deemed useful by someone.
It seems to me that objects can be equal in varying degrees. Let's consider a class Dog:

public class Dog {
String breed;
String name;
String age;
}

I may want to have a Set<Dog>, which holds only one Dog of each breed, irrespective of name of age. In this case my equals()/hashcode() would only consider breed.

Seems odd: Why should Fido rather than Rover or Wossname be the
sole representative Cocker Spaniel? Or, why do you want a set of Dogs,
rather than a set of Breeds? Even then I think you'd have difficulty:
Are Schnauzer, Miniature Schnauzer, and Standard Schnauzer one breed or
three?

But okay -- Let's not dwell on the quirks of the example: We'll
suppose that you've got a bunch of objects with multiple attributes,
and you sometimes want to think of them equivalent if their X's match,
while other times you want to pay attention only to their Y's.
But I may also want a Mag<Dog, Owner> in which each Dog is made unique by name.

And of course, there is the most intuitive case where I want to use equals() to see if the instances map on all three fields.

I suppose I could create a wrapper class for each purpose which only overrides equals() and hashcode(), but that seems very unsatisfying and inefficient.

I'm interested in how other people have dealt with this. Surprisingly, I have not been able to Google up much on this subject.

Two avenues of attack seem plausible. One, as you mention, is to
use a helper class to designate the chosen "identity" attributes. I
think I'd prefer to make it an inner class rather than a wrapper class,
but maybe that just means I'm still too hung up on your dogs and breeds.

The other approach is to implement your own BreedSet that uses
breedEquals() and breedHashCode() instead of the usual methods (and,
of course, documents that fact in large red letters). But this feels
an awful lot like the first step down a slippery slope, one that may
find you implementing umpty-leven specialized variations of Set and
Map and regretting the original choice ...

I'd be inclined to go with the inner class, or perhaps with a
wrapper if there's a reason you can't modify Dog. YMMV.
 
V

Volker Borchert

Owen said:
[ class to be used in multiple Maps with different equalities ]

I suppose I could create a wrapper class for each purpose which only
overrides equals() and hashcode(), but that seems very unsatisfying and
inefficient.

That's one option, and it's not as inefficient (at least in terms of
run time) as you might expect, unless you have several million wrappers
kicking around. It's clunky to write, true.

Another option is the TreeSet and TreeMap classes, which have slightly
worse lookup and insertion complexity guarantees (amortized O(log n) vs
amortized O(1) on lookup, for example) but allow passing a Comparator
that also controls equality tests for that set or map.

Yet another option is to write your own HashMap.
 
M

markspace

I'd be inclined to go with the inner class, or perhaps with a
wrapper if there's a reason you can't modify Dog. YMMV.


I wonder if it's possible to make something like a wrapper, where
instead of one per object you just have a singleton of special cases.

public class Dog {
String name;
String breed;
float weight;

DogComparisonStrategy compare;

public boolean equals( Object o ) {
return compare.equals( this, o );
}

public int hashcode() {
return compare.hashcode( this );
}

}

class CompareByWeight implements DogComparisonStrategy {

public boolean equals( Dog d1, Object o ) {
if( !(o instanceof Dog) ) return false;
Dog d2 = (Dog) o;
return d2.weight == d1.weight;
}

public int hashcode( Dog d ) {
return Float.getFloatbits( d.weight );
}
}


At least you might not need a million of the things, unlike wrapper.
Obviously, you have to be able to modify Dog for this to work.
 
S

Sean Mitchell

 Seems odd: Why should Fido rather than Rover or Wossname be the
sole representative Cocker Spaniel?

Just cuz.

Silly example: Father-in-law is going hunting. He has a kennel full of
dogs. Wants a hound and a bird dog. Doesn't care which exact ones.
Two avenues of attack seem plausible.  One, as you mention, is to
use a helper class to designate the chosen "identity" attributes.  I
think I'd prefer to make it an inner class rather than a wrapper class,
but maybe that just means I'm still too hung up on your dogs and breeds.

Hmmm. This is interesting. How would it work? So, I'd have an inner
class for each type of equality I need, basically exposing its parent
and providing the desired equals()? Think I'll play around with that
idea.

     The other approach is to implement your own BreedSet that uses
breedEquals() and breedHashCode() instead of the usual methods (and,
of course, documents that fact in large red letters).  But this feels
an awful lot like the first step down a slippery slope, one that may
find you implementing umpty-leven specialized variations of Set and
Map and regretting the original choice ...

Yes, as I said in my reply to Owen (which I think I accidentally sent
as reply to author), if possible I'd rather use smarter people's work
than write my own implementation. His TreeSet/TrreMap proposal is my
favourite solution so far.


Cheers,

Sean
 
S

Sean Mitchell

I wonder if it's possible to make something like a wrapper, where
instead of one per object you just have a singleton of special cases.

Possible, but I could see a lot of concurrency issue arising. Might
work well in a single threaded application.
 
M

markspace

Possible, but I could see a lot of concurrency issue arising. Might
work well in a single threaded application.


What? Why? There's no concurrency issues there. There's no state
beyond what is in the Dog object. Method call != concurrency issue. In
fact it's the opposite, method calls are inherently thread safe.

If you need concurrency, you add it to the Dog object.
 
S

Sean Mitchell

What?  Why?  There's no concurrency issues there.  There's no state
beyond what is in the Dog object.  Method call != concurrency issue.  In
fact it's the opposite, method calls are inherently thread safe.

If you need concurrency, you add it to the Dog object.

Maybe I don't understand what you are proposing. It sounds like you
want a singleton wrapper, into which I stuff my Dog, before I put him
into my Set/Map.

Please elaborate.

Cheers,

Sean
 
M

markspace

Maybe I don't understand what you are proposing. It sounds like you
want a singleton wrapper, into which I stuff my Dog, before I put him
into my Set/Map.


Yes, although I'd call that the opposite of a wrapper. It's a Strategy
pattern.

But there's no additional concurrency issues created by using it. Just
use Dog normally. Your Dog class isn't thread safe as you wrote it, so
there's no additional issues.

If you do require thread safety, protecting the field I added is no
different than protecting the other fields in Dog. Do it exactly the
same way. (How you do so may vary depending on your needs for the
class, so there's rather more options than I'd care to go into. Read a
good book on Java Concurrency or read the JLS for ideas.)
 
D

Daniel Pitts

Just cuz.

Silly example: Father-in-law is going hunting. He has a kennel full of
dogs. Wants a hound and a bird dog. Doesn't care which exact ones.


Hmmm. This is interesting. How would it work? So, I'd have an inner
class for each type of equality I need, basically exposing its parent
and providing the desired equals()? Think I'll play around with that
idea.



Yes, as I said in my reply to Owen (which I think I accidentally sent
as reply to author), if possible I'd rather use smarter people's work
than write my own implementation. His TreeSet/TrreMap proposal is my
favourite solution so far.


Cheers,

Sean
I've often wanted the equivalent of "Comparator" for the Hash
implementations of Map/Set.

public interface HashStrategy<T> {
int hash(T object);
boolean equivalent(T a, T b);
}


public class DefaultHashStrategy<T> extends HashStrategy<T> {
public int hash(T object) {
return object == null ? 0 : object.hashCode();
}

public boolean equivalent(T a, T b) {
return a == null ? b == null : a.equals(b);
}
}


public class HashMap<K,V> implements Map<K,V> {
private final HashStrategy<? super K> hashStrategy;

public HashMap(HashStrategy<? super K> hashStrategy) {
this.hashStrategy = hashStrategy;
}
public HashMap() {
this(new DefaultHashStrategy<K>());
}
// ... the rest of the HashMap implementation.
}
 
R

Roedy Green

I may want to have a Set<Dog>, which holds only one Dog of each breed,
irrespective of name of age. In this case my equals()/hashcode()
would only consider breed.
But I may also want a Mag<Dog, Owner> in which each Dog is made unique by name.

And of course, there is the most intuitive case where I want to
use equals() to see if the instances map on all three fields.

Hmm. You have the ability to override equals so it behaves
differently classes that inherit from each other. But for all
practical purposes all but one version is inaccessible.

Let's say you wrote three equals methods. and named them

equalBySpecies
equalByOwener
equalByCallName

Now you could write something like this

public boolean equals ( Object o )
{
switch (how )
{
case SPECIES: return equalBySpecies ( o) ;
case OWNER: return equalByOwner(o);
case CALLNAME : return equalByCallName(o);
default: throw new IllegalArgumentException ("Bad how sort" );
}
}

Now the problem is, how do get the value of how into the equals
method? It could be a static, but then you could not have separate
Collections running at once. Perhaps you have to extend the Collection
class to store it there. That still does not solve the problem of how
the Object knows for this invocation of equals who which collection is
asking. Gaak! would it have to analyse the stack?

But then, you can pass a Comparator can't you to some collections.
Then the individual objects don't need to know how they are being
compared. It is not their code.





}
--
Roedy Green Canadian Mind Products
http://mindprod.com
HP makes a dozens of quite different printer models all called the 1200.
Mine has a GO button, which is not needed since it runs anyway if nothing
is blocking it. However, it has no STOP, FLUSH or OFF button.
 
E

Eric Sosman

Maybe I don't understand what you are proposing. It sounds like you
want a singleton wrapper, into which I stuff my Dog, before I put him
into my Set/Map.

I think "concurrency issues" might not be the best description,
but there are certainly drawbacks to markspace's suggestion. Back
to your original post: You wanted a Dog whose breed (only) would
govern its membership in a Set<Dog> but whose name (only) would
matter for a Map<Dog,Owner>. markspace's idea of endowing each Dog
with a DogComparisonStrategy wouldn't work if the same Dog instance
could be a member of the Set *and* a key in the Map: You'd need to
visit every Dog and reset its DogComparisonStrategy before doing
any operation on either the Set or the Map. (Strictly speaking,
you'd only need to visit the Dogs that participated in the relevant
data structure -- but since it would be at best dubious to traverse
a Set or Map whose Dogs were in the wrong state you'd need a third
Collection<Dog> that would not use the DogComparisonStrategy at all.)

The notion of making Breed and Name inner classes of Dog seems
more and more attractive:

class Dog {
class Breed { ... };
class Name { ... };
private Breed breed;
private Name name;
...
}

The wrapper approach could also work, and would be necessary if Dog
weren't alterable:

class Dog {
private String name;
private String breed;
...
}

class DogByBreed {
private Dog dog;
...
public int hashCode() {
return dog.getBreed().hashCode();
}
...
}

class DogByName {
private Dog dog;
...
public int hashCode() {
return dog.getName().hashCode();
}
...
}
 
M

markspace

markspace's idea of endowing each Dog
with a DogComparisonStrategy wouldn't work if the same Dog instance
could be a member of the Set *and* a key in the Map:


Right. It's really not that different than making subclasses to handle
different schemes. It just tries to be a bit more pithy about how it
does it. I'm mostly just tossing it out there to spur a little thought
and discussion. The "concurrency" comment caught me off guard, I'll admit.

I actually like the suggestion to use a Tree with a supplied Comparator.
It seems closest to what the OP really needs. Second choice is just
use the wrappers. Third choice: if profiling shows wrappers or the Tree
slowing things down, implement custom Map/Set classes to do exactly
what's needed.

You'd need to
visit every Dog and reset its DogComparisonStrategy before doing


Stuff like that probably isn't kosher. What if some external process
affects the the Map or Set right in the middle of iterating over its
contents? This could be really ugly and a real maintenance hazard.
 
E

Eric Sosman

Right. It's really not that different than making subclasses to handle
different schemes.

It seems strikingly different to me. Your suggestion is that
a Dog could dictate which of several hashCode/equals pairs to use,
but any particular Dog instance could select only one pair at a time.
With inner classes or wrappers, a single Dog instance could present
multiple aspects simultaneously and could participate simultaneously
in multiple collections based on those different key-nesses.

Look at it this way: In your formulation, the Dog instance says
"I am known by my breed" or "I am known by my name," and every Set<Dog>
or Map<Dog,Owner> has to accept the Dog's own decision. With inner
classes or wrappers, the Dog says "I am a Beagle and my name is Snoopy,"
and each Set<Dog.Breed> or Map<Dog.Name,Owner> makes its own choice
about which aspect counts. In one case the Dog dictates its aspect;
in the other the collection chooses the aspect it cares about.
 
M

markspace

It seems strikingly different to me. Your suggestion is that
a Dog could dictate which of several hashCode/equals pairs to use,
but any particular Dog instance could select only one pair at a time.
With inner classes or wrappers, a single Dog instance could present
multiple aspects simultaneously and could participate simultaneously
in multiple collections based on those different key-nesses.


Oh I see. The wrapper being a subclass of Dog. I was thinking of
comparing my Strategy pattern with subclasses which implemented specific
behavior.

class Dog { .. some fields .. }

class DogSortedByName extends Dog {...}

class DogSortedByBreed extends Dog {...}

The strategy pattern is similar, because we really can't change a class
of Dog once its used. In extraordinary cases we could, if the object
could be removed from a given collection completely, then we could
change the its strategy and add it to another collection, but this may
not be useful or practical.

The wrapper pattern is definitely more flexible, at the cost of needing
one wrapper object for each Dog object. The Strategy pattern is less
flexible, but only needs one Strategy object per strategy (in this case
anyway), and doesn't seem to buy us much over concrete subclasses.
Given that we actually don't know that much about the OP's problem, I
thought it would be interesting to speculate on some different solutions.
 
A

Andreas Leitgeb

Sean Mitchell said:
Anyone ever run into the case where you wish an Object could have
more than one equals(), or that Set and Map implementations would
let you pass in something like a closure to determine key equality?

Yep, I also noticed this lack in the context of "orthogonality".
I just still haven't found any actual practical use for it.
It seems to me that objects can be equal in varying degrees. Let's
consider a class Dog:
public class Dog {
String breed;
String name;
String age;
}
I may want to have a Set<Dog>, which holds only one Dog of each breed,

In that case, I'd use a Map<String,Dog>, which would be keyed by the breed
name itself, rather than by an arbitrary dog of that breed.

Probably, I'd even create a class Breed, that would encapsulate some more
properties along with the name, and then use the Breed class as key.

For the example at hand, this would allow filling the structure with breeds
that are "requested" (initially associated with null Dog), then some other
part of the code could pick an appropriate Dog for each requested Breed.
But I may also want a Map<Dog, Owner> in which each Dog is made unique
by name.

That doesn't make sense to me, because the owner would rather be a property
of the dog(*), than of its name. Who's the owner of "any dog called Rufus", btw?

*: natural language is sometimes quite at odds with modelling language. :)
 
L

Lew

Andreas said:
Yep, I also noticed this lack in the context of "orthogonality".
I just still haven't found any actual practical use for it.

This is terrible modeling.
In that case, I'd use a Map<String,Dog>, which would be keyed by the breed
name itself, rather than by an arbitrary dog of that breed.

Probably, I'd even create a class Breed, that would encapsulate some more
properties along with the name, and then use the Breed class as key.

+
This is the way to go.
For the example at hand, this would allow filling the structure with breeds
that are "requested" (initially associated with null Dog), then some other
part of the code could pick an appropriate Dog for each requested Breed.


That doesn't make sense to me, because the owner would rather be a property
of the dog(*), than of its name. Who's the owner of "any dog called Rufus", btw?

*: natural language is sometimes quite at odds with modelling language. :)

Andreas is on to something here.
 
S

Sean Mitchell

On 11/10/2011 10:29 AM, Sean Mitchell wrote:
I think "concurrency issues" might not be the best description,

No, probably not. But I think you see where my reluctance to follow
that approach lies.
The notion of making Breed and Name inner classes of Dog seems
more and more attractive:
[snip]

The wrapper approach could also work, and would be necessary if Dog
weren't alterable


I think, fundamentally, that is the problem. I don't think that Dog
or any subclass of Dog should have to be in any way concerned with
how I want to use him in a collection. Although IMO it is reasonable
for Dog to know whether or nor he is the same as another Dog in all
the ways that are important to Dogs.

The core of the issue for me is that the most commonly used
implementations of Set (possibly Map was a poor example, as has been
illustrated by Andreas and others) rely solely on the object's
equals() to determine equivalence.

TreeSet will take a Comparator as pointed out, and so probably this
is good enough, but it seems a little hackish to use a Comparator,
and a SortedSet to solve a problem of equality (or perhaps,
equivalency).


Sean
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,768
Messages
2,569,575
Members
45,053
Latest member
billing-software

Latest Threads

Top