Collection of distinc objects

D

D¿ejm

Hello All,

I am C++ programmer and I have problem. I didn't found answer after some
time of googling. It's 4:30am here, and I can't think anymore. So please
help ;).

In my Java program I have complicated iterator which returns references to
object. It returns the same reference/object/instance many times. I want to
build collection, where each object appears only once. Data in object is not
important, it can be equal or not, so I can't use any hash function based on
it.

In C++ I would use Set of pointers. Pointers in C++ can be compared with all
operators <, ==, >, so it works with Set's and it's efficient for me.

But what about Java ? References can only be compared with ==. How can I
write my Comparator implementation which requires triple state result
(negative, zero, positive) ?
Is it not possible to simply use any of Java collection for such task ? Do I
have to write my own ?

Regards
Dz
 
L

Luc The Perverse

D¿ejm said:
Hello All,

I am C++ programmer and I have problem. I didn't found answer after some
time of googling. It's 4:30am here, and I can't think anymore. So please
help ;).

In my Java program I have complicated iterator which returns references to
object. It returns the same reference/object/instance many times. I want
to
build collection, where each object appears only once. Data in object is
not
important, it can be equal or not, so I can't use any hash function based
on
it.

In C++ I would use Set of pointers. Pointers in C++ can be compared with
all
operators <, ==, >, so it works with Set's and it's efficient for me.

But what about Java ? References can only be compared with ==. How can I
write my Comparator implementation which requires triple state result
(negative, zero, positive) ?
Is it not possible to simply use any of Java collection for such task ? Do
I
have to write my own ?

The compare function returns an integer - I believe String uses Comparator
and returns greater, less or zero
 
D

D¿ejm

Ok. It was simple. Sorry for bothering. It should work I think:

public class ReferenceComparator <T> implements Comparator <T>
{
public int compare( T o1, T o2 )
{
if ( o1 == o2 )
return 0;
else if ( System.identityHashCode( o1 ) < System.identityHashCode( o2 ) )
return -1;
else
return 1;
}
}
 
D

Daniel Pitts

D¿ejm said:
Ok. It was simple. Sorry for bothering. It should work I think:

public class ReferenceComparator <T> implements Comparator <T>
{
public int compare( T o1, T o2 )
{
if ( o1 == o2 )
return 0;
else if ( System.identityHashCode( o1 ) < System.identityHashCode( o2 ) )
return -1;
else
return 1;
}
}
Also look into java.util.IdentityHashMap,
Although, I don't know why Java doesn't provode IdentityHashSet, ohwell.
 
D

Dzejm

Ok. It was simple. Sorry for bothering. It should work I think:
[...]
Also look into java.util.IdentityHashMap,
Although, I don't know why Java doesn't provode IdentityHashSet, ohwell.

Thanks. That's better solution than mine. My code returns always "1" for
distinct objects with the same hashCode. So it would tell that (a > b) and
(b > a) at the same time. Probably it would not work.

I wonder if it is possible to implement in java any comparison function that
impose some kind of constant order for objects with the same hashCode. I
mean function with always returns
-1 for foo(a,b) and 1 for foo(b,a) when (a != b). Probably it is totally
impossible. Isn't it ?

Best Regards
Dz
 
M

Mike Schilling

D¿ejm said:
Ok. It was simple. Sorry for bothering. It should work I think:

public class ReferenceComparator <T> implements Comparator <T>
{
public int compare( T o1, T o2 )
{
if ( o1 == o2 )
return 0;
else if ( System.identityHashCode( o1 ) < System.identityHashCode( o2 ) )
return -1;
else
return 1;
}
}

This isn't foolproof. It's possible (though unlikely) for two distinct
objects to return the same value for identityHashCode(), and now your
comparator is illegal, since compare(o1,o2) == compare(o2,o1) == 1. This
can lead to strange behavior in TreeSet.

I can see two solutions which would always work:

1. Build a Set implementation on top of IdentityHashMap, where adding a
member to the set does a put to the underlying map with that member as the
key, and checking whether the set contains an object checks whether the map
contains that key. See java.util.HashSet for details (it's built exactly
this way on top of HashMap)

2. Create a wrapper class like this:

public class Wrapper {
private Object obj;

public Wrapper(Object o) {
obj = o;
}

public boolean equals(Object o) {
return (o instanceof Wrapper && ((Wrapper)o).obj == obj;
}

public int hashCode() :
return obj == null ? 0 : obj.hashCode();
}
}

Now make a Set of Wrappers.
 
R

Red Orchid

Message-ID: said:
I wonder if it is possible to implement in java any comparison function that
impose some kind of constant order for objects with the same hashCode. I
mean function with always returns
-1 for foo(a,b) and 1 for foo(b,a) when (a != b).


From your articles, I guess:

If comparison of pointers can be used for ordering in your context,
this will be possible:

<code>
//
// "T" is supreme super type of your data.
//
class T {
private static final AtomicInteger _sid = new AtomicInteger();
private final int serial = _sid.incrementAndGet();

//
public final int compareTo(T another) {
if (this == another) {
return 0;
}
return this.serial - another.serial;
}
}

....
class ... implements Comparator <T> {
public int compare(T o1, T o2) {
return o1.compareTo(o2);
}
}
</code>
 
D

D¿ejm

This isn't foolproof. It's possible (though unlikely) for two distinct
objects to return the same value for identityHashCode(), and now your
comparator is illegal, since compare(o1,o2) == compare(o2,o1) == 1. This
can lead to strange behavior in TreeSet.

Thanks. I also spoted this.
I can see two solutions which would always work:
1. Build a Set implementation on top of IdentityHashMap, [...]

Yes. This seams to be good solution.
2. Create a wrapper class like this: [...]
Now make a Set of Wrappers.

Sorry, but it doesn't work always. Does it ? It works for HashSet but not
for TreeSet.
TreeSet is trying to cast to Comparable and it of course fails. And
implementing Comparable in this Wrappers in general way seams to be
impossible.

Best Regards
Dz
 
O

Oliver Wong

D¿ejm said:
TreeSet is trying to cast to Comparable and it of course fails. And
implementing Comparable in this Wrappers in general way seams to be
impossible.

Instead of providing TreeSet with something which is comparable, you can
also provide TreeSet with a comparator.

Design your comparator so that it sorts using the (identity)hashcode,
and if the hashcodes collide, just choose an arbitrary ordering. But here's
the trick: Whenever you choose an arbitrary ordering, RECORD what the
ordering is in some data structure internal to the comparator, so that the
next time the ordering of those two objects are requested, your comparator
returns the same ordering.

- Oliver
 
J

John Ersatznom

Dzejm said:
Ok. It was simple. Sorry for bothering. It should work I think:
[...]

Also look into java.util.IdentityHashMap,
Although, I don't know why Java doesn't provode IdentityHashSet, ohwell.


Thanks. That's better solution than mine. My code returns always "1" for
distinct objects with the same hashCode. So it would tell that (a > b) and
(b > a) at the same time. Probably it would not work.

I wonder if it is possible to implement in java any comparison function that
impose some kind of constant order for objects with the same hashCode. I
mean function with always returns
-1 for foo(a,b) and 1 for foo(b,a) when (a != b). Probably it is totally
impossible. Isn't it ?

For your original problem, I'd use a HashSet and objects in it that
don't override Object.hashCode(). If the objects do override hashCode()
and that can't be readily changed I'd put them in a holder -- in fact,
I'd create something like

package whatever.util;

public final class Holder<T> {
public T object;
}

Then you can use a HashSet<Holder<Foo>> to collect Foos without
discarding duplicates (as decided by Foo's equality operator and hash
code method) unless they are actually the same instance.

Well, as long as you can code it so you can't end up with two different
Holders referencing the same Foo. Or you can stuff Foos in an
IdentityHashMap as keys with null values, and use the keySet() method to
get a Set view of the Foos.

As for imposing an arbitrary but fixed order on instances of a class,
you can use something like:

public interface Numbered {
public long getInstanceNumber ();
}

public class AbstractNumbered implements Numbered {
private static long nextInstance = 0;
private long instanceNumber;

public AbstractNumbered () {
instanceNumber = nextInstance;
++nextInstance;
}

public long getInstanceNumber () {
return instanceNumber;
}
}

public class Foo extends AbstractNumbered {

public Foo (args) {
construct stuff
}

public Foo (otherArgs) {
construct stuff
}

other stuff
}

public class InstanceNumberComparator<T extends Numbered> implements
Comparator<Foo> {
public int compare (T foo1, T foo2) {
long n1 = foo1.getInstanceNumber();
long n2 = foo2.getInstanceNumber();
return (n1 < n2)?-1:((n1 > n2)?1:0);
}
}

Now you can use Numbereds in things like TreeSet using an
InstanceNumberComparator. You can extend AbstractNumbered and not even
have to worry about implementing the numbering. (Everything extending
AbstractNumbered ends up numbered from a single sequence, mind you, so
if you have Foos and Bars both derived from AbstractNumbered you might
have Foos with numbers 1 and 2, a Bar with number 3, a Foo numbered 4,
etc.; this has no downside and does have the benefit that not only does
every Foo have a unique number, but this is true even if Foo has a bunch
of subclasses and you get all this for free. Also it won't run out of
unique numbers and "wrap" before you have OOME on any
presently-conceivable deployment system, unless you create and discard
truly stupendous numbers of objects descended from AbstractNumbered over
time on a rock-solid system with a rock-solid power supply. This
probably means a Linux box on a UPS running for decades doing nothing
*but* create objects without maintaining references to any of them. It
certainly won't run out on a normal system before either the power fails
for whatever reason or Windows needs rebooting again.)
 
M

Mike Schilling

D¿ejm said:
2. Create a wrapper class like this: [...]
Now make a Set of Wrappers.

Sorry, but it doesn't work always. Does it ? It works for HashSet but not
for TreeSet.
TreeSet is trying to cast to Comparable and it of course fails. And
implementing Comparable in this Wrappers in general way seams to be
impossible.

Sure. My understanding was that you needed some collection you could iterate
over and that would be a HashSet of Wrappers.
 
D

D¿ejm

Sure. My understanding was that you needed some collection you could
iterate
over and that would be a HashSet of Wrappers.

Yes, you are right. I was just trying to solve two things in the same time:
collection of distinct objects and creating universal "ReferenceComparator".

Best Regards
Dz
 
D

D¿ejm

Design your comparator so that it sorts using the (identity)hashcode,
and if the hashcodes collide, just choose an arbitrary ordering. But here's
the trick: Whenever you choose an arbitrary ordering, RECORD what the
ordering is in some data structure internal to the comparator, so that the
next time the ordering of those two objects are requested, your comparator
returns the same ordering.

That's good idea. Identity-hashcode probably doesn't collide to often, so
recording would not cause performance drop.
I thought that I will use static WeakHashMap to record this information for
coliding Objects. But there is a problem, WeakHashMap is not using identity
but .equals(), which can be overwritten.
Anyway thanks. I will build something myself.

BR
Dz
 
J

John Ersatznom

D¿ejm said:
Yes, you are right. I was just trying to solve two things in the same time:
collection of distinct objects and creating universal "ReferenceComparator".

Best Regards
Dz

There's always:

public class IdentityHashSet<T> extends AbstractSet<T> {
private IdentityHashMap<T, T> map = new IdentityHashMap<T, T>();

public boolean add (T object) {
boolean result = map.containsKey(object);
map.put(object, object);
return result;
}

public int size () {
return map.size();
}

public Iterator<T> iterator () {
return map.keySet().iterator();
}
}

AIUI, this should work, and support adding and removing objects from the
set, including removing objects through the iterator. Since it's only
fourteen lines of code (excluding blank lines, comments, Javadoc...)
it's surprising it's absent from java.util!


And of course there's the

public class Holder<T> {
public final T contents;
public Holder<T> (T contents) { this.contents = contents; }
@SuppressWarnings("unchecked")
public boolean equals (Object o) {
return this == o ||
(o instanceof Holder &&
((Holder)o).contents == contents);
}
public int hashCode () {
return System.identityHashCode(contents);
}
}

Some Holder<T>s put into a HashSet will behave as the Ts put into the
IdentityHashSet. (Separate Holders with the same T compare equal, but
distinct T instances, even if those compare equal, produce
unequal-comparing holders.)

Read the caveats in Sun's API docs regarding IdentityHashMap<K, V>;
they'll apply also to IdentityHashSet<T> and HashSet<Holder<T>>.

See an earlier posting of mine to this same thread for a way to get this
behavior with TreeSet (distinct instances behaving distinctly). It
requires being able to give each distinct object a distinct "instance
number" to compare with a comparator. That's easy to do if you can make
the objects of a class you create, as I demonstrated there. If not, you
can use:

public class InstanceComparator<T> implements Comparator<T> {
private int next = 0;
public int compare (T t1, T t2) {
int i = getID(t1);
int j = getID(t2);
return (i < j)?-1:((i > j)?1:0);
}
private WeakHashMap<Holder<T>, Integer> map = new
WeakHashMap<Holder<T>, Integer>();
private int getID (T t) {
Holder<T> h = new Holder<T>(t);
Integer i = map.get(h);
if (i != null) return i.intValue();
map.put(h, Integer.valueOf(next));
return next++;
}
}

This uses the earlier Holder<T> class as well as a WeakHashMap of
Integers under the hood and an incremented "next id" number to generate
id numbers and associate these with distinct instances of T (even if
they compare equal). The WeakHashMap of Holder<T> gives us a
"WeakIdentityHashMap", so the mappings for objects that are no longer
used go away instead of eating up memory, while keeping the
IdentityHashMap semantics. Different instances of the comparator have
their own mappings. This works with any object of reference type,
including boxed integers, one-element arrays, and so forth.

Note that the above class depends on the semantics that even distinct
Holders may compare equal, but that Holders compare equal iff the held
objects are identity-equal. This is true for the Holder implementation
in this posting but not for the one in the earlier posting.

Implementing an actual WeakIdentityHashMap<K, V> class is left as an
exercise for the reader but shouldn't be too difficult since 90% of the
work is already done with Holder<K> and WeakHashMap<K, V>. :)

And of course the Holder<T> class is just begging for expanding to
implement Comparable<T> and use a static instance of
InstanceComparator<T> under the hood...not to mention maybe a renaming
to make it's exact purpose clearer, which is to totally subvert the
normal object-comparison semantics. ;)

All of the code above modulo typos and "Organize Imports" (ctrl+O in
Eclipse).
 
C

Chris Uppal

John said:
public class IdentityHashSet<T> extends AbstractSet<T> {
[implementation snipped]

Incidentally, new in 1.6 is java.util.Collections.newSetFromMap(Map) which can
be used to create implementations of various kinds of Set backed by the
corresponding kinds of Map. Handy, if somewhat hacky.

-- chris
 
D

Dzejm

Hello,

Many thanks for you code. I'm especially interested in this:

[...]
public class InstanceComparator<T> implements Comparator<T> {
private int next = 0;
public int compare (T t1, T t2) {
int i = getID(t1);
int j = getID(t2);
return (i < j)?-1:((i > j)?1:0);
}
private WeakHashMap<Holder<T>, Integer> map = new
WeakHashMap<Holder<T>, Integer>();
private int getID (T t) {
Holder<T> h = new Holder<T>(t);
Integer i = map.get(h);
if (i != null) return i.intValue();
map.put(h, Integer.valueOf(next));
return next++;
}
}

I planed to do it in the same way. But I thought it wouldn't work so I
dropped it.
Is it working ??? WeakHashMap is holding Holders. They have strong
references to something, but those Holders are not strongly referenced
anywhere. I think they will by collected by GC, despite their contents. Then
we loose information about assigned IDs. Numbers will be generated all the
time, even for objects which where compared before.
Am I right ?

Regards
Dz
 
D

Daniel Pitts

Chris said:
John said:
public class IdentityHashSet<T> extends AbstractSet<T> {
[implementation snipped]

Incidentally, new in 1.6 is java.util.Collections.newSetFromMap(Map) which can
be used to create implementations of various kinds of Set backed by the
corresponding kinds of Map. Handy, if somewhat hacky.

-- chris
Actually, if you look at HashSet source, you'll se it is backed by
HashMap.

Interestingly enough, in C++, map<k, v> is backed by a set<pair<k, v>>
 
J

John Ersatznom

Dzejm said:
I planed to do it in the same way. But I thought it wouldn't work so I
dropped it.
Is it working ??? WeakHashMap is holding Holders. They have strong
references to something, but those Holders are not strongly referenced
anywhere. I think they will by collected by GC, despite their contents. Then
we loose information about assigned IDs. Numbers will be generated all the
time, even for objects which where compared before.
Am I right ?

It *was* untested. There needs to be a second WeakHashMap from T to
Holder<T> so the Ts prevent the corresponding Holder<T>s from being
garbage collected. When the T itself vanishes, the entry in that map
goes away and the Holder<T> is only weakly reachable, which nails the
entry in the original map as well.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,431
Messages
2,571,679
Members
48,796
Latest member
Greg L.

Latest Threads

Top