Wanted: generic deep copy for clone()

Tim Tyler

: Tim Tyler wrote:

:> : Instead, the best way to go (IMO of course) is to make it as easy as
:> : possible to implement clone() according to the original plan. To that
:> : end, I provide you with the following code:
:>
:> That is - essentially - what I was visualising originally.
:>
:> I'm not quite sure what you are suggesting I was asking for -
:> but your code is basically what I was thinking of.

: Ah. I thought you wanted something like:

: Object result = ObjectUtil.clone(original);

: where the clone was no longer a method on Object, but rather a static
: piece of code that works the same way everywhere. [...]

: It's the aspect of removing control over cloning from the class itself
: that I was disagreeing with.

Yes - that would be dumb.

I can see how what I wrote can be read that way.

However, I just want to use the conventional cloning mechanism -
exactly as you suggest.
 
Tim Tyler

: Tim Tyler wrote:

:> : It appears that now I have two methods of doing what I asked after -
:> : all I have to do now is figure out the pros and cons of each one ;-)
:>
:> It seems to me a possible variant on this code would be to
:> catch what I presume is a RuntimeException at
:> the: f.setAccessible(true); line - and then catch the possible
:> IllegalAccessException and attempt access using get and set calls
:> based on the name of the field.
:>
:> That would help deal with the case where the SecurityManager chokes
:> on the setAccessible(true); line.

: That would be possible (incidentally, it throws a SecurityException,
: which is a subclass of RuntimeException). To me, that seems like it's
: crossing a line, making assumptions that the implementor followed the
: JavaBeans specification and that all important state is accessible
: through JavaBeans.

I agree it's not so neat.

So maybe the serialisation approach is better - because it will work in
managed environments - and unnecessarily writing code that won't work in
managed environments hinders code reuse.
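
For comparison, the reflective variant discussed above might look roughly like the sketch below. It is only an illustration: the names ReflectiveCopySketch, copyFields and Point are made up for this example, and it copies only the fields declared directly on the object's own class (a shallow copy, no superclasses). It returns false when setAccessible(true) is refused with a SecurityException, which is the point where a getter/setter fallback would have to take over.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;

public class ReflectiveCopySketch {

    // Hypothetical helper: shallow-copies the instance fields declared on
    // source's own class into target. Returns false if access is refused.
    static boolean copyFields(Object source, Object target) {
        for (Field f : source.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers())) {
                continue; // static state belongs to the class, not the object
            }
            try {
                f.setAccessible(true); // may throw SecurityException
                f.set(target, f.get(source));
            } catch (SecurityException e) {
                return false; // a managed environment refused private access
            } catch (IllegalAccessException e) {
                throw new RuntimeException(e);
            }
        }
        return true;
    }

    // Made-up class with private state, used only to exercise the helper.
    static class Point {
        private int x;
        private String label;

        Point(int x, String label) {
            this.x = x;
            this.label = label;
        }

        int getX() { return x; }

        String getLabel() { return label; }
    }

    public static void main(String[] args) {
        Point original = new Point(3, "p");
        Point copy = new Point(0, null);
        boolean copied = copyFields(original, copy);
        System.out.println(copied + " " + copy.getX() + " " + copy.getLabel());
    }
}
```

Note that this needs a target instance to exist already, which is one more thing the serialisation route gets for free.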
 
Tim Tyler

: The code is roughly:

: public class DeepCloneFactory
: {
:     public static Object deepClone(Object obj) throws Exception
:     {
:         ByteArrayOutputStream baos=new ByteArrayOutputStream();
:         ObjectOutputStream oos=new ObjectOutputStream(baos);
:         oos.writeObject(obj);
:         oos.flush();
:         oos.close();
:         ByteArrayInputStream bais=new ByteArrayInputStream(baos.toByteArray());
:         ObjectInputStream ois=new ObjectInputStream(bais);
:         obj=ois.readObject();
:         ois.close();
:         return obj;
:     }
: }

I had a stab at fleshing this out a bit:

/**
 * Deep clone an object
 *
 * To use this - use code such as:
 *
 * public Object clone() {
 *     return ObjectUtilities.clone(this);
 * }
 *
 * @param object - object to be copied
 * @return - a copy of the object
 */
public static Object clone(Object object) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();

    try {
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(object);
        oos.flush();
        oos.close();
        ByteArrayInputStream bais = new ByteArrayInputStream(baos.toByteArray());
        ObjectInputStream ois = new ObjectInputStream(bais);
        object = ois.readObject();
        ois.close();
    } catch (IOException e) {
        throw new RuntimeException(e);
    } catch (ClassNotFoundException e) {
        throw new RuntimeException(e);
    }

    return object;
}

: Of course all of your objects must implement Serializable in order for
: this to work consistently.

Yes - failure to implement Serializable should really get a
runtime exception explaining exactly what has gone wrong.
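
As a sketch of what such an explanation could look like: ObjectOutputStream.writeObject throws java.io.NotSerializableException (a subclass of IOException) whose message is the name of the offending class, so it can be caught separately and rethrown with a clearer message. The class name SerializeOrExplain, the serialize helper and the wording of the message are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.NotSerializableException;
import java.io.ObjectOutputStream;

public class SerializeOrExplain {

    // Hypothetical helper: serializes an object graph, turning the
    // "not serializable" case into an unchecked exception that says why.
    static byte[] serialize(Object object) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(object);
        } catch (NotSerializableException e) {
            // e.getMessage() carries the name of the class that failed
            throw new IllegalArgumentException(
                "Cannot deep-copy: class " + e.getMessage()
                + " does not implement java.io.Serializable", e);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return baos.toByteArray();
    }

    // Made-up class that deliberately does not implement Serializable.
    static class Plain {
        int value = 1;
    }

    public static void main(String[] args) {
        try {
            serialize(new Plain());
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```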

I believe this code shows some promise ;-)

It also suggests that my original plan - of implementing hashCode using
a similar technique - is going to be extremely simple. The entire
object is in a byte array at one point - so calculating a CRC should
not be difficult.

Lastly, equals() is also apparently an attractive target using this
approach ;-)

Serialisation is a bit of a black box to me at the moment. The JVM
might date-stamp the objects (or something) for all I currently know -
in which case the plan of implementing hashCode() and equals() won't
be so simple. Even if the JVM plays ball - and continues to do so in
future versions - there's some possibility that custom serialisation
code will not.

Not too big a concern though - methinks ;-)
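
One cheap way to poke at the black box is to serialise the same object twice through fresh streams and compare the bytes. The sketch below does that for a made-up Sample class using default serialisation; the names RepeatableBytesCheck, serialize and sameBytesTwice are invented for this example. A passing check only shows what one JVM does for one class, not what the specification guarantees.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class RepeatableBytesCheck {

    static byte[] serialize(Object object) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(object);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return baos.toByteArray();
    }

    // True if two independent serialisations of the same object match.
    static boolean sameBytesTwice(Object object) {
        return Arrays.equals(serialize(object), serialize(object));
    }

    // Made-up class relying purely on default serialisation.
    static class Sample implements Serializable {
        private static final long serialVersionUID = 1L;
        int id;
        String name;

        Sample(int id, String name) {
            this.id = id;
            this.name = name;
        }
    }

    public static void main(String[] args) {
        System.out.println(sameBytesTwice(new Sample(7, "seven")));
    }
}
```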

It looks like I may soon have seen the back of most of
those pesky clone(), equals() and hashCode() methods -
at least until performance optimisation time...
 
Tim Tyler

: I believe this code shows some promise ;-)

: It also suggests that my original plan - of implementing hashCode using
: a similar technique - is going to be extremely simple. The entire
: object is in a byte array at one point - so calculating a CRC should
: not be difficult.

: Lastly, equals() is also apparently an attractive target using this
: approach ;-)

Well, here we go:

This code is totally untested.

If anyone wants to help me poke holes in it they would be more than welcome.

/**
 * Deep test for equality
 * Note: all objects must be serializable
 *
 * To use this - use code such as:
 *
 * public boolean equals(Object object) {
 *     return ObjectUtilities.equals(this, object);
 * }
 *
 * @param object1 - the first object
 * @param object2 - the second object
 * @return - true iff the two objects have equal contents
 */
public static boolean equals(Object object1, Object object2) {
    ByteArrayOutputStream baos1 = new ByteArrayOutputStream();
    ByteArrayOutputStream baos2 = new ByteArrayOutputStream();

    try {
        ObjectOutputStream oos1 = new ObjectOutputStream(baos1);
        oos1.writeObject(object1);
        oos1.flush();
        oos1.close();
        byte[] bytes1 = baos1.toByteArray();

        ObjectOutputStream oos2 = new ObjectOutputStream(baos2);
        oos2.writeObject(object2);
        oos2.flush();
        oos2.close();
        // read the second stream here, not the first one again
        byte[] bytes2 = baos2.toByteArray();

        return equals(bytes1, bytes2);
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

/**
 * Deep hash code
 * Note: all objects must be serializable
 *
 * To use this - use code such as:
 *
 * public int hashCode() {
 *     return ObjectUtilities.hashCode(this);
 * }
 *
 * @param object - the object to be hashed
 * @return - the hash code
 */
public static int hashCode(Object object) {
    return hashCode(object, new CRC32());
}

/**
 * Deep hash code
 * Note: all objects must be serializable
 *
 * To use this - use code such as:
 *
 * public int hashCode() {
 *     return ObjectUtilities.hashCode(this, new CRC32());
 * }
 *
 * @param object - the object to be hashed
 * @param checksum - java.util.zip.Checksum instance to be used to compute the hash
 * @return - the hash code
 */
public static int hashCode(Object object, Checksum checksum) {
    ByteArrayOutputStream baos = new ByteArrayOutputStream();

    try {
        ObjectOutputStream oos = new ObjectOutputStream(baos);
        oos.writeObject(object);
        oos.flush();
        oos.close();
        byte[] bytes = baos.toByteArray();
        checksum.update(bytes, 0, bytes.length);

        return (int) checksum.getValue();
    } catch (IOException e) {
        throw new RuntimeException(e);
    }
}

private static boolean equals(byte[] bytes1, byte[] bytes2) {
    int length = bytes1.length;
    if (length != bytes2.length) {
        return false;
    }

    // compare element by element, from the last byte down to the first
    for (int i = length; --i >= 0;) {
        if (bytes1[i] != bytes2[i]) {
            return false;
        }
    }

    return true;
}
 
Tim Tyler

: : I believe this code shows some promise ;-)

: : It also suggests that my original plan - of implementing hashCode using
: : a similar technique - is going to be extremely simple. The entire
: : object is in a byte array at one point - so calculating a CRC should
: : not be difficult.

: : Lastly, equals() is also apparently an attractive target using this
: : approach ;-)

: Well, here we go:

: This code is totally untested.

Here's a refactored, cleaned up, debugged - and more tested version
(suggestions are - of course - still welcome).

It seems to work OK. I'm reasonably happy with what it does
when encountering a non-serializable class.

Deep copying, deep equality testing and deep hash code computation
are rather fundamental Java programming problems.

They ought really to be solved in the JDK.

There's copy/deepCopy in Smalltalk, Clone in C#, copy.deepcopy in
Python and deep_clone in Eiffel - but there's nothing built-in
in Java.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.lang.reflect.Field;
import java.lang.reflect.InvocationTargetException;
import java.lang.reflect.Method;
import java.util.zip.CRC32;
import java.util.zip.Checksum;

public class ObjectUtilities implements Cloneable, Serializable {
    /**
     * Deep clone an object
     *
     * To use this - use code such as:
     *
     * public Object clone() {
     *     return ObjectUtilities.clone(this);
     * }
     *
     * @param object - object to be copied
     * @return - a copy of the object
     */
    public static Object clone(Object object) {
        byte[] ba = serializeToByteArray(object);

        try {
            ByteArrayInputStream bais = new ByteArrayInputStream(ba);
            ObjectInputStream ois = new ObjectInputStream(bais);
            object = ois.readObject();
            ois.close();
        } catch (IOException e) {
            throw new RuntimeException(e);
        } catch (ClassNotFoundException e) {
            throw new RuntimeException(e);
        }

        return object;
    }

    /**
     * Deep test for equality
     * Note: all objects must be serializable
     *
     * To use this - use code such as:
     *
     * public boolean equals(Object object) {
     *     return ObjectUtilities.equals(this, object);
     * }
     *
     * @param object1 - the first object
     * @param object2 - the second object
     * @return - true iff the two objects have equal contents
     */
    public static boolean equals(Object object1, Object object2) {
        return equals(serializeToByteArray(object1), serializeToByteArray(object2));
    }

    /**
     * Deep hash code
     * Note: all objects must be serializable
     *
     * To use this - use code such as:
     *
     * public int hashCode() {
     *     return ObjectUtilities.hashCode(this);
     * }
     *
     * @param object - the object to be hashed
     * @return - the hash code
     */
    public static int hashCode(Object object) {
        return hashCode(object, new CRC32());
    }

    /**
     * Deep hash code
     * Note: all objects must be serializable
     *
     * To use this - use code such as:
     *
     * public int hashCode() {
     *     return ObjectUtilities.hashCode(this, new CRC32());
     * }
     *
     * @param object - the object to be hashed
     * @param checksum - java.util.zip.Checksum instance to be used to compute the hash
     * @return - the hash code
     */
    public static int hashCode(Object object, Checksum checksum) {
        byte[] bytes = serializeToByteArray(object);
        checksum.update(bytes, 0, bytes.length);

        return (int) checksum.getValue();
    }

    private static byte[] serializeToByteArray(Object object) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try {
            ObjectOutputStream oos = new ObjectOutputStream(baos);
            oos.writeObject(object);
            oos.flush();
            oos.close();
        } catch (IOException e) {
            throw new RuntimeException(e);
        }

        return baos.toByteArray();
    }

    private static boolean equals(byte[] bytes1, byte[] bytes2) {
        int length = bytes1.length;
        if (length != bytes2.length) {
            return false;
        }

        // compare element by element, from the last byte down to the first
        for (int i = length; --i >= 0;) {
            if (bytes1[i] != bytes2[i]) {
                return false;
            }
        }

        return true;
    }
}
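
A usage sketch, under stated assumptions: the block below is self-contained, so it carries condensed stand-ins (toBytes, deepClone, deepHash) for the utility methods rather than calling ObjectUtilities itself, and the Node class is made up for the example. It shows a class delegating clone(), equals() and hashCode() to the serialisation-based helpers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;
import java.util.zip.CRC32;

public class DeepOpsDemo {

    // Condensed stand-in for the serialisation helper.
    static byte[] toBytes(Object object) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(object);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return baos.toByteArray();
    }

    // Deep copy: serialise, then deserialise into a fresh object graph.
    static Object deepClone(Object object) {
        try (ObjectInputStream ois = new ObjectInputStream(
                new ByteArrayInputStream(toBytes(object)))) {
            return ois.readObject();
        } catch (IOException | ClassNotFoundException e) {
            throw new RuntimeException(e);
        }
    }

    // Deep hash: CRC32 over the serialised bytes, truncated to int.
    static int deepHash(Object object) {
        byte[] bytes = toBytes(object);
        CRC32 crc = new CRC32();
        crc.update(bytes, 0, bytes.length);
        return (int) crc.getValue();
    }

    // Made-up linked node that delegates all three operations.
    static class Node implements Serializable, Cloneable {
        private static final long serialVersionUID = 1L;
        int value;
        Node next;

        Node(int value, Node next) {
            this.value = value;
            this.next = next;
        }

        @Override
        public Object clone() {
            return deepClone(this);
        }

        @Override
        public boolean equals(Object other) {
            // throws RuntimeException if other is not serializable
            return Arrays.equals(toBytes(this), toBytes(other));
        }

        @Override
        public int hashCode() {
            return deepHash(this);
        }
    }

    public static void main(String[] args) {
        Node list = new Node(1, new Node(2, null));
        Node copy = (Node) list.clone();
        System.out.println(copy.equals(list) + " " + (copy.next != list.next));
    }
}
```

Overriding equals() and hashCode() this way does not recurse, because the serialisation streams track objects by identity rather than by calling those methods.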
 
Filip Larsen

Tim Tyler wrote:

> Here's a refactored, cleaned up, debugged - and more tested version
> (suggestions are - of course - still welcome).
>
> It seems to work OK. I'm reasonably happy with what it does
> when encountering a non-serializable class.

The correctness of your implementation of deep equals and hash is based on
the assumption that the serialization mechanism serializes the same object
tree to the same byte stream. I don't recall having read such a requirement
in the Object Serialization specification, and I would not be surprised if
the serialization mechanism did indeed serialize the same object tree to a
different byte stream in special cases. And even if the serialization
mechanism is guaranteed to serialize the same object tree into the same
byte stream, there is nothing to stop a serializable class from including
all sorts of fields in its serialization state that have nothing to do with
determining equality.

I think the essence of the problem with your quest for deep operators in
Java lies in the fact that the language defines these operators to work on
the semantic level, not on the syntactic level. Or to put it another
way, from the representation of an arbitrary class alone (for example a
serialized stream) you cannot safely assume that two unequal representations
correspond to two semantically unequal object trees. The reason that
clone-by-serialization can be made to work (i.e. actually clone semantic
state) is that the objects are both serialized *and* deserialized.
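
One concrete case of this, as a sketch: the serialization stream records shared references, so two arrays that java.util.Arrays.equals treats as equal can still produce different bytes when one holds the same String instance twice and the other holds two distinct but equal instances. The class name SharedReferenceBytes and its serialize helper are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.util.Arrays;

public class SharedReferenceBytes {

    static byte[] serialize(Object object) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(object);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return baos.toByteArray();
    }

    public static void main(String[] args) {
        String s = new String("abc");
        Object[] shared = { s, s };                     // same instance twice
        Object[] distinct = { s, new String("abc") };   // equal, but two instances

        // Element-wise equals() says the arrays match...
        System.out.println(Arrays.equals(shared, distinct));
        // ...but the second element is written as a back-reference in one
        // stream and as a full string in the other, so the bytes differ.
        System.out.println(Arrays.equals(serialize(shared), serialize(distinct)));
    }
}
```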
 
Tim Tyler

: Tim Tyler wrote

:> Here's a refactored, cleaned up, debugged - and more tested version
:> (suggestions are - of course - still welcome).
:>
:> It seems to work OK. I'm reasonably happy with what it does
:> when encountering a non-serializable class.

: The correctness of your implementation of deep equals and hash is based on
: the assumption that the serialization mechanism serializes the same object
: tree to the same byte stream.

Indeed.

: I don't recall having read such a requirement in the Object
: Serialization specification, and I would not be surprised if the
: serialization mechanism did indeed serialize the same object tree to a
: different byte stream in special cases.

It's possible. You could certainly create cases where this happened
by serialising the object yourself.

The built-in serialisation mechanism doesn't seem to do this normally,
though - i.e. there is no time-stamp, object-ID - or anything like that
included.

: And even if the serialization mechanism is guaranteed to serialize the
: same object tree into the same byte stream, there is nothing to stop
: a serializable class from including all sorts of fields in its serialization
: state that have nothing to do with determining equality.

My aim here was to provide a "deep" equality check.
I assume all fields are needed.

Serialisation provides a mechanism to ignore certain fields - by
labelling them as "transient". I regard that as a bonus.

Is it possible that serialisation and equality checking require
different fields to be labelled as "transient"? I suppose so -
but I reckon there will be many more cases where the same fields
need to be ignored in both cases. This approach is attractive there.

The serialisation approach also ignores static data - if there is
any relevant static data, the default serialisation mechanism
won't take it into account.
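
A small sketch of both effects, with made-up names (TransientAndStaticDemo, Account, serialize): two objects that differ only in a transient field serialise to identical bytes, and a static counter plays no part in the stream at all.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.Arrays;

public class TransientAndStaticDemo {

    static byte[] serialize(Object object) {
        ByteArrayOutputStream baos = new ByteArrayOutputStream();
        try (ObjectOutputStream oos = new ObjectOutputStream(baos)) {
            oos.writeObject(object);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
        return baos.toByteArray();
    }

    // Made-up class: only 'owner' ends up in the serialised form.
    static class Account implements Serializable {
        private static final long serialVersionUID = 1L;
        static int instancesCreated = 0;   // static: not written by default serialisation
        String owner;                      // ordinary field: written
        transient long lastAccessMillis;   // transient: skipped

        Account(String owner, long lastAccessMillis) {
            this.owner = owner;
            this.lastAccessMillis = lastAccessMillis;
            instancesCreated++;
        }
    }

    public static void main(String[] args) {
        Account a = new Account("tim", 1L);
        Account b = new Account("tim", 2L);     // differs only in the transient field
        Account c = new Account("filip", 1L);   // differs in a serialised field
        System.out.println(Arrays.equals(serialize(a), serialize(b)));
        System.out.println(Arrays.equals(serialize(a), serialize(c)));
    }
}
```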

: I think the essence of the problem with your quest for deep operators in
: Java lies in the fact that the language defines these operators to work on
: the semantic level, not on the syntactic level. Or to put it another
: way, from the representation of an arbitrary class alone (for example a
: serialized stream) you cannot safely assume that two unequal representations
: correspond to two semantically unequal object trees. [...]

I agree that a deep equality check is not going to be perfect
for every case - even once the mechanism that exists for ignoring
fields is taken into account.

However my aim is modest - take 90% of the effort out of implementing
deep "equals", "hashCode" and "clone" operators during development -
and I think there the approach looks like it is going to be successful.

In those cases where a deep copy of instance data minus certain
specified fields is not suitable, you can always fall back on
rolling your own equals() method.
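
For such a case, a hand-rolled pair might look like the sketch below. The Version class is made up for the example: its buildLabel field is part of the serialised state but is deliberately left out of equality, which is exactly the situation the byte-comparison approach cannot express. java.util.Objects.hash needs Java 7 or later.

```java
import java.io.Serializable;
import java.util.Objects;

public class HandRolledEquals {

    // Made-up class whose equality uses fewer fields than its serialised form.
    static class Version implements Serializable {
        private static final long serialVersionUID = 1L;
        final int major;
        final int minor;
        final String buildLabel; // serialised, but not part of equality

        Version(int major, int minor, String buildLabel) {
            this.major = major;
            this.minor = minor;
            this.buildLabel = buildLabel;
        }

        @Override
        public boolean equals(Object other) {
            if (this == other) {
                return true;
            }
            if (!(other instanceof Version)) {
                return false;
            }
            Version v = (Version) other;
            return major == v.major && minor == v.minor;
        }

        @Override
        public int hashCode() {
            // must use the same fields as equals()
            return Objects.hash(major, minor);
        }
    }

    public static void main(String[] args) {
        Version a = new Version(1, 4, "nightly");
        Version b = new Version(1, 4, "release");
        System.out.println(a.equals(b) + " " + (a.hashCode() == b.hashCode()));
    }
}
```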
 
 
Java generic deep copy utility

I created a generic deep copy utility and placed it at genericdeepcopy.com. It works in a fair number of situations I believe.
 
