Chris said:
John C. Bollinger wrote:
I think that this condition is too strong. If we weaken it to allow an
implementation of copy as:
public <whatever> copy() { return this; }
then the system will work more gracefully.
The 'copy' operation is not a precisely defined operation -- there is no
God-given (or Gosling-given) single idea of what a 'copy' is. So the method
called copy() can only ever represent a class-designer's /best guess/ at what is
/likely/ to be most useful to users of that class. In particular, the amount
of state that is replicated (as opposed to shared) between the original and the
copy is not something that can be decided by some global policy. The degree of
'deepness' needed for a copy is determined (even ignoring context) by private
details of the object's implementation, and by the mapping between the object's
state and its semantics.
The operational details of the 'copy' operation are not precisely
defined, but we can and should define the purpose of copying and/or
the expected characteristics of the result (as they relate to our
efforts). I have added this to the web page: "The purpose of
duplication as chosen for this project is to produce from one object a
second object such that the two may be used independently without
concern for operations on one unexpectedly affecting the behavior of the
other, including when the two are used in any combination among multiple
threads without external synchronization." This statement is subject to
debate; it is based in part on one of Chris Smith's comments elsewhere
in this thread regarding whether or not a copy needs to be distinct from
the original. Are there other conceivable purposes for object copying
that are not covered by that statement (and that we should consider
supporting)?
The suggested purpose for copying is consistent with your observations
about the necessary depth of copying (or lack thereof). It also
implies, as do you in passing, that the necessary depth may be context
dependent.
For instance a Rectangle class that internally
maintains its definition/state as two fields of class Point should obviously
(for most normal purposes) implement copy as a fully deep operation. OTOH, a
ColoredRectangle that implemented its state as 4 floats and a Color would
probably implement copy as shallow.
Note that how much state is replicated depends (in part) on details that are
private, and hence the caller (in many circumstances) would be wrong to attempt
to dictate how much of the state is replicated vs. shared.
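
To make the Rectangle case concrete, here is a minimal sketch. The class shapes, field names and the moveBy() method are assumptions made up for illustration; nothing here is code from this thread. The point it shows is that the need for a deep copy follows from a private detail (the state lives in mutable Points), which the caller cannot see.

```java
// Hypothetical classes for illustration; names and fields are assumptions.
final class Point {
    double x, y;                        // deliberately mutable

    Point(double x, double y) { this.x = x; this.y = y; }

    Point copy() { return new Point(x, y); }
}

final class Rectangle {
    // Private detail: the state is held in two mutable Points.
    private final Point origin;
    private final Point corner;

    Rectangle(Point origin, Point corner) {
        // Defensive copies, so the caller's Points are never shared.
        this.origin = origin.copy();
        this.corner = corner.copy();
    }

    // Deep copy: the constructor duplicates both Points. Sharing them
    // would let a move of one rectangle silently move the other.
    Rectangle copy() { return new Rectangle(origin, corner); }

    void moveBy(double dx, double dy) {
        origin.x += dx; origin.y += dy;
        corner.x += dx; corner.y += dy;
    }

    double left() { return origin.x; }
}
```

Moving the copy leaves the original where it was, which is the independence the caller presumably wants without needing to know how the state is stored.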
(Incidentally, that's why I object to, and have avoided using, the method name
clone() -- which I think strongly implies a particular strategy for copying,
and one that is not necessarily appropriate.)
I think that the limiting case of shallowness is a "copy" that replicates no
state, and shares all state. I.e.
return this;
That turns out to be a natural implementation of "copy" for genuine Singletons
(if there are any ;-). And -- much more importantly -- for any object which is
intended to act as the "sole representative" of some other entity. This
interpretation of copy() is that it should return an object that is as distinct
/as semantically possible/ from the original (up to some context and
implementation-dependent limit on deepness). The limiting case of "distinct as
semantically possible" is /no/ difference; if the only semantically valid
"equivalent" of some object is that object itself, then that's what the "copy"
should provide.
I think I am persuaded.
Returning to the ColoredRectangle. I said that it would probably implement
copy as a shallow copy (a pure clone()), but really that's a design error.
Rectangle /might/ know how Colors work, and whether the best way to get a
duplicate 'handle' on a colour is simply to copy the reference. It's not
entirely implausible that it /would/ know that, but I don't think it should
/have/ to know it. If Colors are expensive (not likely I admit) then it would
be an entirely plausible design that Color provided a number of pre-defined
instances that (purely for efficiency reasons) should remain unique --
Color.RED, Color.BLUE, -- but that less commonly used instances would be
created on-the-fly and discarded at need. But that's not possible if classes
like Rectangle are going to second-guess what it means to copy a colour. So
the implementation of ColoredRectangle.copy() should also copy() the Color.
The implementation of Color.copy() would just "return this;" (assuming that the
objects were immutable, and if the class designer had thought it worthwhile to
implement an optimised version of copy()).
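
A minimal sketch of that arrangement follows. The field layout, constructor and accessors are assumptions for illustration only (and this Color is a stand-in, not java.awt.Color). ColoredRectangle never decides whether a Color may be shared; it just asks the Color for a copy, and the immutable Color answers with itself.

```java
// Hypothetical classes for illustration only.
final class Color {
    // Pre-defined instances that are meant to stay unique.
    static final Color RED = new Color(1f, 0f, 0f);
    static final Color BLUE = new Color(0f, 0f, 1f);

    private final float r, g, b;

    Color(float r, float g, float b) { this.r = r; this.g = g; this.b = b; }

    // Immutable, so sharing cannot be told apart from duplicating.
    Color copy() { return this; }
}

final class ColoredRectangle {
    private float x, y, width, height;
    private final Color color;

    ColoredRectangle(float x, float y, float width, float height, Color color) {
        this.x = x; this.y = y; this.width = width; this.height = height;
        this.color = color;
    }

    ColoredRectangle copy() {
        // Delegate the decision to Color instead of second-guessing it.
        return new ColoredRectangle(x, y, width, height, color.copy());
    }

    Color color() { return color; }
    float width() { return width; }
    void setWidth(float width) { this.width = width; }
}
```

If Color later became mutable, only Color.copy() would need to change; ColoredRectangle.copy() would stay as written.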
I agree that it is bad for classes to be forced to know or assume
implementation details of other classes. On the other hand, there is a
major practicality issue with enabling (and relying on) copying
arbitrary objects: _every_ existing class would have to be examined to
ensure that it provided an appropriate implementation of copy(). Many
could be safely copied via an inherited mechanism that recursively
copied all members (taking into account multiple references to the same
objects) but some could not, and those would all suddenly be exposed to
breakage.
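
The parenthetical about multiple references is worth a sketch, because a naive recursive copy would duplicate a shared member twice and loop forever on a cycle. The Node class below is hypothetical, and the copying is hand-written rather than inherited; it only illustrates one common technique (an identity map from originals to copies), not a proposal for how the inherited mechanism would be implemented.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical graph node, for illustration only.
final class Node {
    final String name;
    final List<Node> links = new ArrayList<>();

    Node(String name) { this.name = name; }

    Node copy() { return copy(new IdentityHashMap<>()); }

    // 'seen' maps each original (by identity, not equals()) to its copy.
    private Node copy(Map<Node, Node> seen) {
        Node done = seen.get(this);
        if (done != null) {
            return done;               // already copied: reuse that copy
        }
        Node dup = new Node(name);
        seen.put(this, dup);           // register before recursing so cycles terminate
        for (Node n : links) {
            dup.links.add(n.copy(seen));
        }
        return dup;
    }
}
```

With this, two references to the same original become two references to the same copy, so the shape of the object graph is preserved.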
If we were designing this feature for a new language I would be
satisfied to make it usable on any object, by any object. For
retrofitting Java with this facility, however, I think we need to make
classes opt in. That leaves the question of what to do in the fairly
likely scenario of running into an object that cannot be copied inside
another object that you're trying to copy. I don't see a single correct
answer, so it may be that we need to provide multiple options.
Anyway, with all that out of the way, how about the following as a
starting point?
public class Object
{
    ....
    // the JVM-level clone operation
    private native Object __clone();
    protected final Object clone()
        throws CloneNotSupportedException
    {
        if (this instanceof NotClonableMarkerInterface)
            throw new CloneNotSupportedException();
        return this.__clone();
    }
With the hope that this may become more than just an academic exercise,
I think we need to keep compatibility in mind as a goal. To that end,
we cannot make Object.clone() final, and we should not change its
behavior with respect to the Cloneable interface. I think the best
alternative may simply be to deprecate Object.clone() and build a
parallel facility.
    public Object copy()
        throws CloneNotSupportedException
    {
        Object copy = this.clone();
        copy.postClone();
        return copy;
    }
    protected void postClone()
    {
        // default is no action
    }
    ...
}
Some notes:
1) I'm not completely convinced that it's ever appropriate for an object to be
marked with NotClonableMarkerInterface. There may /be/ a good reason, though,
so I've left the test-and-throw in for now. Unless a good use for it can be
found, though, (perhaps something to do with security) then I think it would be
much better to get rid of it and CloneNotSupportedException.
I dislike marker interfaces, so I would be quite satisfied to both
deprecate Cloneable and avoid introducing a new marker interface. I
don't know whether we can do the former, but I am confident that we can do
the latter.
2) I'm not completely convinced that the clone() method should be available to
subclasses at all. It might be better to make clone() private, and change the
copy implementation to something like:
    public final Object copy()
        throws CloneNotSupportedException
    {
        if (this instanceof SingularInstanceMarkerInterface)
            return this;
        Object copy = this.clone();
        copy.postClone();
        return copy;
    }
but that /reeks/ of over-engineered brittleness to me...
I agree, I'm not satisfied with that. And not only because it
introduces _another_ marker interface.
3) Subclasses have the choice of overriding copy(), which they would do if they
wanted to
return this;
or if -- for whatever reason -- they knew a better way to implement copying
than going via the system-level clone() operation.
Right. I think that's a good feature.
4) ... or they could override postClone() to do any massaging of the results of
clone() that were necessary to preserve sanity. Nearly all implementations
would take the form:
    protected void postClone()
    {
        m_field1 = m_field1.copy();
        m_field6 = m_field6.copy();
        //...
    }
That seems a little sugary to me. If copy() can be overridden then why
do we need postClone()? What is gained by providing two different
avenues for fixing up the state of the copy?
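
For comparison, the two avenues can be put side by side in ordinary user code. Since java.lang.Object cannot actually be modified, the sketch below uses a hypothetical base class, CopyBase, as a stand-in for the proposed changes; Box and Unit are likewise invented for illustration. Box fixes up its state through postClone(), while Unit overrides copy() outright.

```java
// Hypothetical base class standing in for the proposed changes to Object.
abstract class CopyBase implements Cloneable {
    public CopyBase copy() {
        try {
            // Object.clone() makes a field-by-field (shallow) duplicate.
            CopyBase dup = (CopyBase) super.clone();
            dup.postClone();           // let the subclass repair shared state
            return dup;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);   // unreachable: we implement Cloneable
        }
    }

    protected void postClone() {
        // default is no action
    }
}

// Avenue 1: keep the inherited copy() and fix up state in postClone().
final class Box extends CopyBase {
    private int[] values = {1, 2, 3};  // mutable state that must not be shared

    @Override
    protected void postClone() { values = values.clone(); }

    void set(int i, int v) { values[i] = v; }
    int get(int i) { return values[i]; }
}

// Avenue 2: override copy() itself, here to return this.
final class Unit extends CopyBase {
    static final Unit INSTANCE = new Unit();

    private Unit() { }

    @Override
    public CopyBase copy() { return this; }
}
```

One visible difference in this sketch: the postClone() route never has to name a constructor or deal with CloneNotSupportedException, whereas an overriding copy() takes full responsibility for producing the result.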
5) In the above I've assumed that the copy() operation should be universally
available. That way seems best to me (especially in a language which makes the
equally problematical equals() method public). The main reasons for
restricting it seem to be that (a) the semantics are /not/ obvious so users
should be forced to think before using it, (b) some objects should not be
copyable. The point (a) is valid, but I don't think that restricting copy()
actually helps. And, as I've said, with the extended interpretation of copy()
that I'm urging, I think that most (perhaps all) of the examples of
non-copyable objects evaporate.
I'm leaning the other way, as I described above, but I'm willing to
suspend judgment until we work out some of the other design issues.
6) But, that said, I don't really see much harm in including an /unchecked/
CopyNotSupportedException that subclasses could override copy() to throw.
If we think that copy() should not be supported by default, then I'd just
remove copy() (and postClone()) from Object, add an interface
    interface Copyable
    {
        Object copy();
    }
that subclasses could implement at will.
I think that in the end we will probably need to offer such an
exception. I am satisfied with it being unchecked, especially if we
make classes declare themselves copyable instead of being copyable by
default.
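
A minimal sketch of the opt-in variant, under stated assumptions: the Copyable interface is as given above, CopyNotSupportedException is assumed to extend RuntimeException (which is what makes it unchecked), and Counter and OpenConnection are hypothetical classes invented for illustration.

```java
interface Copyable {
    Object copy();
}

// Unchecked, so callers of copy() are not forced to catch it.
class CopyNotSupportedException extends RuntimeException {
    CopyNotSupportedException(String message) { super(message); }
}

// Hypothetical class that opts in to copying.
final class Counter implements Copyable {
    private int count;

    void increment() { count++; }
    int count() { return count; }

    @Override
    public Counter copy() {            // covariant return type
        Counter dup = new Counter();
        dup.count = count;
        return dup;
    }
}

// Hypothetical class that implements the interface but refuses at run time.
final class OpenConnection implements Copyable {
    @Override
    public Object copy() {
        throw new CopyNotSupportedException("an open connection cannot be duplicated");
    }
}
```

Whether a class like OpenConnection should implement Copyable at all, rather than simply not opting in, is one of the design questions still open here.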