How to make a deep copy of an object (Searching for Idiom)

Brian Schröder

Hello Group,

I sometimes wish to make a deep copy of an object. I know I could use Marshal,
but that's slow, so I want to write a routine #deep_copy. (Or should I override
#dup?)

Now the question is: how do you write those? I could use this:

class A
  ...
  def deep_copy
    result = self.dup
    result.field1 = self.field1.deep_copy
    ...
    result   # return the copy, not the value of the last assignment
  end
end

or

class A
  def initialize(field1 = 'default value', ...)
    @field1 = field1
    ...
  end

  def deep_copy
    # new is a class method, so it has to be called on self.class
    self.class.new(@field1.deep_copy)
  end
end

Which allows me to use instance variables.

Is there something more elegant? What do you prefer? Am I on the right track?
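
A minimal runnable sketch of the two idioms above. The classes Address and Person are hypothetical and not from the post. Leaf values such as String have no deep_copy of their own, so the sketch falls back to dup for them.

```ruby
# Hypothetical classes, for illustration only.
class Address
  attr_accessor :street

  def initialize(street = 'unknown')
    @street = street
  end

  # Idiom 1: shallow-copy with dup, then replace each mutable field.
  def deep_copy
    result = dup
    result.street = street.dup
    result
  end
end

class Person
  attr_reader :name, :address

  def initialize(name = 'default value', address = Address.new)
    @name = name
    @address = address
  end

  # Idiom 2: build a fresh instance from deep copies of the fields.
  def deep_copy
    self.class.new(@name.dup, @address.deep_copy)
  end
end

original = Person.new('Brian', Address.new('Main St'))
copy = original.deep_copy
copy.address.street << ' 42'      # only the copy's street changes
puts original.address.street
```

Idiom 1 keeps working when the constructor signature changes, while idiom 2 makes every copied field explicit. Which one reads better probably depends on the class.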

Best regards,

Brian
 
Kent Sibilev

If your object doesn't have singleton methods you can use this
construct:

Marshal.load(Marshal.dump(a))
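
A small sketch of that round trip and of the singleton-method restriction. The Node class and the marshal_copy helper are hypothetical names used only for this example.

```ruby
# Hypothetical class, for illustration only.
class Node
  attr_accessor :label, :children

  def initialize(label)
    @label = label
    @children = []
  end
end

# Deep copy by serialising and deserialising the whole object graph.
def marshal_copy(obj)
  Marshal.load(Marshal.dump(obj))
end

root = Node.new('root')
root.children << Node.new('leaf')
copy = marshal_copy(root)
copy.children.first.label << '!'   # does not touch root's leaf

# An object that carries a singleton method cannot be dumped.
special = Node.new('special')
def special.shout
  label.upcase
end

begin
  marshal_copy(special)
rescue TypeError => e
  puts "cannot copy: #{e.message}"
end
```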

Cheers,
Kent.
 
Brian Schröder

Kent Sibilev said:
If your object doesn't have singleton methods you can use this
construct:

Marshal.load(Marshal.dump(a))

Thanks, but that was what I wanted to avoid.

Regards,

Brian
 
Tim Hunter

Brian said:
Hello Group,

I sometimes wish to make a deep copy of an object. I know I could use
Marshal, but that's slow, so I want to write a routine #deep_copy. (Or
should I override #dup?)

Brian, the classes in my new RVG library must have deep_copy methods. I
asked this question on c.l.r a few weeks ago but didn't get any
suggestions. I did some searching around and found an old thread on
ruby-talk which helped, but with ruby-talk down I can't find it right now.

The general form looks like this:

def deep_copy
  copy = self.class.new
  ivs = instance_variables
  ivs.each do |iv|
    itv = instance_variable_get(iv)
    otv = case
          when itv.nil?
            nil
          when itv.respond_to?(:deep_copy)
            itv.deep_copy
          when itv.respond_to?(:dup)
            itv.dup
          else
            itv
          end
    copy.instance_variable_set(iv, otv)
  end
  return copy
end

The idea is that the instance variables can refer 1) to other objects that
have a deep_copy method, 2) to "normal" Ruby objects that can be duped, and
3) to immediate objects like Fixnum which don't need to be duped, just
assigned. I also have a special case for nil since it responds to :dup but
can't actually be duped.

If #initialize takes arguments, then you'll need a slightly different
version.
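
One possible shape for that different version, as a sketch only: allocate the copy with Class#allocate so that #initialize is never called. The mixin name DeepCopyable, its copy_value helper and the Circle class are hypothetical.

```ruby
# Hypothetical mixin; the name DeepCopyable is not from the thread.
module DeepCopyable
  def deep_copy
    copy = self.class.allocate   # creates the object without running #initialize
    instance_variables.each do |iv|
      value = instance_variable_get(iv)
      copy.instance_variable_set(iv, DeepCopyable.copy_value(value))
    end
    copy
  end

  # Prefer deep_copy, fall back to dup, and keep the value itself
  # when dup is refused (older Rubies raise TypeError for nil, Symbol, etc.).
  def self.copy_value(value)
    return value.deep_copy if value.respond_to?(:deep_copy)

    begin
      value.dup
    rescue TypeError
      value
    end
  end
end

class Circle
  include DeepCopyable
  attr_reader :center, :radius

  def initialize(center, radius)   # required arguments, so Circle.new alone fails
    @center = center
    @radius = radius
  end
end

a = Circle.new([0, 0], 2)
b = a.deep_copy
b.center << 9                      # a.center stays [0, 0]
```

Skipping #initialize also skips any side effects it has, which may or may not be what a given class wants.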

For testing purposes I also implemented a deep_equal method with the same
general form.
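
Tim's deep_equal is not shown in the thread. The following is only a guess at what a method "with the same general form" might look like; the DeepEqual module and the Style class are hypothetical.

```ruby
# Hypothetical helper; not the RVG implementation.
module DeepEqual
  def deep_equal(other)
    return false unless other.class == self.class
    return false unless instance_variables.sort == other.instance_variables.sort

    instance_variables.all? do |iv|
      mine = instance_variable_get(iv)
      theirs = other.instance_variable_get(iv)
      if mine.respond_to?(:deep_equal)
        mine.deep_equal(theirs)    # recurse into objects that opt in
      else
        mine == theirs             # plain Ruby objects compare with ==
      end
    end
  end
end

class Style
  include DeepEqual

  def initialize(fill, stroke)
    @fill = fill
    @stroke = stroke
  end
end

puts Style.new('red', 'black').deep_equal(Style.new('red', 'black'))
puts Style.new('red', 'black').deep_equal(Style.new('red', 'blue'))
```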

P.S. I'd appreciate hearing any criticisms.
 
Brian Schröder

Interesting. But this won't work for instance variables that point to arrays of
deep_copy-able objects. Right?
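
A quick check of why that is, assuming nothing beyond core Ruby: Array responds to dup but not to deep_copy, and Array#dup copies only the outer array. The helper name below is hypothetical.

```ruby
# Returns true when dup produced a new array whose elements are
# the very same objects as in the original (i.e. a shallow copy).
def shallow_dup_shares_elements?(array)
  copy = array.dup
  !copy.equal?(array) && copy.zip(array).all? { |a, b| a.equal?(b) }
end

inner = ['a', 'b']
outer = [inner]

shallow = outer.dup
shallow.first << 'c'            # mutates inner through the copy

puts inner.inspect
puts shallow_dup_shares_elements?(outer)
```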

Maybe this is connected to the "object-state" thread floating around somewhere
else.

Regards,

Brian
 
Tim Hunter

Brian said:
Interesting. But this won't work for instance variables that point to
arrays of deep_copy-able objects. Right?

I knew you'd spot that :) In a couple of cases I had to replace Arrays with
something like this:

class Content < Array
  def deep_copy
    copy = self.class.new
    each do |c|
      copy << case
              when c.nil?
                nil
              when c.respond_to?(:deep_copy)
                c.deep_copy
              when c.respond_to?(:dup)
                c.dup
              else
                c
              end
    end
    return copy
  end
end

And, yes, I have to be careful not to use methods that return Array objects.
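
A short illustration of that caveat, using a reduced version of the Content class: map always returns a plain Array, so the result no longer has deep_copy. Re-wrapping with Content.new is one way to get the subclass back; this is a sketch, not what RVG does.

```ruby
# Reduced version of the Content class, for illustration.
class Content < Array
  def deep_copy
    copy = self.class.new
    each { |c| copy << (c.respond_to?(:deep_copy) ? c.deep_copy : c.dup) }
    copy
  end
end

content = Content.new
content << 'a' << 'b'

mapped = content.map { |s| s.upcase }
puts mapped.class                      # Array, not Content
puts mapped.respond_to?(:deep_copy)

# Wrapping the result restores the subclass.
restored = Content.new(mapped)
puts restored.class
```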
 
Brian Schröder

Maybe it would make sense to extend the base classes Object, Array, Hash with a
deep-copy functionality. That would be something for the extensions project.

The problem here is that we have object state that is not contained in
"visible slots", i.e. instance variables. So this would be one case where the
proposal for a

#each_state_slot
#assign_to_state_slot(key, value)

extension of all Ruby objects would make sense. Then a deep copy would be dead
simple (except for recursive structures, ...).

Note that in my view no objects with infinitely many state slots exist, because
they would have to be generated from more fundamental state. E.g.

class Infinite
  def each_state_slot
    i = 0
    loop do yield(i+=1, :state) end
  end
end

would simply be a wrong implementation. Infinite in this example has no state
of its own.
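
A rough sketch of how a deep copy could be built on such a protocol. each_state_slot and assign_to_state_slot are the names proposed in this post, not existing Ruby methods, and slot_deep_copy and Board are hypothetical. The sketch covers only instance variables and Array elements (Hash, for instance, would need its own slot methods) and assumes acyclic structures.

```ruby
# Hypothetical protocol: default slots are the instance variables.
class Object
  def each_state_slot
    instance_variables.each { |iv| yield iv, instance_variable_get(iv) }
  end

  def assign_to_state_slot(key, value)
    instance_variable_set(key, value)
  end
end

# Arrays keep their state in indexed slots rather than instance variables.
class Array
  def each_state_slot
    each_with_index { |value, index| yield index, value }
  end

  def assign_to_state_slot(key, value)
    self[key] = value
  end
end

# Generic deep copy that relies only on the two slot methods.
def slot_deep_copy(obj)
  case obj
  when nil, true, false, Numeric, Symbol, Module
    obj                                   # immediate or shared-by-design values
  when String
    obj.dup
  else
    copy = obj.class.allocate
    obj.each_state_slot do |key, value|
      copy.assign_to_state_slot(key, slot_deep_copy(value))
    end
    copy
  end
end

class Board
  attr_accessor :rows
end

board = Board.new
board.rows = [['x'], ['y']]
clone = slot_deep_copy(board)
clone.rows[0][0] << '!'
puts board.rows.inspect                   # unchanged by the edit to the clone
```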

Regards,

Brian
 
Mauricio Fernández

Have you considered the somewhat contrived

class A; attr_accessor :a; end;
a = A.new
a.a = a

or more realistically

class A; attr_accessor :children; def add(name); (@children ||= []) << C.new(name, self) end; end
class C; def initialize(name, parent); @name, @parent = name, parent end end
A.new.add "foo"

?
 
Brian Schröder

Yes, my structures are not cyclic.

Regards,

Brian
 
Robert Klemme

Brian Schröder said:
Maybe it would make sense to extend the base classes Object, Array, Hash with a
deep-copy functionality. That would be something for the extensions project.

IMHO not. The reason being that the semantics of deep copy are class
dependent. You might not want to copy all instances in an object graph for a
deep copy, and that might depend entirely on the class at hand and/or (even
worse) the application. IMHO there is no real general solution to this. Of
course you could define a method in Object like

def deep_copy
  Marshal.load( Marshal.dump( self ) )
end

but you don't gain much that way. And it won't even work in the general
case (consider Singletons etc.).

Brian Schröder said:
The problem here is, that we have object state that is not contained in
"visible slots" i.e. instance variables. So this would be one case, where
the proposal for a [...]

There's a much more serious problem with the proposed implementation: it
does not cope with graphs of objects that contain cycles. To do that you
need to keep track of objects copied already. Marshal does this, and it's
efficient. If you want to do that yourself, you'll likely need a hash[old
oid -> copy] to manage this. I doubt though that it's more efficient than
Marshal.
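
A small check of the claim about Marshal, as a sketch: both a cycle and a reference that is shared twice survive the round trip. The Ring class is hypothetical and mirrors the self-referential example given earlier in the thread.

```ruby
# Hypothetical class, for illustration only.
class Ring
  attr_accessor :peer, :items
end

a = Ring.new
a.peer = a                    # cycle: the object refers to itself
shared = ['x']
a.items = [shared, shared]    # the same array referenced twice

b = Marshal.load(Marshal.dump(a))

puts b.peer.equal?(b)                 # true: the cycle points at the copy
puts b.items[0].equal?(b.items[1])    # true: sharing is preserved
puts b.items[0].equal?(shared)        # false: the array itself is new
```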

A while ago I had to discover for myself that there's more to deep copying than
just traversing the object graph and copying each instance in turn. I hope
you can benefit from my earlier errors... :)

Kind regards

robert
 
Brian Schröder

OK, then back to the original question. Suppose I don't want a general solution,
but want to deep-copy my special class, which contains some instance variables
holding multi-dimensional arrays. What is a nice way to go about it?
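
One possible answer for such a class, as a sketch under assumptions: the class name Grid, its fields and the copy_nested helper are hypothetical, the structure is assumed to be acyclic, and calling dup on numbers assumes Ruby 2.4 or later.

```ruby
# Hypothetical class; the name Grid and its fields are illustrative.
class Grid
  attr_reader :name, :cells

  def initialize(name, rows, cols)
    @name = name
    @cells = Array.new(rows) { Array.new(cols) { '.' } }
  end

  def deep_copy
    copy = dup                                   # shallow: shares @name and @cells
    copy.instance_variable_set(:@name, @name.dup)
    copy.instance_variable_set(:@cells, Grid.copy_nested(@cells))
    copy
  end

  # Recursively copies arrays of any depth; non-array leaves are dup'ed.
  def self.copy_nested(value)
    value.is_a?(Array) ? value.map { |v| copy_nested(v) } : value.dup
  end
end

g = Grid.new('board', 2, 2)
h = g.deep_copy
h.cells[0][0] << 'X'
puts g.cells.inspect       # the original grid is unchanged
```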

Regards,

Brian
 
Chuckl

I've also been thinking about deep_copy, but for a different reason. I
want to keep a mutation log for each object instance. So if this
occurs:

foo = Obj.new(:blah)
foo.warp_blah!
foo.twist_blah = :different_blah

Then somewhere, preferably abstracted inside each object instance, I'd
have a log like:

p foo.mutation_log  # => ["warp_blah!", ["twist_blah=", :different_blah]]

When the method Obj.checkpoint is called, the log is cleared and the
current state returned as a Marshal dump, or some other store-savable
format.

I understand this is not precisely the same thing as deep_copy, but I
wonder if anyone else has done this? Obviously or not, I'm looking to
implement an orthogonal persistence object system in ruby.
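
A minimal sketch of such a log, assuming the mutating methods are declared explicitly. MutationLog and log_mutations are hypothetical names, checkpoint is written as an instance method, and Obj is a stand-in for the class in the example above.

```ruby
# Hypothetical mixin; not an existing library.
module MutationLog
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Wraps each named method so that every call is recorded before it runs.
    # Must be called after the methods themselves have been defined.
    def log_mutations(*names)
      names.each do |name|
        original = instance_method(name)
        define_method(name) do |*args, &block|
          mutation_log << (args.empty? ? name.to_s : [name.to_s, *args])
          original.bind(self).call(*args, &block)
        end
      end
    end
  end

  def mutation_log
    @mutation_log ||= []
  end

  # Clears the log and returns the current state as a Marshal string.
  def checkpoint
    @mutation_log = []
    Marshal.dump(self)
  end
end

class Obj
  include MutationLog
  attr_reader :blah

  def initialize(blah)
    @blah = blah
  end

  def warp_blah!
    @blah = :"warped_#{@blah}"
  end

  def twist_blah=(value)
    @blah = value
  end

  log_mutations :warp_blah!, :twist_blah=
end

foo = Obj.new(:blah)
foo.warp_blah!
foo.twist_blah = :different_blah
p foo.mutation_log
```

This only records calls to the declared methods. Mutations made through other paths (for example instance_variable_set) would go unnoticed.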
Cheers,
Chuckl
 
Austin Ziegler

I've also been thinking about deep_copy, but for a different reason. I
want to keep a mutation log for each object instance. So if this
occurs:

It's not quite the same as what you're saying (it's difficult to do a
true mutation log if your code doesn't write to a mutation log by
default), but look at Transaction::Simple. It does live object
transactions and checkpointing.

-austin
 
Robert Klemme

Brian Schröder said:
Ok, then back to the original question. If I don't want a general solution, but
want to deep-copy my special class, that contains some instance variables, with
some multi-dimensional arrays in it. What is a nice way to go about it?

This seems to work fairly well:

class Object
  def deep_copy(h = {})
    ident = object_id   # Object#id in the Ruby of the time
    cpy = h[ident]

    unless cpy
      cpy = case self
            when String
              frozen? ? self : dup
            when Array
              map {|o| o.deep_copy(h)}
            when Hash
              # this loses some state (i.e. default etc.)
              inject({}) {|c, (k,v)| c[k.deep_copy(h)] = v.deep_copy(h); c}
            when Class, Module, Symbol, FalseClass, TrueClass, NilClass, Numeric
              # Numeric stands in for the original Fixnum and Bignum
              self
            # probably more special cases like Struct
            else
              cpy = self.class.allocate

              instance_variables.each do |var|
                cpy.instance_variable_set(var, instance_variable_get(var).deep_copy(h))
              end

              cpy
            end

      cpy.freeze if frozen?
      h[ident] = cpy
    end

    cpy
  end
end
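
Note that the version above stores a copy in the lookup hash only after it has descended into the object's contents, so shared references are handled but a true cycle would recurse without end. A variation that registers each copy before descending might look like the sketch below. The DeepCopy module and TreeNode class are hypothetical, and handling of frozen objects is left out for brevity.

```ruby
# Hypothetical module; registers each copy before copying its contents.
module DeepCopy
  IMMUTABLE = [NilClass, TrueClass, FalseClass, Numeric, Symbol, Module].freeze

  def self.copy(obj, seen = {})
    return obj if IMMUTABLE.any? { |type| type === obj }
    return seen[obj.object_id] if seen.key?(obj.object_id)

    case obj
    when String
      seen[obj.object_id] = obj.dup
    when Array
      cpy = seen[obj.object_id] = []
      obj.each { |o| cpy << copy(o, seen) }
      cpy
    when Hash
      # like the original, this drops the default value or default proc
      cpy = seen[obj.object_id] = {}
      obj.each { |k, v| cpy[copy(k, seen)] = copy(v, seen) }
      cpy
    else
      cpy = seen[obj.object_id] = obj.class.allocate
      obj.instance_variables.each do |var|
        cpy.instance_variable_set(var, copy(obj.instance_variable_get(var), seen))
      end
      cpy
    end
  end
end

# Parent and child refer to each other, as in the example earlier in the thread.
class TreeNode
  attr_accessor :parent, :children

  def initialize(parent = nil)
    @parent = parent
    @children = []
  end
end

root = TreeNode.new
root.children << TreeNode.new(root)

twin = DeepCopy.copy(root)
puts twin.children.first.parent.equal?(twin)   # true: the cycle points at the copy
```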

Kind regards

robert
 
