Legitimacy of deepcopy

Discussion in 'Python' started by Eugeni Doljenko, Jun 4, 2004.

  1. There is a list of custom objects. I want to duplicate this list so that
    I can modify the objects in the new list and then compare old and new.
    I could do it with deepcopy:

    class foo:
        def __init__(self, str):
            self.str = str

    old = [foo('str1'), foo('str2')]

    import copy
    new = copy.deepcopy(old)

    But I've found very few uses of deepcopy in the Python library and in
    other programs, so I wonder if it's possible to do this in some other,
    more elegant way.
     
    Eugeni Doljenko, Jun 4, 2004

  2. Eugeni Doljenko wrote:

    > There is a list of custom objects. I want to duplicate this list to modify
    > objects in new list and then compare old and new. I could do it with
    > deepcopy
    >
    > class foo:
    >     def __init__(self, str):
    >         self.str = str
    > old = [foo('str1'), foo('str2')]
    >
    > import copy
    > new = copy.deepcopy(old)
    >
    > But I've found very few deepcopy uses in Python library and other programs,
    > so I wonder if it's possible to do it other, more elegant way.


    The problem with deepcopy is that it is impossible to do "correctly,"
    since "correctly" depends on where your data structure abstractions are.
    A simple example:

    steve = People('Steve')
    joe = People(steve)
    mystruct = [(steve, joe, Dollars(5)), steve]

    Now, should a deepcopy have two distinct steves, or one? Is the $5.00
    in the new structure "the same as" another $5.00 or not? Sometimes you
    mean the identity of an object, and sometimes you mean it as a value,
    and no general purpose function named deepcopy can guess where to stop
    copying.

    --
    -Scott David Daniels
     
    Scott David Daniels, Jun 4, 2004

  3. Scott David Daniels writes:

    > The problem with deepcopy is that it is impossible to do "correctly,"
    > since "correctly" depends on where your data structure abstractions are.


    Well, the fact that there's more than one way to do it doesn't mean
    that a particular solution isn't a correct one. The semantics of
    deepcopy are well-defined and, I think, perfectly valid. If they
    don't fit a particular need, then deepcopy is the wrong tool for that
    need, but that's up to the user to decide on a case-by-case basis.

    To the original poster: we use deepcopy at a number of interface
    points where we need to isolate internal data structures from the
    copies of that information returned to callers. For example,
    we can't just return a reference to an internal list or dictionary
    which would let the caller later mutate our internal object without
    our knowledge. For that, deepcopy works just the way we need, since
    our primary goal is to avoid mutation of our original objects.

    In most of our use cases, the objects in question are providing data
    storage (either in-memory or as a simulation for a database) and need
    strict isolation between their internal storage and what callers see
    as the objects being returned.
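
    The isolation pattern described above can be sketched roughly as
    follows. This is only an illustration, not the code from the poster's
    system: the RecordStore class, its put/get methods, and the sample
    record are hypothetical names invented for the example.

```python
import copy


class RecordStore:
    """Hypothetical in-memory store; the name and methods are illustrative only."""

    def __init__(self):
        self._records = {}  # internal storage, never handed out directly

    def put(self, key, record):
        # Copy on the way in, so later changes by the caller do not leak in.
        self._records[key] = copy.deepcopy(record)

    def get(self, key):
        # Copy on the way out, so the caller cannot mutate internal state.
        return copy.deepcopy(self._records[key])


store = RecordStore()
store.put("alice", {"tags": ["admin"], "visits": 1})

view = store.get("alice")
view["tags"].append("changed")  # mutates only the caller's copy
view["visits"] = 99

internal = store.get("alice")   # a fresh copy of the untouched internal record
print(internal)
```

    A shallow copy would not be enough here, because the nested "tags"
    list would still be shared between the store and the caller.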

    In such cases, I really don't see any good solution in lieu of
    deepcopy, and in fact deepcopy (as covered below) is probably already
    a pretty good minimal solution in terms of actual work it does. It's
    certainly a legitimate need and deepcopy is certainly a legitimate
    implementation in my eyes. (I'll grant you that sometimes it feels
    strange forcing object copies when you don't normally find yourself
    needing to, but when you want copies, you want copies :))

    > A simple example:
    >
    > steve = People('Steve')
    > joe = People(steve)
    > mystruct = [(steve, joe, Dollars(5)), steve]
    >
    > Now, should a deepcopy have two distinct steves, or one? Is the $5.00
    > in the new structure "the same as" another $5.00 or not? Sometimes you
    > mean the identity of an object, and sometimes you mean it as a value,
    > and no general purpose function named deepcopy can guess where to stop
    > copying.


    Valid questions, but the fact that they can have different answers
    doesn't mean that deepcopy can't pick its own set of answers and stick
    to them. In the deepcopy case, the copy will contain one distinct steve
    (referenced twice from the new list, and once from within the new joe)
    because deepcopy keeps a memo dictionary of objects it has already
    copied (which also avoids problems with recursive structures). That
    achieves the primary goal: the steve references in the copy point to a
    steve distinct from the original (since it is the original container
    object you are deep copying), while remaining internally consistent
    with how they were used in the original object.

    Likewise, the Dollars() instance in the copy will be distinct from
    that in the original.
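
    The behaviour described in the last two paragraphs can be checked with
    a short sketch. The thread never shows People or Dollars, so the
    minimal class definitions below (and the attribute names friend and
    amount) are assumptions made purely for illustration.

```python
import copy


class People:
    # Hypothetical stand-in for the People class in the quoted example.
    def __init__(self, friend):
        self.friend = friend  # a name string or another People instance


class Dollars:
    # Hypothetical stand-in for the Dollars class in the quoted example.
    def __init__(self, amount):
        self.amount = amount


steve = People('Steve')
joe = People(steve)
mystruct = [(steve, joe, Dollars(5)), steve]

new = copy.deepcopy(mystruct)
new_steve, new_joe, new_dollars = new[0]

# Exactly one new steve: all three references in the copy agree.
one_steve = new[1] is new_steve and new_joe.friend is new_steve

# Nothing mutable is shared with the original structure.
distinct_from_original = (new_steve is not steve
                          and new_joe is not joe
                          and new_dollars is not mystruct[0][2])

print(one_steve, distinct_from_original)
```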

    A subtlety (which I'm guessing you are referring to by your identity
    comment) is that a deepcopy (IMHO) is aimed at ensuring that the new
    copy does not have the ability to mutate objects in the original.
    Towards that end, there's no real need to create copies of immutable
    objects since they can't be mutated in the first place. So immutable
    objects such as numbers, strings, and tuples that contain only
    immutable objects will simply have references to the same object
    placed in the copy (unless overridden by __deepcopy__); a tuple that
    holds mutable objects, like the one in your example, is still rebuilt
    around copies of its contents. This is no different from how a
    copy.copy() of an immutable object will return a reference to the same
    object (barring an override of __copy__).
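
    A small sketch of how immutable objects are treated, as observed with
    the standard copy module in CPython. The variable names are made up
    for the example, and the identity results are implementation
    behaviour rather than something every program should rely on.

```python
import copy

number = 12345
text = "five dollars"
flat_tuple = (1, "two", 3.0)    # holds only immutable objects
mixed_tuple = (1, ["mutable"])  # holds a list, so it cannot be shared safely

# Immutable objects come back as the very same object.
same_number = copy.deepcopy(number) is number
same_text = copy.deepcopy(text) is text
same_flat = copy.deepcopy(flat_tuple) is flat_tuple
same_shallow = copy.copy(flat_tuple) is flat_tuple

# A tuple with mutable contents is rebuilt around a copy of the list.
mixed_copy = copy.deepcopy(mixed_tuple)
rebuilt = (mixed_copy is not mixed_tuple
           and mixed_copy[1] is not mixed_tuple[1]
           and mixed_copy == mixed_tuple)

print(same_number, same_text, same_flat, same_shallow, rebuilt)
```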

    You could probably view the general deepcopy guideline as making the
    minimum number of actual copies to ensure that no references to
    mutable objects from the original object remain in the copy, but that
    the internal reference structure from the original is maintained. In
    general, I find that to be a very practical approach.

    Of course, if that's not what you wanted, then deepcopy isn't going to
    fit the bill. If you have control of the objects involved, the
    __copy__ and __deepcopy__ hooks are provided to let you make some
    changes, but it certainly may not fit all cases. But that hardly
    makes it "incorrect" or less useful for cases where it does fit (which
    to be honest, I'd guess is the majority of them).
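
    As a rough sketch of the __deepcopy__ hook mentioned above, a class
    can decide for itself which attributes are identities to share and
    which are values to copy. The Owner and Account classes and their
    attributes are hypothetical, chosen only to illustrate the hook.

```python
import copy


class Owner:
    # Hypothetical class whose instances should be shared, not duplicated.
    def __init__(self, name):
        self.name = name


class Account:
    """Hypothetical class: 'owner' is an identity, 'history' is a value."""

    def __init__(self, owner, history):
        self.owner = owner
        self.history = history

    def __deepcopy__(self, memo):
        # Build the new instance without calling __init__.
        new = type(self).__new__(type(self))
        memo[id(self)] = new    # register early, in case of reference cycles
        new.owner = self.owner  # deliberately shared with the original
        new.history = copy.deepcopy(self.history, memo)  # copied as usual
        return new


steve = Owner('Steve')
original = Account(steve, ['opened'])

duplicate = copy.deepcopy(original)
duplicate.history.append('closed')  # affects only the duplicate

print(duplicate.owner is steve, original.history, duplicate.history)
```

    Passing memo through to the nested copy.deepcopy call keeps the
    shared-reference bookkeeping intact for the rest of the structure.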

    -- David
     
    David Bolen, Jun 7, 2004