Problem with garbage collection (sort of)

Frank Millman

Hi all

I am writing a general accounting application. A user can log in to
the application and stay there for a long time, maybe all day, moving
around the menu system, selecting an item to perform a task, then
returning to the menu and selecting another item. I want to ensure
that each item selected cleans itself up correctly when it is
completed. My worry is that if I do not do this, the number of 'dirty'
items will accumulate in memory and will cause performance to suffer.

I have come across a situation where a number of instances created by
an item are not deleted when the item has completed. It took me a long
time to identify what was actually happening, but now that I have
identified it, the 'fix' that I have come up with is rather ugly.
Below is a very artificial example to explain the scenario.

Should I worry about this, or is the performance hit negligible?
Assuming that I should worry about it, can anyone suggest a better
solution? It may be that the problem is caused by bad design, in which
case I may follow this up with more details, to see if anyone can come
up with a better approach.

#------------------------------------------------------
class a:
    def __init__(self,b):
        self.b = b
        print 'a created'
    def __del__(self):
        print 'a deleted'

class b:
    def __init__(self):
        self.p = []
        e = a(self)
        self.p.append(e)
        print 'b created'
    def __del__(self):
        print 'b deleted'

class c:
    def __init__(self):
        self.f = b()
        print 'c created'
    def __del__(self):
#        del self.f.p
        print 'c deleted'

y = c()
#del y.f.p

#--------------------------------------------

y is an instance of class c, which creates an instance of class b,
which creates an instance of class a. When y goes out of scope and is
deleted, I want the instances of class b and class a to be deleted as
well.

The problem is that class a keeps a reference to class b (self.b) and
class b keeps a reference to class a (self.p), so the reference counts
do not go down to zero without some additional action.

I have 2 possible solutions - the two commented-out lines. If either
of these lines is uncommented, all instances are deleted.

The ugliness is that p is internal to class b - neither class c nor
the main application requires any knowledge of it. However, my
solution requires one or the other to explicitly delete it.

Any advice will be much appreciated.

Many thanks

Frank Millman
 
Aahz

> y is an instance of class c, which creates an instance of class b,
> which creates an instance of class a. When y goes out of scope and is
> deleted, I want the instances of class b and class a to be deleted as
> well.

Works for me in Python 2.2.3; what version are you using?
 
Michael Hudson

(e-mail address removed) (Frank Millman) writes:

[...]
> The problem is that class a keeps a reference to class b (self.b) and
> class b keeps a reference to class a (self.p), so the reference counts
> do not go down to zero without some additional action.

Sure, but since 2.0, Python supplies that additional action
itself... the problems are the __del__ methods. This must be
documented somewhere... ah, here:

http://www.python.org/doc/current/ref/customization.html#l2h-175

though that's not totally clear.

Basically, a cycle will not be cleared if one of the objects making up
the cycle has a __del__ method.

Sometimes it is possible to arrange things so only "leaf" objects have
__del__ methods, then all this ceases to be a problem.
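
A minimal sketch of the first half of this point, that the cycle collector does clear the a/b cycle when no __del__ is involved. It is written for Python 3 rather than the Python 2 of the thread, and it deliberately avoids __del__, since later Python versions changed how __del__ interacts with cycles. The class names mirror the original example; `probe` and the other variable names are illustrative only.

```python
import gc
import weakref

class A:
    def __init__(self, b):
        self.b = b              # back-reference to the owner: half of the cycle

class B:
    def __init__(self):
        self.p = [A(self)]      # forward reference: the other half of the cycle

gc.disable()                    # keep automatic collections out of the demo
b = B()
probe = weakref.ref(b)          # observes b without keeping it alive
del b                           # the cycle keeps both reference counts above zero
alive_before = probe() is not None
unreachable = gc.collect()      # the cycle detector finds and frees the pair
alive_after = probe() is not None
gc.enable()

print(alive_before, alive_after)
```

The weak reference plays the role that the print in __del__ played in the original code: it reports whether the object is still alive, without adding a finalizer to the cycle.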

Cheers,
mwh
 
Anand Pillai

The problem you are describing is one of 'cyclical'
references. This happens when obj A holds a reference
to obj B and obj B holds a reference to obj A.

One way to work around this is to implement a global
registry object which stores all the objects in its
dictionary using names as keys. In your code you would
no longer need references to objects inside your classes,
but would just look them up from the global registry using
something like this.

myobj=myglobals.lookup('myobject')
myobj.performsomeaction()

....

Now since myobj is of local scope and not of class
scope (self...), the local reference is released as soon
as the function returns. But the original object still
remains active in the global registry.

At the exit of your program write a hook to sys.exitfunc
which cleans up this global registry.
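
A rough sketch of how such a registry might look, written for Python 3. The `Registry`, `Ledger` and `Entry` classes and their methods are hypothetical names invented for illustration (only `myglobals.lookup` appears above), and `atexit.register` is used as a stand-in for the `sys.exitfunc` hook, which no longer exists in Python 3.

```python
import atexit

class Registry:
    """Hypothetical name-to-object registry; not taken from the original post."""
    def __init__(self):
        self._objects = {}

    def register(self, name, obj):
        self._objects[name] = obj

    def lookup(self, name):
        return self._objects[name]

    def clear(self):
        self._objects.clear()

myglobals = Registry()
atexit.register(myglobals.clear)        # clean up the registry at program exit

class Ledger:
    def __init__(self, name):
        self.name = name
        myglobals.register(name, self)  # the registry holds the only long-lived reference

class Entry:
    def __init__(self, ledger_name):
        self.ledger_name = ledger_name  # store a name, not an object reference

    def ledger(self):
        return myglobals.lookup(self.ledger_name)

ledger = Ledger('sales')
entry = Entry('sales')
print(entry.ledger() is ledger)
```

Because `Entry` stores only a name, no cycle forms between the two objects. The cost is that the registry keeps every registered object alive until it is cleared.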

I had a similar problem in one of my projects and implemented
this using Alex Martelli's "borg" non-pattern. The code is
available in my harvestman project page at
http://members.lycos.co.uk/anandpillai

I have not yet got around to testing the actual benefits
in gc I gained from this approach :)

-Anand
 
Frank Millman

(e-mail address removed) (Frank Millman) wrote:

> y is an instance of class c, which creates an instance of class b,
> which creates an instance of class a. When y goes out of scope and is
> deleted, I want the instances of class b and class a to be deleted as
> well.
>
> The problem is that class a keeps a reference to class b (self.b) and
> class b keeps a reference to class a (self.p), so the reference counts
> do not go down to zero without some additional action.

Thanks for all the replies - I have learned a lot. It seems that this
whole thing is not a problem at all. In other words, my instances
*are* being deleted by the cyclic garbage collector, and my only
problem was that I could not confirm this positively.

As Michael says, the very act of creating a __del__ method in my
classes prevents them from being deleted! However, as I only added the
__del__ method to try to confirm the deletion, this is not a problem
in practice.

I replaced my __del__ methods with the DelWatcher class suggested by
Tim. At first, nothing changed. Then I added 'import gc; gc.collect()'
at the end, and lo and behold, I could see all my instances being
deleted.
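
Tim's post is not part of the thread as shown here, so the class below is only a guess at the general idea, written for Python 3: a small "leaf" object whose only job is to report when its owner goes away. The `DelWatcher` body, the `deleted` list and the `watcher` attribute name are all assumptions made for illustration.

```python
import gc

deleted = []    # records names so the effect can be checked as well as printed

class DelWatcher:
    """Leaf object: it refers to nothing in the cycle."""
    def __init__(self, name):
        self.name = name                # store a string, never the watched object

    def __del__(self):
        deleted.append(self.name)
        print(self.name, 'deleted')

class A:
    def __init__(self, b):
        self.b = b
        self.watcher = DelWatcher('a')

class B:
    def __init__(self):
        self.p = [A(self)]
        self.watcher = DelWatcher('b')

class C:
    def __init__(self):
        self.f = B()
        self.watcher = DelWatcher('c')

gc.disable()                    # keep automatic collections out of the demo
y = C()
del y                           # c is not part of the cycle: freed at once
after_del = list(deleted)
gc.collect()                    # a and b form a cycle and wait for the collector
after_collect = sorted(deleted)
gc.enable()
```

In this sketch only 'c' is reported straight after `del y`; 'a' and 'b' appear once `gc.collect()` has run, which matches the behaviour described above.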

Being a bit of a sceptic, I still did not regard this as positive
confirmation that it will work in practice, as I do not have a
gc.collect() in my live application, so I added the DelWatcher class
there to see if I got the 'deleted' messages. I understand that
gc.collect() runs automatically from time to time, so I waited a
while, and did not get any messages. Then as I did some more work in
the application, the messages started appearing for the older items.
Re-reading the documentation on the gc module confirms that an
automatic collection is only triggered when the number of allocations
minus the number of deallocations exceeds a threshold.
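
For anyone wanting to look at those thresholds, a short sketch using the gc module (Python 3 syntax). The actual numbers depend on the Python version, so none are shown here.

```python
import gc

thresholds = gc.get_threshold()     # (threshold0, threshold1, threshold2)
counts = gc.get_count()             # current counters for the three generations

print('thresholds:', thresholds)
print('counts:', counts)
```

An automatic collection starts when the first counter passes threshold0, which would explain why the 'deleted' messages only showed up after more work had been done in the application.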

For the record, I tried Tim's suggestion of using weakrefs, and it
worked perfectly. I did some timing tests and it seems to have very
little overhead. However, as Tim says, it is better to stick to the
cyclic garbage collector now that I have confidence that it is working
correctly.
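
The weakref variant is not spelled out in the thread, so this is a sketch of one way it could look (Python 3): class a holds only a weak reference back to class b, so no cycle forms and plain reference counting is enough. The `_b` attribute and the `b` property are illustrative choices.

```python
import gc
import weakref

class A:
    def __init__(self, b):
        self._b = weakref.ref(b)    # weak back-reference: does not keep b alive

    @property
    def b(self):
        return self._b()            # returns None once b has been freed

class B:
    def __init__(self):
        self.p = [A(self)]

gc.disable()                        # show that no cycle collection is needed
owner = B()
child = owner.p[0]
same = child.b is owner
probe = weakref.ref(owner)
del owner                           # last strong reference gone: freed immediately
freed = probe() is None
dangling = child.b is None
gc.enable()

print(same, freed, dangling)
```

The price is that `child.b` can become None, so code using the back-reference has to allow for the owner having gone away.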

Many thanks to all.

Frank Millman
 
