of destructors, open files and garbage collection

massimo s.

Hi,

Python 2.4, Kubuntu 6.06. I'm no professional programmer (I am a ph.d.
student in biophysics) but I have a fair knowledge of Python.

I have a for loop that looks like the following :

for item in long_list:
    foo(item)

def foo(item):
    item.create_blah()  # <-- this creates item.blah; by doing that it
                        # opens a file and leaves it open until
                        # blah.__del__() is called

Now, what I thought is that if I call

del(item)

it will delete item and also all objects created inside item. So I
thought that item.blah.__del__() would have been called and files
closed.
Question 1:
This is not the case. I have to call del(item.blah), otherwise files
are kept open and the for loops end with a "Too many open files"
error. Why isn't __del__() called on objects belonging to a parent
object? Is it OK?

So I thought:
oh, ok, let's put del(self.blah) in item.__del__()
Question 2:
This doesn't work either. Why?

Thanks a lot,
M.
 
Paul Moore

| Now, what I thought is that if I call
|
| del(item)
|
| it will delete item and also all objects created inside item.

Sort of, but it's a bit more subtle. You'll stop the name "item" from
referring to your item - if nothing else refers to your item, it will
be garbage collected (and __del__ will get called). But you can have
other references, and in this case, __del__ is not called until *they*
are released as well.

Here's an example:
>>> class C:
...     def __del__(self):
...         print "del called"
...
>>> # Let's create a new C, but store a second reference to it in "a".
...
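
A minimal sketch of the two-reference case, written in Python 3 syntax
rather than the Python 2.4 used in this thread. It assumes CPython, where
an object is finalised as soon as its last reference goes away; other
implementations may finalise it later. The `events` list is not part of
the original example and is only there so the order of calls can be
inspected.

```python
# Assumption: CPython reference counting (immediate finalisation).
events = []  # records finaliser calls so their timing can be checked


class C:
    def __del__(self):
        events.append("del called")


c = C()
a = c                        # second reference to the same object
del c                        # removes only the name "c"
after_del_c = list(events)   # nothing finalised yet: "a" still refers to it
del a                        # last reference gone, so __del__ runs now
after_del_a = list(events)

print(after_del_c)
print(after_del_a)
```

Under CPython the first list should be empty and the second should hold a
single "del called" entry.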
OK. Now in your case, it's a bit more complex. You delete item.
Suppose that causes the item to be garbage collected (there are no
other references). Collecting the item removes the attribute
item.blah, which refers to the blah object. So the blah object is
collected too - *as long as no other references to the blah object
exist*. Here's another example:
>>> class B:
...     def __init__(self):
...         self.c = C()
...     def __del__(self):
...         print "B's delete called"
...
>>> b = B()
>>> a = b.c
>>> del b
B's delete called
>>> del a
del called
See? Even though b was collected, its c attribute is still accessible
under the name 'a', so it's kept alive.

| Question 1:
| This is not the case. I have to call del(item.blah), otherwise files
| are kept open and the for loops end with a "Too many open files"
| error. Why isn't __del__() called on objects belonging to a parent
| object? Is it OK?

Did the above help to clarify?

| So I thought:
| oh, ok, let's put del(self.blah) in item.__del__()
| Question 2:
| This doesn't work either. Why?

It's not needed - it's not the item.blah reference that's keeping the
blah object alive, it's another one.

You *can* fix this by tracking down all the references and explicitly
deleting them one by one, but that's not really the best answer.
You're micromanaging stuff the garbage collector is supposed to handle
for you. Ultimately, you've got a design problem, as you're holding
onto stuff you no longer need. Whether you use del, or add an explicit
blah.close() method to close the filehandle, you've got to understand
when you're finished with a filehandle - if you know that, you can
close it at that point.
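
A minimal sketch of the explicit-close approach, in Python 3 syntax. The
classes `Blah` and `Item` and the method `close_blah()` are hypothetical
stand-ins for the poster's objects, not code from the thread; a temporary
file is used only to keep the example self-contained.

```python
import os
import tempfile


class Blah:
    """Hypothetical object that owns an open file."""

    def __init__(self, path):
        self.handle = open(path)

    def close(self):
        # Closing an already-closed file is a no-op, so this is safe to repeat.
        self.handle.close()


class Item:
    """Hypothetical stand-in for the items in long_list."""

    def __init__(self, path):
        self.path = path
        self.blah = None

    def create_blah(self):
        self.blah = Blah(self.path)

    def close_blah(self):
        if self.blah is not None:
            self.blah.close()


def foo(item):
    item.create_blah()
    try:
        # The real work would go here; this sketch just reads the file.
        return item.blah.handle.read()
    finally:
        # Runs whether or not the work above raised an exception.
        item.close_blah()


# Demonstration with a temporary file.
fd, path = tempfile.mkstemp()
os.write(fd, b"data")
os.close(fd)

long_list = [Item(path) for _ in range(5)]
results = [foo(item) for item in long_list]
all_closed = all(item.blah.handle.closed for item in long_list)
os.remove(path)

print(results[0], all_closed)
```

Here each file is closed at the point where the code is known to be
finished with it, so the number of open files does not depend on when, or
whether, the objects are collected.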

Here's a final example that may help:
>>> a = []
>>> for i in range(10):
...     a.append(C())
...
>>> # Lots of work, none of which uses a
...
>>> a = [] # or del a
del called
del called
del called
del called
del called
del called
del called
del called
del called
del called

See how you finished with all of the C objects right after the for
loop, but they didn't get deleted until later? I suspect that's what's
happening to you. If you cleared out the list (my a = [] statement) as
soon as you're done with it, you get the resources back that much
sooner.

Hope this helps,
Paul.
 
Marc 'BlackJack' Rintsch

massimo s. said:
I have a for loop that looks like the following :

for item in long_list:
    foo(item)

def foo(item):
    item.create_blah()  # <-- this creates item.blah; by doing that it
                        # opens a file and leaves it open until
                        # blah.__del__() is called

Now, what I thought is that if I call

del(item)

it will delete item and also all objects created inside item.

It will delete the *name* `item`. It does nothing to the object that was
bound to that name. If the name was the only reference to that object, it
may be garbage collected sooner or later. Read the documentation for the
`__del__()` method for more details and why implementing such a method
increases the chance that the object *won't* be garbage collected!

Relying on the `__del__()` method isn't a good idea because the language
gives no hard guarantees about whether or when it will be called.

Ciao,
Marc 'BlackJack' Rintsch
 
massimo s.

It will delete the *name* `item`. It does nothing to the object that was
bound to that name. If the name was the only reference to that object, it
may be garbage collected sooner or later. Read the documentation for the
`__del__()` method for more details and why implementing such a method
increases the chance that the object *won't* be garbage collected!

Relying on the `__del__()` method isn't a good idea because the language
gives no hard guarantees about whether or when it will be called.

Ok, I had a look at the docs and, in fact, relying on __del__ doesn't
look like a good idea.

Changing the code to add an explicit method that closes dangling
filehandles is easy. It would even be nice because, since that method
would be added to a plugin API, it *forces* people writing plugins to
provide a way to close their dangling files, and this may be useful
for a lot of future purposes. However, I'd also like to track
references to my objects; this would help debugging a lot. How can I
do that?
 
massimo s.

Relying on the `__del__()` method isn't a good idea because the language
gives no hard guarantees about whether or when it will be called.

Ok, I read the __del__() docs and I understand using it is not a good
idea.

I can easily add a close_files() method that forces all dangling files
to be closed. It would be useful in a number of other possible
situations. However, as Paul Moore's thorough answer rightly points
out, tracking the references to my objects would be very useful.
How can I do that?
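
One possible way to inspect references while debugging, sketched in
Python 3 syntax and assuming CPython. It uses `sys.getrefcount()` and
`gc.get_referrers()` from the standard library; the class `C` and the
names `obj` and `holder` are made up for the illustration. Both calls are
debugging aids, and the exact count reported can vary between interpreter
versions.

```python
import gc
import sys


class C:
    pass


obj = C()
holder = [obj]  # a second reference, held by a list

# The count usually includes a temporary reference made by the call itself,
# so treat it as a rough indication rather than an exact figure.
count = sys.getrefcount(obj)

# get_referrers() returns the container objects that refer to obj,
# for example the list above and the module's globals dictionary.
referrers = gc.get_referrers(obj)
held_by_list = any(r is holder for r in referrers)

print(count, held_by_list)
```

Looking through the referrers of an object that should already have gone
away is often enough to find which list, dictionary or attribute is still
holding on to it.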
 
Terry Reedy

| Hi,
|
| Python 2.4, Kubuntu 6.06. I'm no professional programmer (I am a ph.d.
| student in biophysics) but I have a fair knowledge of Python.
|
| I have a for loop that looks like the following :
|
| for item in long_list:
|     foo(item)
|
| def foo(item):
|     item.create_blah()  # <-- this creates item.blah; by doing that it
|                         # opens a file and leaves it open until
|                         # blah.__del__() is called
|
| Now, what I thought is that if I call
|
| del(item)
|
| it will delete item

No, it removes the association between the name 'item' and the object it is
currently bound to. In CPython, removing the last such reference will
cause the object to be gc'ed. In other implementations, actual deletion
may occur later. You probably should close the files directly and arrange
code so that you can do so before too many are open.

tjr
 
massimo s.

No, it removes the association between the name 'item' and the object it is
currently bound to. In CPython, removing the last such reference will
cause the object to be gc'ed. In other implementations, actual deletion
may occur later. You probably should close the files directly and arrange
code so that you can do so before too many are open.

Thanks a lot, I'll follow that way.

m.
 
