does python have useless destructors?

  • Thread starter Michael P. Soulier

David Turner

Marcin 'Qrczak' Kowalczyk said:
It's not statically known which variables hold objects which define __del__.
This implies that you must walk over all local variables of all function
activations, in addition to GC overhead, and you must manage reference
counts in all assignments. I'm afraid it's unacceptable.

You don't need static knowledge of which names refer to deterministic
objects. You only need to do the bookkeeping at three points: when
binding a name, when exiting a function scope, and when calling __del__
on an object.

In all three cases, you have immediate access to a list of objects
that are to be unreferenced.

So what's the problem? I suspect that the vast majority of objects
will not have __del__ methods, so the reference counting that has to
be performed is minimal.
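
To make the bookkeeping concrete, here is an illustrative sketch in
Python itself (not an implementation; all names here are invented for
the example). Only objects whose class defines __del__ carry the extra
count, and it is touched at exactly the three points named above:

def needs_tracking(obj):
    # Does the object's class define __del__ anywhere in its MRO?
    return any('__del__' in klass.__dict__ for klass in type(obj).__mro__)

def bind_name(scope, name, obj):
    # Point 1: binding a name.
    old = scope.get(name)
    if needs_tracking(obj):
        obj._refs = getattr(obj, '_refs', 0) + 1
    scope[name] = obj
    if old is not None and needs_tracking(old):
        unref(old)

def leave_scope(scope):
    # Point 2: exiting a function scope.
    for obj in list(scope.values()):
        if needs_tracking(obj):
            unref(obj)
    scope.clear()

def unref(obj):
    # Point 3: deciding whether to call __del__.
    obj._refs -= 1
    if obj._refs == 0:
        obj.__del__()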

Regards
David Turner
 

David Turner

Carl Banks said:
Are you aware that CPython already does this?

Yes, of course. As I pointed out elsewhere, there is a weakness in
the current CPython implementation, in that exceptions grab extra
references to locals for the traceback. A relatively minor code block
at the end of each exception handler could fix this problem.
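
The effect can be demonstrated directly (a small sketch; Noisy is an
invented name, and the comments describe the CPython behaviour of this
era):

import sys

class Noisy:
    def __del__(self):
        print("__del__ ran")

def handler():
    obj = Noisy()
    try:
        raise RuntimeError
    except RuntimeError:
        pass
    # In the CPython of this era, the last traceback remained reachable
    # through sys.exc_info() even after the except block, and its frames
    # held a reference to 'obj', so the refcount could not drop to zero
    # when the handler finished.

handler()
print("handler returned")  # "__del__ ran" might only appear after this,
                           # once the saved traceback was finally cleared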

I assumed that you knew this, and I assumed that you were aware of the
limitations of reference counting (in particular, that you cannot rely
on a reference count ever going to zero). That is why I took your
statement to mean something else.

The limitations you refer to occur extremely infrequently in practice.
Besides which, normally one does not create multiple references to
RAII objects. Even if one did, I've never seen a situation in which a
cycle involving a RAII object was created. I think it's unlikely that
this would ever occur, simply because RAII objects tend not to
reference anything that they don't explicitly own. That's the whole
point of RAII, after all.

Bear in mind, we're not talking about reference-counting *everything*,
only those objects that *explicitly* define a __del__ method. Those
are few and far between.

But if I assumed incorrectly, then you should know this: reference
counting simply doesn't take away the need to explicitly release
resources, unless you don't care about robustness (the property you
claimed RAII provides). It works for the common case most of the time,
but the danger always lurks that a reference could get trapped
somewhere, or that a cycle could arise.
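
That danger is easy to reproduce (a sketch; Guard and Node are invented
names, and under the CPython of this era a cycle containing an object
with __del__ was never collected; it was parked in gc.garbage instead):

import gc

class Guard:
    def __del__(self):
        print("resource released")

class Node:
    pass

def make_cycle():
    a, b = Node(), Node()
    a.peer, b.peer = b, a   # a reference cycle...
    a.guard = Guard()       # ...that happens to own a RAII-style object

make_cycle()
gc.collect()
print(gc.garbage)  # under 2.x-era CPython the cycle lands here, and
                   # Guard.__del__ never runs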

It's far less likely that an unwary programmer creates a cycle
involving a RAII object (for the reasons I explained above) than that
he forgets to write a try/finally block.

At any rate, this criticism is beside the point: even if RAII doesn't
solve all problems, that's still not a good enough reason not to
consider it for inclusion in Python.

I have yet to hear a strong argument as to why it would be impossible
to implement on Python's current platforms.

If you don't want that stuff happening, then you'd better use an
explicit finally block, reference-counted finalization or not.

Destruction, please. Finalization is another thing altogether.


Regards
David Turner
 

David Turner

Terry Reedy said:
Thanks. This makes it much clearer what behavior some people miss, and why
the behavior is hard to simulate in CPython (all PyObjects are heap
allocated).

In fact, the RAII idiom is quite commonly used with heap-allocated
objects. All that is required is a clear trail of ownership, which is
generally not that difficult to achieve. Here's some C++ code I wrote
recently:

typedef boost::shared_ptr<std::ifstream> file_ptr;
typedef std::stack<file_ptr> file_stack;

file_stack files;
Token token;  // Token is assumed to be defined elsewhere

files.push(file_ptr(new std::ifstream(file_to_compile)));
while (!files.empty())
{
    // shared_ptr::get() returns a raw pointer, so dereference instead
    // to bind the reference:
    std::ifstream& f = *files.top();
    f >> token;  // read the next token (assumes a suitable operator>>)
    if (token == Token::eof)
        files.pop();
    else
        parse(token);
}

The RAII object in this case is std::ifstream. What I have is a stack
of file objects, wrapped in the reference-counting container
boost::shared_ptr. I know that the files will be closed no matter
what happens in the while() loop, because the ifstream class closes
the file when it is destroyed. It will be destroyed by shared_ptr
when the last reference disappears. The last reference will disappear
(assuming I haven't stored another reference in some outer scope) when the
"files" container goes out of scope.

It's pretty darn hard to mess something like this up. So yes, you
*can* use heap-based objects as RAII containers.
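
For comparison, a rough Python analogue of the same idiom, leaning on
CPython's reference counting, might look like this (next_token, parse,
Token and file_to_compile are placeholders, not names from the C++
above):

files = []
files.append(open(file_to_compile))
while files:
    f = files[-1]
    token = next_token(f)
    if token is Token.eof:
        files.pop()  # drop the stack's reference
        del f        # and the loop-local alias; with reference counting
                     # the file object dies here, closing the handle
    else:
        parse(token)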

Regards
David Turner
 

David Turner

Carl Banks said:
[snip]
I don't need to know whether my function is the sole user of an object
and it falls to me to free it when I'm done, because the system takes
care of that. I get it, I use it, I forget about it.

The problem is, you can't always afford to forget about it. Sometimes
you have to make sure that at this point in the program, this resource
has been released.

If you're relying on garbage collection to do that for you, you're
asking for trouble.


Carl, this is why we're suggesting a mechanism other than garbage
collection to deal with this. It's very handy to wrap a resource in
an object and put the acquire() in __init__ and the release() in
__del__. But this only works if we know when __del__ is going to be
called. Therefore, when an object explicitly declares __del__ (which
happens very rarely), it should use a reference counting mechanism to
decide when to call __del__ *as opposed to calling __del__ when the
object is garbage collected*.

This is exactly how CPython works at the moment, bar one exceptional
case which needs to be fixed.
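
A minimal sketch of the idiom being described, assuming a lock-like
object with acquire() and release() methods (LockGuard and do_work are
invented names):

import threading

class LockGuard:
    def __init__(self, lock):
        self.lock = lock
        self.lock.acquire()   # acquire() in __init__ ...

    def __del__(self):
        self.lock.release()   # ... release() in __del__

def do_work():
    pass  # stand-in for the real work

def critical_section(lock):
    guard = LockGuard(lock)
    do_work()
    # Leaving the function drops the only reference to 'guard'. Under
    # reference counting, __del__ runs right here and the lock is
    # released; under a pure garbage collector, the release could happen
    # arbitrarily later -- which is the point of this thread.

critical_section(threading.Lock())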

Regards
David Turner
 

Marcin 'Qrczak' Kowalczyk

You don't need static knowledge of which names refer to deterministic
objects. You only need to do the bookkeeping at three points: when
binding a name, when exiting a function scope, and when calling __del__
on an object.

You must manage reference counts on any parameter passing (increment),
function return (decrement the refcounts of local variables), function
result (increment the refcount of the value being returned if it's a
constant or a global variable), and assignment (increment one refcount,
decrement another). Each decref is accompanied by a test for zero.

This is a high overhead. You basically impose the overhead CPython has
on all implementations, even those which already manage the liveness of
their objects using a better GC scheme.

It doesn't help that only some objects must have their reference counts
changed, because you don't statically know which ones they are, so you
must check all potential objects anyway.
 

Marcin 'Qrczak' Kowalczyk

The D programming language somehow contrives to have both garbage
collection and working destructors. So why can't Python?

What happens in D when you make a pointer to a local variable,
the variable goes out of scope, and you use the pointer?

Undefined behavior? If so, it's unacceptable for Python.
 

Isaac To

Marcin> What happens in D when you make a pointer to a local variable,
Marcin> the variable goes out of scope, and you use the pointer?

Marcin> Undefined behavior? If so, it's unacceptable for Python.

After reading some docs of D, I find no reference to the behaviour that
Turner talked about. There is a way to make sure memory is deallocated
when a function exits, which uses the C alloca function. Such objects
*must not* have destructors. It is rather obvious what the run-time
does: it simply allocates them on the stack using the GNU C alloca
function and then forgets about them completely. When the function
returns, the stack-pointer manipulation gets rid of the object. It is
easy for these objects to work with D's copying collector: just scan
the stack as well when doing a copy-collect. Since they have no
destructors, destroying such objects during stack unwinding has no
consequence at all, other than freeing up some memory, which could
equally well happen during the next GC run.

There are some mentions of reference counting, but it must be
implemented by the programmer himself. I.e., the programmer must
allocate fields himself to hold the reference counts, and do the
increments and decrements himself. The D runtime does no reference
counting, and the program will be buggy (e.g., it will call a
destructor too early while a reference to the object still exists) if
the programmer misses an AddRef or Release somewhere. Indeed, from the
docs of D about the "benefits of GC", it is quite obvious that D will
do anything to avoid having to do reference counting in the runtime.
I'm wondering whether Turner made the behaviour up.

Regards,
Isaac.
 

Duncan Booth

Donn Cave said:
Er, Python guarantees that the object will almost always be destroyed?
What the heck does that mean?

It means the same as it does in other systems with garbage collection
(such as Java or .NET). Your object will be destroyed when the garbage
collector determines that it is unreachable. The 'almost always' is
because even CPython's reference counting, under which the system
attempts to free all objects when your program exits, will give up if
you deliberately set out to prevent it from freeing everything.

I don't know of any system, using either reference counting or garbage
collection, that actually guarantees every object will be fully
finalised on program termination.
| Guaranteeing that all resources will be released at a specific time
| has implications for performance, and finalisers are actually pretty
| rare in practice, so the usual solution is to compromise.

It isn't clear to me what it takes to guarantee immediate
finalization, so if it is clear to you, that might be worth spelling
out a little more. It doesn't cost anything in the ordinary case; I
mean, with CPython's reference-count storage model, if you get it, you
get it for free.

You can guarantee immediate finalization for some objects by reference
counting, but reference counting is an overhead. Just because CPython
is paying that cost doesn't mean the cost isn't there.
 

David Turner

Marcin 'Qrczak' Kowalczyk said:
On Mon, 14 Jun 2004 00:05:45 -0700, David Turner wrote:

It doesn't help that only some objects must have reference counts changed,
because you statically don't know which are these, so you must check all
potential objects anyway.

That's true, but the check could be cached so that all it would amount
to is a single test-and-branch. The crucial difference is that
there's no need to acquire a mutex.
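
One way to picture that caching (an illustrative sketch, not CPython
internals; the cache and helper names are invented):

_has_del = {}  # type -> bool, computed once per type

def has_del(obj):
    tp = type(obj)
    try:
        return _has_del[tp]   # the common path: one lookup and one
    except KeyError:          # test-and-branch
        flag = any('__del__' in klass.__dict__ for klass in tp.__mro__)
        _has_del[tp] = flag
        return flag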

Regards
David Turner
 

David Turner

Marcin 'Qrczak' Kowalczyk said:
What happens in D when you make a pointer to a local variable,
the variable goes out of scope, and you use the pointer?

Undefined behavior? If so, it's unacceptable for Python.

Are you suggesting that my proposed modification to the destructor
semantics will result in such misbehaviour? If so, please tell me why
you believe that to be the case.

We're not talking about pointers here, we're talking about
reference-counted objects.

Regards
David Turner
 

Aahz

Yes, of course. As I pointed out elsewhere, there is a weakness in
the current CPython implementation, in that exceptions grab extra
references to locals for the traceback. A relatively minor code block
at the end of each exception handler could fix this problem.

You're far more likely to succeed in changing exception handling
semantics than general object handling semantics. If you're serious
about fixing this hole in Python, write a PEP.
 

Marcin 'Qrczak' Kowalczyk

Are you suggesting that my proposed modification to the destructor
semantics will result in such misbehaviour? If so, please tell me why
you believe that to be the case.

It answers "why can't Python if D can". If D works similarly to C++ here
(I couldn't find the object model details described in D documentation),
it trades safety for speed by embedding objects inside variables. This
makes destructors easy, because you statically know when an object will
be destroyed and you just don't care if someone still refers to it.

But Python doesn't put objects inside variables, so it's impossible to
accurately and quickly find which objects are no longer referred to.
I don't believe D promises to run destructors of heap-allocated objects
automatically at the very moment the last reference to them is dropped.
 

Donn Cave

Duncan Booth said:
It means the same as it does in other systems with garbage collection
(such as Java or .NET). Your object will be destroyed when the garbage
collector determines that it is unreachable. The 'almost always' is
because even CPython's reference counting, under which the system
attempts to free all objects when your program exits, will give up if
you deliberately set out to prevent it from freeing everything.

Well, I think more could be said about that, and has been in this
thread. My point was only that `guarantee that ... almost always'
adds up to nonsense. Sorry I didn't spell that out.
You can guarantee immediate finalization for some objects by reference
counting, but reference counting is an overhead. Just because CPython
is paying that cost doesn't mean the cost isn't there.

As long as Python is doing it that way, it's academic whether
the cost is here or there. But I see we're again talking about
a vacuous guarantee, as in `guarantee .. for some objects', kind
of a shift of subject. If that's really what you were trying to
say, then thanks for clearing that up. Otherwise, I'm confused.
Guaranteeing that all resources will be released at a specific time
would presumably mean solving the problems with cycles and __del__,
module shutdowns etc. It looked like you knew of some likely
extra cost there.

Donn Cave, (e-mail address removed)
 

Carl Banks

David said:
The limitations you refer to occur extremely infrequently in practice.

1. That's not my experience at all (it's not that the objects
themselves are involved in reference cycles; it's that they're members
of some other object that is).

2. Your claim of robustness is out the window. What you ask for does
not free the programmer from having to worry about when the
resource will be freed. It merely changes the question to, "Is it
ok if there are cases when this resource totally fails to be
freed?".

If that's all you want, and if it doesn't adversely affect other
concerns (say, performance) too much, then I don't object to it, but it
had better come with a disclaimer in big letters, because, unlike in
C++, it doesn't guarantee anything.
 

Carl Banks

David said:
Carl Banks said:
[snip]
I don't need to know whether my function is the sole user of an object
and it falls to me to free it when I'm done, because the system takes
care of that. I get it, I use it, I forget about it.

The problem is, you can't always afford to forget about it. Sometimes
you have to make sure that at this point in the program, this resource
has been released.

If you're relying on garbage collection to do that for you, you're
asking for trouble.


Carl, this is why we're suggesting a mechanism other than garbage
collection to deal with this.
[snip]

Replace "garbage collection" with "automatic finalization" and
everything I just said is just as true.

This is exactly how CPython works at the moment, bar one exceptional
case which needs to be fixed.

What case is that?
 

Aahz

In fact, the RAII idiom is quite commonly used with heap-allocated
objects. All that is required is a clear trail of ownership, which is
generally not that difficult to achieve.

Not really. What you're doing is what I'd call a "virtual stack", by
virtue of the fact that the heap objects are being managed by stack
objects.
Here's some C++ code I wrote recently:

typedef boost::shared_ptr<std::ifstream> file_ptr;
typedef std::stack<file_ptr> file_stack;

file_stack files;
Token token;  // Token is assumed to be defined elsewhere

files.push(file_ptr(new std::ifstream(file_to_compile)));
while (!files.empty())
{
    // shared_ptr::get() returns a raw pointer, so dereference instead
    // to bind the reference:
    std::ifstream& f = *files.top();
    f >> token;  // read the next token (assumes a suitable operator>>)
    if (token == Token::eof)
        files.pop();
    else
        parse(token);
}

The RAII object in this case is std::ifstream. What I have is a stack
of file objects, wrapped in the reference-counting container
boost::shared_ptr. I know that the files will be closed no matter
what happens in the while() loop, because the ifstream class closes
the file when it is destroyed. It will be destroyed by shared_ptr
when the last reference disappears. The last reference will disappear
(assuming I haven't stored another reference in some outer scope) when the
"files" container goes out of scope.

It's pretty darn hard to mess something like this up. So yes, you
*can* use heap-based objects as RAII containers.

It's pretty darn easy: what happens when you pass those pointers outside
of their stack holders? That's a *normal* part of Python programming;
changing that requires a whole new paradigm.
 

Martin v. Löwis

Roger said:
Actually I did.

Can you please repeat your proposed solution? I must have
missed it. All I can find is

<quote>
I would do the same thing that shutdown does. Run
the destructors, but don't delete the underlying
memory. Set all the fields to None (just as shutdown
points modules at None). Finally in a second round
actually free the memory.
</quote>
(the alternative approach of throwing an exception apparently
omitted)

This specifies that you want to do the same thing as
shutdown, but later you say that you don't want to
do the same thing as shutdown does:

<quote>
I meant that during shutdown the modules are forcibly
garbage collected, and to a certain extent Python
treats their names like weak references
</quote>

This is not true: In Python, modules are not forcibly
garbage collected at shutdown. Instead, they are
emptied, and explicitly removed from sys.modules.

You also specify "Run the destructors, but don't
delete the underlying memory". This is, unfortunately,
not a complete specification. It does not specify
what to do, and when to do it. While it says
that we should run destructors, it does not say
which objects to run the destructors for.

You clearly don't mean: "After each statement, run
the destructors on all objects". But I cannot guess
what it is that you mean.

Regards,
Martin
 

David Turner

It's pretty darn easy: what happens when you pass those pointers outside
of their stack holders? That's a *normal* part of Python programming;
changing that requires a whole new paradigm.

What happens is that you get an extra reference to it, just as you'd
expect. When that reference is released, the file closes.

So, no new paradigms there then.
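
A tiny demonstration of that claim under CPython's reference counting
(Traced is an invented name):

class Traced:
    def __del__(self):
        print("closed")

escaped = []

def scope():
    t = Traced()
    escaped.append(t)  # the reference escapes its "stack holder"
    # leaving scope() drops the local reference, but not the escaped one

scope()
print("still alive")   # nothing has been closed yet
escaped.pop()          # the last reference disappears: "closed" prints now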

Regards
David Turner
 

David Turner

Hi
You're far more likely to succeed in changing exception handling
semantics than general object handling semantics. If you're serious
about fixing this hole in Python, write a PEP.

I agree that it would be far easier just to change the exception
handling semantics.

In certain implementations, it would be very much harder to take the
next step towards reference counting objects with __del__.

So what I'll do is write a PEP to fix the exception-handling semantics
in CPython, and hope that pressure from users who discover how much
easier it is to write RAII-style code will eventually bring reference
counting to Jython and friends.

Oops, I revealed my Evil Master Plan...

Regards
David Turner
 
