Ron said:
My question is a simple one: how do we combine destructors with GC?
The most straightforward way is for the garbage collector to know how
to invoke the equivalent of delete on the objects. (I would abhor a
scheme whereby GC'able objects have to inherit from some special base
class with a virtual destructor; it would be better if this was
intelligent somehow).
For some types of objects, calling the destructor at GC time might be
too late. Fine; those objects have to be coded with two-step
destruction. Deterministically do the actions that have to be taken,
using a chain of member functions. The destructor then behaves as a
finalizer. It ensures that those actions happen, plus any other cleanup
actions.
A class written like that can be used with garbage collection, as well
as with new/delete and RAII.
Destructors do not become superfluous just because one usage for them
does. Closing files and shutting down other resources in a timely manner
becomes hard when object termination occurs at an indeterminate time.
The indeterminate lifetime of an object isn't caused by garbage
collection. It's caused by the semantics of the program. Garbage
collection solves the problem of computing that lifetime.
If you don't have garbage collection or some other scheme, you still
have to compute the lifetime of an object and call for it to be
destroyed by explicit delete.
And if you do that, that delete call may also be too late in releasing
an operating system resource.
You simply have to regard that operating system resource handle as
having its own lifetime, which is contained within the lifetime of the
encapsulating object.
You can take the responsibility for computing that contained lifetime,
and let the garbage collector determine the main lifetime.
You don't want to use the destructor for ending the resource lifetime,
because that will turn your entire object into garbage, while it is
still reachable.
So the obvious thing is to release the resource and change the state of
the object to indicate that the object does not have that resource.
Garbage collection is not incompatible with your program knowing when
to release a resource.
class ResourceWrapper {
private:
    SomeResource *res;
public:
    // ...
    void ReleaseResource()
    {
        if (res != 0) {
            DestroyResource(res);
            res = 0;
        }
    }
    virtual ~ResourceWrapper()
    {
        ReleaseResource();
    }
};
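To make the two-step idea concrete, here is a minimal, self-contained
sketch of how such a class might be used. SomeResource, CreateResource(),
DestroyResource(), HasResource() and the g_destroyed counter are all
made-up stand-ins for illustration; they are not any real API. The point
is only that an early, explicit release and the later destructor call do
not step on each other.

```cpp
#include <cassert>

// Hypothetical stand-ins for a resource API; for illustration only.
struct SomeResource { int handle; };
static int g_destroyed = 0;   // counts calls to DestroyResource()

SomeResource *CreateResource() { return new SomeResource{42}; }
void DestroyResource(SomeResource *r) { ++g_destroyed; delete r; }

class ResourceWrapper {
private:
    SomeResource *res;
public:
    ResourceWrapper() : res(CreateResource()) {}
    ResourceWrapper(const ResourceWrapper &) = delete;
    ResourceWrapper &operator=(const ResourceWrapper &) = delete;

    bool HasResource() const { return res != nullptr; }

    // Step one: timely release. Safe to call more than once.
    void ReleaseResource()
    {
        if (res != nullptr) {
            DestroyResource(res);
            res = nullptr;
        }
    }

    // Step two: the destructor acts as a finalizer of last resort.
    virtual ~ResourceWrapper() { ReleaseResource(); }
};

// Releases early, then lets the destructor run.
// Returns how many times the resource was destroyed.
int DemoEarlyRelease()
{
    g_destroyed = 0;
    {
        ResourceWrapper w;
        w.ReleaseResource();        // the program decides when
        bool stillUsable = !w.HasResource();
        (void)stillUsable;          // w is alive, just without a resource
    }                               // destructor finds nothing left to do
    return g_destroyed;
}

// Never releases explicitly; the destructor has to do it.
int DemoDestructorOnly()
{
    g_destroyed = 0;
    {
        ResourceWrapper w;
    }
    return g_destroyed;
}
```

Either way the resource is destroyed exactly once; the only difference
is who chose the moment.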
Okay, so that gets the obvious out of the way. Now the problem.
The issue is that functions like ResourceWrapper::ReleaseResource() are
ad hoc, whereas a destructor is a formalism built into the language.
The destructor formalism does something nice. Namely, it ensures that
the member and base destructors are called.
Here, the partial cleanup done by ReleaseResource() has the
responsibility of doing whatever needs to be done in the base and
member objects, if anything. If ReleaseResource() is virtual and is
overridden, the derived ReleaseResource() will probably have to call
the parent one.
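A small sketch of that hazard, using invented class names (Base,
Careful, Forgetful) and a global log purely for illustration: nothing in
the language makes the derived override chain to the parent, so one
forgotten call silently skips the base cleanup.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Records the cleanup steps so the chaining (or lack of it) is visible.
static std::vector<std::string> g_log;

class Base {
public:
    virtual ~Base() = default;
    virtual void ReleaseResource() { g_log.push_back("Base cleanup"); }
};

class Careful : public Base {
public:
    void ReleaseResource() override
    {
        g_log.push_back("Careful cleanup");
        Base::ReleaseResource();   // manual chaining; nothing enforces it
    }
};

class Forgetful : public Base {
public:
    void ReleaseResource() override
    {
        g_log.push_back("Forgetful cleanup");
        // Base::ReleaseResource() is never called, and it compiles fine.
    }
};

// Calls ReleaseResource() through the base interface and returns the log.
std::vector<std::string> RunRelease(Base &obj)
{
    g_log.clear();
    obj.ReleaseResource();
    return g_log;
}
```

With a destructor the compiler would have supplied the call to the base
for free; here it is a convention that each author has to remember.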
In the intelligently designed Common Lisp Object System, any method can
be endowed with auxiliary methods which are called if that primary
method is called. The auxiliary methods can be specialized throughout
the class lattice, and be designated as "before", "after" or "around".
In CLOS terminology, the automatic constructor calling in C++ resembles
before methods being fired. Whereas destructors are after-methods. Sort
of. The most derived destructor that is called is kind of like the
primary method, and the base ones that are called are like
after-methods. There is no counterpart to the automatic calling of
destructors on member objects.
C++ could benefit from member functions which can be extended with
auxiliaries. In a class having some virtual function Foo() it would be
nice to be able to define a special overload of Foo() which is always
called before Foo(), and another overload which is always called after.
That is to say, if the virtual Foo() is invoked on the object, then the
before Foo() is called in that class, and in all the derived classes
which also have one. Then the appropriate override of Foo() is invoked,
at whatever level in the class hierarchy that may be, and then the
afters are called, in derived to base order.
With befores and afters, certain aspects of resource cleanup would be
easier to manage. The ResourceWrapper would look like this:
    virtual void ReleaseResource()
    {
        // nothing to do here now; it's moved to the after function
    }
    after void ReleaseResource()
    {
        if (res != 0) {
            DestroyResource(res);
            res = 0;
        }
    }
So now if ReleaseResource() is called on that object, no matter how far
down the hierarchy its class is derived, the ReleaseResource()
after-function is called.
(Provided that no bullshit happens with exceptions!) If someone derives
from this class and overrides ReleaseResource(), that will not prevent
our cleanup from happening.
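The "after" keyword is of course hypothetical and won't compile today.
For comparison, something roughly similar can be approximated in
standard C++ with the non-virtual interface idiom: a public non-virtual
function calls a private virtual hook and then does the mandatory
cleanup itself. This is a different technique, not the proposal, and it
only covers one level of the hierarchy. The names DoReleaseResource(),
ReleaseOwnResource(), Derived, hookRan and g_destroyed are invented for
this sketch.

```cpp
#include <cassert>

// Hypothetical stand-ins for a resource API; for illustration only.
struct SomeResource { int handle; };
static int g_destroyed = 0;
void DestroyResource(SomeResource *r) { ++g_destroyed; delete r; }

class ResourceWrapper {
private:
    SomeResource *res;

    // Overridable part; plays the role of the primary function.
    virtual void DoReleaseResource() {}

    // Mandatory part; plays the role of the "after" function.
    void ReleaseOwnResource()
    {
        if (res != nullptr) {
            DestroyResource(res);
            res = nullptr;
        }
    }
public:
    ResourceWrapper() : res(new SomeResource{1}) {}
    ResourceWrapper(const ResourceWrapper &) = delete;
    ResourceWrapper &operator=(const ResourceWrapper &) = delete;

    // Non-virtual entry point: derived classes cannot replace it.
    void ReleaseResource()
    {
        DoReleaseResource();    // whatever the derived class wants
        ReleaseOwnResource();   // always runs, unless the hook throws
    }

    // Only the non-virtual part is called here, since virtual dispatch
    // in a destructor would not reach the derived hook anyway.
    virtual ~ResourceWrapper() { ReleaseOwnResource(); }
};

class Derived : public ResourceWrapper {
public:
    bool hookRan = false;
private:
    // Overrides the hook and never mentions the base class cleanup.
    void DoReleaseResource() override { hookRan = true; }
};
```

The shortcoming is visible right away: if Derived wants its own
guaranteed cleanup that further-derived classes can't skip, it has to
repeat the trick with yet another hook name, which is exactly the
ad hoc bookkeeping that befores and afters would take over.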
These before and after functions don't need any weird magic in the
vtable or anything. They are not virtual (and in fact making them so
ought to be forbidden).
It works simply like this. When the compiler translates the virtual
function which looks like this:
    void ReleaseResource(int arg)
    {
        BODY;
    }
it generates the code as:
    virtual void ReleaseResource(int arg)
    {
        BEFORE(arg);
        BODY;
        AFTER(arg);
    }
Here, BEFORE stands for a call to the outermost "before
ReleaseResource" function, and AFTER for a call to the nearest "after
ReleaseResource" function. Both functions are called in the normal way.
Imagine scope resolution being used to specify them exactly, with
whatever funny names the compiler knows them by.
And of course the befores and afters are automatically instrumented
with code which ensures their own continuity. E.g. an
"after void ReleaseResource(int arg) { BODY; }" is generated as:
    after void ReleaseResource(int arg)
    {
        BODY;
        AFTER(arg);
    }
No exception safety there; that is deliberate. If anything throws, the
subsequent actions are not invoked.
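Here is what that translation might look like if written out by hand in
today's C++ for a two-level hierarchy. This is only a sketch of the
after-chain; befores are left out. After_ReleaseResource is an invented
name standing in for whatever the compiler would really call the after
function, and Base, Derived, failInAfter, g_trace and Trace() exist only
to make the call order observable.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

static std::vector<std::string> g_trace;   // records the call order

class Base {
public:
    virtual ~Base() = default;

    // Hand-expanded form of Base's virtual ReleaseResource(int).
    virtual void ReleaseResource(int arg)
    {
        g_trace.push_back("Base body");    // BODY
        After_ReleaseResource(arg);        // AFTER(arg): nearest after
    }
protected:
    // Stand-in for "after void ReleaseResource(int)" in Base.
    // Non-virtual; there is no further base, so the chain ends here.
    void After_ReleaseResource(int)
    {
        g_trace.push_back("Base after");
    }
};

class Derived : public Base {
public:
    bool failInAfter = false;

    // Hand-expanded form of the override in Derived.
    void ReleaseResource(int arg) override
    {
        g_trace.push_back("Derived body"); // BODY
        After_ReleaseResource(arg);        // nearest after is Derived's
    }
protected:
    // Stand-in for "after void ReleaseResource(int)" in Derived.
    void After_ReleaseResource(int arg)
    {
        g_trace.push_back("Derived after");        // BODY
        if (failInAfter)
            throw std::runtime_error("after failed");
        Base::After_ReleaseResource(arg);          // AFTER(arg)
    }
};

// Runs ReleaseResource() through the base interface, returns the trace.
std::vector<std::string> Trace(Base &obj)
{
    g_trace.clear();
    try {
        obj.ReleaseResource(0);
    } catch (const std::runtime_error &) {
        g_trace.push_back("threw");
    }
    return g_trace;
}
```

Note that the override's body never mentions Base, yet Base's after
function still runs, in derived-to-base order. And when the Derived
after throws, the Base after is skipped, which matches the deliberate
lack of exception safety described above.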