New C++ garbage collector


Matthias Meixner

Just in case you are interested, there is a new garbage collector for C++:

http://sourceforge.net/projects/meixnergc/

So what does it do that others cannot? Here is the list:

- It is not a conservative garbage collector like, e.g., the Boehm
collector. Therefore, all garbage will eventually be cleaned up.

- Due to using mark and sweep it has no problems with cyclic data
structures (no reference counting).

- It uses incremental garbage collection to spread the workload of the
garbage collector smoothly over the whole runtime.

- Its non-intrusive design allows it to handle all kinds of
data/objects.

- It provides a universal smart pointer that can be used to point to
anything, single objects and arrays (no different smart pointer types
for objects and arrays). It may also safely be used to point to data on
stack or global data.

- It simulates the normal automatic pointer type conversion, e.g.
gc_ptr<A> is automatically converted to gc_ptr<B> if A is a subclass of B.
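
The implicit conversion in the last bullet mirrors what raw pointers already do (a pointer to a derived class converts to a pointer to its base). As a minimal sketch of how a smart pointer template can offer the same thing, the code below uses a hypothetical `demo_ptr`; it is not the real `gc_ptr` from meixnergc, whose implementation is not shown in this thread, and it does no memory management at all.

```cpp
#include <type_traits>

// Hypothetical stand-in for a smart pointer; NOT the real gc_ptr from meixnergc.
// It only wraps a raw pointer to illustrate the conversion rule.
template <typename T>
class demo_ptr {
public:
    demo_ptr(T* p = nullptr) : ptr_(p) {}

    // Converting constructor: takes part in overload resolution only when
    // U* converts implicitly to T*, e.g. when U is derived from T.
    template <typename U,
              typename = typename std::enable_if<
                  std::is_convertible<U*, T*>::value>::type>
    demo_ptr(const demo_ptr<U>& other) : ptr_(other.get()) {}

    T* get() const { return ptr_; }
    T* operator->() const { return ptr_; }
    T& operator*() const { return *ptr_; }

private:
    T* ptr_;
};

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};

struct Derived : Base {
    int id() const override { return 2; }
};

// Accepts a pointer-to-base; a demo_ptr<Derived> converts implicitly.
inline int id_of(demo_ptr<Base> p) { return p->id(); }
```

With these definitions, passing a `demo_ptr<Derived>` to `id_of` compiles and dispatches virtually, while the reverse conversion (`demo_ptr<Base>` to `demo_ptr<Derived>`) is rejected at compile time, just as with raw pointers.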

Currently it is designed to be used with gcc / pthreads. The
single-threaded version should be usable with any C++ compiler. The
multi-threaded version should work on any platform that supports
POSIX threads.

So what do you think?

- Matthias Meixner
 

debra h

The collector is also used by a number of programming language
implementations that either use C as intermediate code, want to
facilitate easier interoperation with C libraries, or just prefer the
simple collector interface. For a more detailed description of the
interface, see here: http://www.hpl.hp.com/personal/Hans_Boehm/gc/
 

SG

Just in case you are interested, there is a new garbage collector for C++:
http://sourceforge.net/projects/meixnergc/

[...]

The bullet points you list don't really explain much. Your README is
much more informative. Maybe you should consider putting this README
(or some equivalent documentation) somewhere easy to find so people
don't have to download the source code first.

It seems like an interesting concept. Your GC is even trying to invoke
destructors, though I would feel uneasy about this in a multi-threaded
application. When and where does garbage collection actually
take place? It sounds like you're performing "little garbage
collection steps" during each invocation of your overloaded new
operator instead of, say, using a separate GC thread. Anyhow, it must
be hard to get it right in a multi-threaded application. I wouldn't be
surprised if there are a couple of problems/bugs left. BTW: I just
skimmed over the source code and wondered why you felt the need to
explicitly disable inlining for a couple of functions. Usually, when
people observe that their programs break after turning on compiler
optimizations, it's due to undefined behaviour... ;-)

I generally don't feel the need to use/test any garbage collectors for
C or C++. I guess the kind of applications I deal with mostly --
number crunching -- simply doesn't call for garbage collection. I get
along very well without it. That's not to say the topic isn't
interesting!

Cheers!
SG
 

Bo Persson

Leigh said:
So you introduce a new feature (gc) at the expense of another
feature (dtors)? Seems pointless to me.

Not really. If you want deterministic destruction, you use a
destructor. If you don't care about the objects, and just want to
recycle the memory, you might use a gc.

The "firm consensus" was that you don't want destructors to run
(closing a file, releasing a lock, flushing a buffer, closing a
socket?) at random moments when you try to allocate more memory. Not
that everyone agrees on this.


Bo Persson
 

Öö Tiib

Resulting in resource leaks?  Seems pointless to me.

The real problem is that the code leaked the object.

A GC calling destructors at an undetermined moment (no GC can
guarantee anything sooner than "never") adds another problem on top of
that real problem.

For example, your end users have an issue. Something tries to reuse the
resource but fails, because for them the GC is sometimes too lazy with
that particular object. The symptoms that show up may be totally
unrelated to that resource.

For your user support the GC might always be eager. So your user support
cannot reproduce the issue, and now there are two annoyed and unhappy
people talking to each other over the phone thanks to your
"non-pointless" GC.
 

Marc

Leigh said:
I would argue that it depends on the resource in question as to whether
or not it is ok to free it at a random moment. Memory can be considered
as one such resource.

Not calling destructors means you cannot make use of RAII (one of the
most important C++ features). Garbage collected C++ to me seems like a
pointless mental exercise. If calling delete on an object just called
the destructor but left memory deallocation to the garbage collector
then the garbage collector becomes, at worst, redundant and, at best, a
way to avoid sloppy programmer memory leaks which if RAII was being
employed would be unlikely in any event.

You seem to see this as all-or-nothing. I am not at all a specialist
in garbage collection, but it seems to me that you could still run
destructors normally and predictably for objects that reach the end of
their scope or are explicitly deleted. What remains are those objects
that would be leaked without a GC, and for those you would have the
choice of running the destructor or not. If you program carefully
enough, you know what types of objects may be leaked and ensure their
destructor is an appropriate finalization; otherwise it seems safer to
pretend the objects live forever (true leakage) and reclaim the memory
in a way that can't be observed (i.e. without running a destructor in
particular). Now with too much separation between the GCed objects and
the regular objects, you may lose some of the advantages of having a GC.

As for the usual misconception that the GC is used only by sloppy
programmers who can't handle memory allocation properly... Let's avoid
this conversation and just discuss the best possible GC under the
assumption that for some project we want one.
 

Gert-Jan de Vos

Marc said:
You seem to see this as an all-or-nothing. I am not at all a specialist
of garbage collection, but it seems to me that you could still run
destructors normally and predictably for objects that reach the end of
their scope or are explicitly deleted. What remains are those objects
that would be leaked without a GC, and for those you would have the
choice of running the destructor or not. If you program carefully
enough, you know what type of objects may be leaked and ensure their
destructor is an appropriate finalization; otherwise it seems safer to
pretend the objects live forever (true leakage) and reclaim the memory
in a way that can't be observed (ie without running a destructor in
particular).  Now with too much separation between the GCed objects and
the regular objects, you may lose some of the advantages of having a GC.

C++ always had deterministic destruction. Existing C++ code and C++
idioms rely on this. If you call destructors when finalizing an object
from a GC, it will break C++ code that was not written with GC in
mind.

I have used MS's C++/CLI a bit, which tries to provide full C++ in a
garbage-collected environment. It separates the standard destructor and
the finalizer of an object. The runtime makes sure that only one of these
gets called. The destructor gets called as normal, when an automatic object
reaches the end of its scope or when a new-ed object is deleted. The
finalizer is called when the GC collects the object if it was not
destructed before. The context in which the finalizer runs is entirely
different from that of the destructor: you cannot use members or base
objects. The fact that it also runs non-deterministically gave us a hard
time trying to reproduce and debug problems in finalizer code. My current
conclusion is that all objects that need deterministic lifetime management
(because they own resources with side effects, other than plain memory)
must be implemented as non-GC objects. Then add a trivial GC-ed wrapper
object to expose the object to the GC world. Even objects that need large
amounts of memory (we work with 64 MB datasets on a 32-bit OS) are too
large to rely on a GC only.

In the end I have to agree with Leigh: C++-style deterministic lifetime
management does not mix well with a garbage collector. Still, a unique
advantage of a GC is that a dangling pointer will never refer to memory
other than the object for which it was allocated.
 

Öö Tiib

I replied else-thread that certain resources (of which memory can be
considered one) may be suitable for non-deterministic releasing.  C++
and RAII is superior to C++ and garbage collection; this is my opinion
of course and (hopefully) the opinion of others here too.

Yes, but it seems that you think that RAII and destructors are
unavailable with GC. That is not true. Everything works as normal. The
only thing the GC checks (even though you still explicitly delete
everything when you want to) is that no dangling pointers to the object
remain. If any do, it does not let the underlying memory be reused for
something else with new. That makes dereferencing a dangling pointer
not UB but a detectable defect with a well-predictable outcome. Accessing
something out of bounds still remains a UB defect, but since it usually
affects close neighbors it is easier to track down.
 

Matthias Meixner

On 26.10.2010 10:35, SG wrote:
Just in case you are interested, there is a new garbage collector for C++:
http://sourceforge.net/projects/meixnergc/

[...]

The bullet points you list don't really explain much. Your README is
much more informative. Maybe you should consider putting this README
(or some equivalent documentation) somewhere easy to find so people
don't have to download the source code first.

It seems like an interesting concept. Your GC is even trying to invoke
destructors, though, I would feel uneasy about this in a multi-
threaded application. When and where does garbage collection actually
take place? It sounds like you're performing "little garbage
collection steps" during each invocation of your overloaded new
operator instead of, say, using a separate GC thread.

Yes. Therefore, there is no need to tune the speed of some GC thread.
The garbage collector speed automatically adapts to the speed of the
object allocation.

SG said:
Anyhow, it must
be hard to get it right in a multi-threaded application.

Garbage is not used by any thread any more or it would not be garbage.
Therefore, I do not expect any additional problems due to multithreading
that would not already exist without the garbage collector.

SG said:
I wouldn't be
surprised if there are a couple of problems/bugs left. BTW: I just
skimmed over the source code and wondered why you felt the need to
explicitly disable inlining for a couple of functions.

These functions had a problem with optimizations enabled with
-fstrict-aliasing when inlined due to some pointer casts. This is
resolved in the newest version in the svn repository.

SG said:
Usually, when
people observe that their programs break after turning on compiler
optimizations, it's due to undefined behaviour... ;-)

In this case the optimizer just overlooked some aliases which resulted
in some wrong assumptions about the code in the optimization phase of
the compiler.
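
For readers unfamiliar with the interaction between pointer casts and -fstrict-aliasing: with that option GCC may assume that pointers to unrelated types never refer to the same object, so reading an object through a cast pointer of a different type can be reordered or optimized away. The sketch below is a generic illustration, not the code from meixnergc (which is not shown in this thread); it reads the bits of a float through `std::memcpy`, which is well-defined, instead of casting `float*` to `uint32_t*`. The function name `float_bits` is made up for this example.

```cpp
#include <cstdint>
#include <cstring>

// Returns the bit pattern of a float WITHOUT casting float* to uint32_t*.
// Dereferencing such a cast pointer would break the aliasing rules, and with
// -fstrict-aliasing the optimizer is allowed to assume it never happens.
inline std::uint32_t float_bits(float value) {
    static_assert(sizeof(std::uint32_t) == sizeof(float), "size mismatch");
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);  // well-defined byte-wise copy
    return bits;
}
```

On a platform with IEEE 754 single-precision floats, `float_bits(1.0f)` yields `0x3F800000`. Compilers typically turn the fixed-size `memcpy` into a plain register move, so this form usually costs nothing compared to the cast.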
 

Matthias Meixner

On 26.10.2010 19:55, Bo Persson wrote:
The "firm consensus" was that you don't want destructors to run
(closing a file, releasing a lock, flushing a buffer, closing a
socket?) at random moments when you try to allocate more memory. Not
that everyone agrees on this.

With my code you can manually start the garbage collector to clean up
all garbage. This allows you to clean up garbage at a deterministic
point in time. BTW, this is what the unit test does to test the correct
operation of the garbage collector.
So while it may be sloppier than having every bit under control, it
relieves you of keeping track of everything yourself. From this point of
view I think it is good to call the destructor to also release
non-memory resources.
 

Matthias Meixner

On 26.10.2010 23:28, Matthias Meixner wrote:
On 26.10.2010 19:55, Bo Persson wrote:

With my code you can manually start the garbage collector to clean up
all garbage. This allows you to clean up garbage at a deterministic
point in time. BTW, this is what the unit test does to test the correct
operation of the garbage collector.
So while it may be sloppier than having every bit under control, it
relieves you of keeping track of everything yourself. From this point of
view I think it is good to call the destructor to also release
non-memory resources.

Just another point to remember: in a mixed environment where you have
both types of objects, garbage-collected objects and manually managed
objects, it is vital to call the destructor or you would introduce
memory leaks. For example, suppose an object internally holds a pointer
to some unmanaged memory. If the destructor of this object were not
called, the memory used by the object itself would be reclaimed, but the
unmanaged memory it points to would not be released, producing a
memory leak.
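
The scenario above can be simulated without any collector. In the sketch below, reclaiming a stack buffer stands in for the GC reclaiming an object's own storage; `Holder`, `live_buffers` and `buffers_left_after_reclaim` are names invented for this illustration, and nothing here uses the meixnergc API.

```cpp
#include <new>

// Counts heap buffers that are currently owned by some Holder.
static int live_buffers = 0;

// An object that owns unmanaged (plain new[]) memory.
struct Holder {
    int* raw;
    Holder() : raw(new int[16]) { ++live_buffers; }
    ~Holder() { delete[] raw; --live_buffers; }
    Holder(const Holder&) = delete;
    Holder& operator=(const Holder&) = delete;
};

// Creates a Holder in local storage, then "reclaims" that storage either
// with or without running the destructor. Returns how many unmanaged
// buffers are still allocated at the moment the storage is given up.
inline int buffers_left_after_reclaim(bool run_destructor) {
    alignas(Holder) unsigned char storage[sizeof(Holder)];
    Holder* h = new (storage) Holder;
    int* raw = h->raw;  // remembered only so the demo can clean up below

    if (run_destructor) {
        h->~Holder();   // releases the unmanaged buffer
    }
    const int left = live_buffers;

    if (!run_destructor) {
        // A real program would have lost this pointer for good; free it
        // here so the demonstration itself does not leak.
        delete[] raw;
        --live_buffers;
    }
    return left;
}
```

`buffers_left_after_reclaim(true)` reports 0 outstanding buffers, while `buffers_left_after_reclaim(false)` reports 1: the object's own storage is gone, but the buffer it pointed to is still allocated and unreachable.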
 

Garrett Hartshaw

Yes, but it seems that you think that RAII and destructors are
unavailable with GC. That is not true. Everything is like normal. Only
thing that GC checks (despite you still explicitly delete everything
when you want to) is that you do not have dangling pointers to it.
Otherwise it does not let you to reuse the underlying memory for
something else with new. That makes dereferencing dangling pointers
not UB but detectable defect with well-predictable outcome. Accessing
something out of bounds still remains UB defect, but since it usually
affects close neighbors it is easier to track down.

IMHO, the GC then seems pointless, as RAII should take care of the
memory problems. While a GC does seem to be a good idea on the surface,
if it comes to be relied on too heavily, a lazy programmer will probably
allow resources other than memory to be leaked, and the GC will do
nothing to stop this. Granted, lack of GC will do nothing to stop this
either, but by making the programmer concentrate more on what is
destroyed when, the lack of GC makes it more likely that such bugs will
be caught early.
--Garrett Hartshaw
 

James Kanze

So you introduce a new feature (gc) at the expense of another
feature (dtors)? Seems pointless to me.

You don't remove destructors; they still work as usual. But the
garbage collector doesn't call them. A garbage collector would
call finalize functions, which are something different. If
a class needs a deterministic destructor (very few do), then
its finalize function might assert that the destructor has been
called.

The most important aspect of garbage collection isn't that it
prevents memory leaks; it's that it prevents, or at least allows
reliably testing for, dangling pointers. The memory handling is
convenient in certain contexts as well: it's one less thing the
programmer has to worry about, and it certainly works a lot
better than the usual shared_ptr solutions, but it's not
a silver bullet either (and we've all had to deal with Java
programs which leaked).
 

James Kanze

On 26/10/2010 18:55, Bo Persson wrote:

[...]
Not calling destructors means you can not make use of RAII
(one of the most important C++ features).

If you never called destructors, that would be a disaster. But
no one has suggested such a thing.

Garbage collected C++ to me seems like a pointless mental
exercise.

It's a useful tool, and it's essential for robustness, in
critical applications (or for that matter, anything connected to
the network, since dangling pointers have been used to break
into systems).
 

James Kanze

Just as RAII prevents memory and resource leaks, RAII also
prevents dangling pointers being an issue.

How?

We're talking about objects with arbitrary lifetimes here.

An experienced (good) C++ programmer will be using RAII in the
form of containers, smart pointers and similar and will be
keeping the use of "naked" pointers to a minimum.

Why? Smart pointers, at least the ones I've seen, don't buy you
anything most of the time.

I cannot remember the last time I had a dangling pointer
issue, it is a rare bug (for me at least).
There is nothing fundamentally wrong with shared_ptr.

Nobody said there was. I use it on a daily basis. It's
a useful tool in certain circumstances. It's not a silver
bullet, however. In some cases, either true garbage collection
or shared_ptr can be used; in such cases, shared_ptr will result
in slightly more work for the programmer, and some cost in
performance, when compared with true garbage collection. In
other cases, shared_ptr doesn't work, and garbage collection
does. And there are cases where shared_ptr works, and garbage
collection doesn't.

This is a typical troll statement I would expect from Mr
Kanze.

It's typical of you to revert to ad hominem, rather than to
discuss real technical issues.
 

James Kanze

On 26/10/2010 18:58, Öö Tiib wrote:

[...]
I replied else-thread that certain resources (of which memory can be
considered one) may be suitable for non-deterministic releasing. C++
and RAII is superior to C++ and garbage collection; this is my opinion
of course and (hopefully) the opinion of others here too.

And C++ with RAII *and* garbage collection is superior to
either. Why limit your options? (If all you have is a hammer,
everything looks like a nail. It's best to have many different
tools in your toolbox, so you can use the most appropriate.)
 

Juha Nieminen

Leigh Johnston said:
Not calling destructors means you can not make use of RAII (one of the
most important C++ features).

I don't understand why using a GC engine in a C++ program would disable
RAII. RAII is related to scope-bound lifetime of local objects. It's not
related to the lifetime of dynamically allocated objects (which must be
managed manually or, in this case, via a GC engine).
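
To make the scope-bound part concrete, here is a small sketch of RAII on a local object: the release step runs when the scope is left, whether normally or through an exception, and no collector is involved. `ScopeGuard` and `guarded_work` are names invented for this illustration.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Records "acquire" on construction and "release" on destruction, standing
// in for any resource (file, lock, socket) tied to a local object's scope.
struct ScopeGuard {
    std::vector<std::string>& log;
    explicit ScopeGuard(std::vector<std::string>& l) : log(l) {
        log.push_back("acquire");
    }
    ~ScopeGuard() { log.push_back("release"); }
    ScopeGuard(const ScopeGuard&) = delete;
    ScopeGuard& operator=(const ScopeGuard&) = delete;
};

// Runs some "work" under a guard and returns the order of events.
inline std::vector<std::string> guarded_work(bool fail) {
    std::vector<std::string> log;
    try {
        ScopeGuard guard(log);
        if (fail) {
            throw std::runtime_error("work failed");
        }
        log.push_back("work");
    } catch (const std::runtime_error&) {
        // The guard has already been destroyed during stack unwinding.
        log.push_back("caught");
    }
    return log;
}
```

`guarded_work(false)` yields acquire, work, release; `guarded_work(true)` yields acquire, release, caught. In both cases the release point is fixed by the scope, which is independent of how dynamically allocated objects are reclaimed.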
 

Bart van Ingen Schenau

[snip] In
other cases, shared_ptr doesn't work, and garbage collection
does.  And there are cases where shared_ptr works, and garbage
collection doesn't.

Can you give examples of situations where one works but the other not?

Bart v Ingen Schenau
 

Juha Nieminen

Bart van Ingen Schenau said:
[snip] In
other cases, shared_ptr doesn't work, and garbage collection
does.  And there are cases where shared_ptr works, and garbage
collection doesn't.

Can you give examples of situations where one works but the other not?

The obvious case where shared_ptr won't work properly is with recursive
references (object X contains a shared_ptr to object Y, which contains a
shared_ptr to object X; and in fact, an even simpler case is an object
containing a shared_ptr pointing to itself, although that's slightly less
likely, but not impossible, to happen by accident). This introduces a
leak (the objects are never deleted), and GC engines do not suffer from
this.
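
The cyclic case can be reproduced with the standard library alone. The sketch below counts live nodes to show that a two-node `std::shared_ptr` cycle survives the loss of all external references, and that the usual manual workaround, making one direction a `std::weak_ptr`, avoids it. The type and function names are invented for this illustration.

```cpp
#include <memory>

// Number of node objects currently alive (both node types share it).
static int alive = 0;

struct StrongNode {
    std::shared_ptr<StrongNode> other;  // owning link: can form a cycle
    StrongNode() { ++alive; }
    ~StrongNode() { --alive; }
};

struct WeakNode {
    std::weak_ptr<WeakNode> other;      // non-owning link: no cycle
    WeakNode() { ++alive; }
    ~WeakNode() { --alive; }
};

// Builds X <-> Y with owning links, drops all external references and
// returns how many nodes are still alive afterwards.
inline int survivors_with_strong_links() {
    const int before = alive;
    std::weak_ptr<StrongNode> watch;    // observer, does not keep X alive
    {
        auto x = std::make_shared<StrongNode>();
        auto y = std::make_shared<StrongNode>();
        x->other = y;
        y->other = x;
        watch = x;
    }                                   // x and y go out of scope here
    const int survivors = alive - before;

    // Break the cycle by hand so the demonstration itself does not leak.
    if (auto x = watch.lock()) {
        x->other.reset();
    }
    return survivors;
}

// Same shape, but with non-owning links between the nodes.
inline int survivors_with_weak_links() {
    const int before = alive;
    {
        auto x = std::make_shared<WeakNode>();
        auto y = std::make_shared<WeakNode>();
        x->other = y;
        y->other = x;
    }
    return alive - before;
}
```

`survivors_with_strong_links()` reports 2 (both nodes outlive every external reference), while `survivors_with_weak_links()` reports 0. A tracing collector does not need this distinction, because it reclaims whatever is unreachable regardless of reference counts.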

Reference-counting smart pointers can also cause objects to be deleted
too soon, while they are still being used, which is a much less known
fact (most programmers have never encountered or heard of such a thing).
It also requires a kind of recursion, but in this case it's not a
recursion of pointers, but a recursion of code. Basically: Module X
has a shared_ptr to object Y, X calls a method of Y, this method calls
back to X, which causes it to drop the shared_ptr, deleting Y. The
execution goes back to the method in Y, which now operates on a deleted
object. Yes, this can happen in actual practical examples, and it's also
something that is not a problem with GC.
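
The call-back scenario can be sketched as follows. To keep the example free of undefined behaviour, it only runs the defensive variant: the caller pins the object with a local `std::shared_ptr` copy for the duration of the call, so the object survives even though its owner drops the owning pointer halfway through. `Worker`, `Owner` and `run_pinned` are names invented for this illustration.

```cpp
#include <functional>
#include <memory>
#include <string>
#include <vector>

// Plays the role of "object Y": its method calls back into its owner.
struct Worker {
    std::vector<std::string>& log;
    std::function<void()> on_progress;  // callback into the owner

    explicit Worker(std::vector<std::string>& l) : log(l) {}
    ~Worker() { log.push_back("worker destroyed"); }

    void run() {
        log.push_back("run begins");
        if (on_progress) {
            on_progress();              // the owner may drop its pointer here
        }
        log.push_back("run ends");      // uses *this: object must still exist
    }
};

// Plays the role of "module X": holds the only long-lived shared_ptr.
struct Owner {
    std::vector<std::string> log;
    std::shared_ptr<Worker> worker;

    Owner() : worker(std::make_shared<Worker>(log)) {
        worker->on_progress = [this] {
            worker.reset();             // drop the owning pointer mid-call
            log.push_back("owner dropped worker");
        };
    }
    Owner(const Owner&) = delete;
    Owner& operator=(const Owner&) = delete;

    // Calling worker->run() directly would destroy the Worker inside the
    // callback while run() is still executing. Pinning it with a local
    // copy delays destruction until the call has returned.
    void run_pinned() {
        std::shared_ptr<Worker> keep_alive = worker;
        keep_alive->run();
    }
};
```

After `run_pinned()`, the owner's log reads: run begins, owner dropped worker, run ends, worker destroyed. Without the local copy, "worker destroyed" would happen inside the callback and the rest of `run()` would operate on a deleted object, which is the premature deletion described above.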

I don't know, however, what the cases are where smart pointers work
but GC doesn't...
 
