Garbage collection in C++


Chris M. Thomasson

Juha Nieminen said:
Not more problematic than leaving the file handle open.

How do you know that the dtor of the file wrapper object actually closed the
file? The system-specific file close function can fail; think along the
lines of POSIX 'fclose()'.
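
For instance, a minimal RAII sketch (my illustration, not from the post) of
why a destructor cannot report the failure:

#include <cstdio>

// Minimal file wrapper: the destructor closes the file, but it has no
// caller to return an error to if fclose() fails (e.g. when flushing
// buffered data to a full disk).
class FileWrapper {
    std::FILE* f_;
public:
    explicit FileWrapper(const char* path) : f_(std::fopen(path, "w")) {}
    ~FileWrapper() {
        if (f_ && std::fclose(f_) == EOF) {
            // The failure is silently swallowed here.
        }
    }
};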



(Besides, the file handle was just an *example*.)

I was nit-picking; sorry about that.

:^/
 

Juha Nieminen

Hans said:
Are you saying that people who can't make proper programs using garbage
collection will make good programs without GC?

No. I'm saying that people who are fluent in programming might be more
tempted to write imperative code rather than modular code when they have
a GC engine available. After all, the imperative code usually requires
less writing (at least for the first version of the program).
 

Matthias Buelow

Juha said:
No. I'm saying that people who are fluent in programming might be more
tempted to write imperative code rather than modular code when they have
a GC engine available. After all, the imperative code usually requires
less writing (at least for the first version of the program).

I really don't understand what you mean by this.

Imperative code is:

foobie = bletch;
blorgh = knork;

Modular imperative code is:

Gront::foobie = Blobbr::bletch;
Krawall::blorgh = Wibbl::knork;

assuming there are modules Gront, Blobbr, Krawall and Wibbl, which
contain these entities, and :: denotes module resolution.

Manual memory management is imperative. In contrast, automatic memory
management is practically a necessity in non-imperative languages.
Functional languages like ML, Haskell etc. all have automatic memory
management. Lisp has always had GC. It would be totally impractical
to attempt to manage memory manually in these high-level languages.
 

Juha Nieminen

Stefan said:
»There were two versions of it, one in Lisp and one in
C++. The display subsystem of the Lisp version was faster.
There were various reasons, but an important one was GC:
the C++ code copied a lot of buffers because they got
passed around in fairly complex ways, so it could be quite
difficult to know when one could be deallocated. To avoid
that problem, the C++ programmers just copied. The Lisp
was GCed, so the Lisp programmers never had to worry about
it; they just passed the buffers around, which reduced
both memory use and CPU cycles spent copying.«

Yes, it's completely "fair" to compare a C++ program which doesn't
understand the concept of a smart pointer to Lisp, where "smart
pointers" (in the sense that memory is garbage-collected) are inherent.

Yes: It requires more work to make that work in C++ with smart
pointers than the same thing in Lisp (although making a copy-on-write
version of std::vector is not all that hard). However, that doesn't mean
that the C++ version cannot be as efficient as the Lisp version. It only
means the C++ programmers were incompetent.
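
For illustration, here is a minimal copy-on-write wrapper along those lines
(my sketch, assuming std::shared_ptr from TR1/Boost is available;
single-threaded, since use_count() is not a reliable test under concurrent
access):

#include <cstddef>
#include <memory>
#include <vector>

// Copies share one buffer; mutation clones the buffer only when shared.
template <typename T>
class CowVector {
    std::shared_ptr<std::vector<T> > data_;

    void detach() {
        if (data_.use_count() > 1)
            data_.reset(new std::vector<T>(*data_));
    }
public:
    CowVector() : data_(new std::vector<T>) {}

    const T& operator[](std::size_t i) const { return (*data_)[i]; }
    std::size_t size() const { return data_->size(); }

    void push_back(const T& v) { detach(); data_->push_back(v); }
    void set(std::size_t i, const T& v) { detach(); (*data_)[i] = v; }
};

Passing a CowVector around costs one shared_ptr copy instead of a full
buffer copy, which is the saving the Lisp programmers got for free.
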
»A lot of us thought in the 1990s that the big battle would
be between procedural and object oriented programming, and
we thought that object oriented programming would provide
a big boost in programmer productivity. I thought that,
too. Some people still think that. It turns out we were
wrong. Object oriented programming is handy dandy, but
it's not really the productivity booster that was
promised. The real significant productivity advance we've
had in programming has been from languages which manage
memory for you automatically.«

I still wouldn't trade modules (the most crucial part of OOP) for GC,
if I had to make an exclusive choice.

»[A]llocation in modern JVMs is far faster than the best
performing malloc implementations. The common code path
for new Object() in HotSpot 1.4.2 and later is
approximately 10 machine instructions (data provided by
Sun; see Resources), whereas the best performing malloc
implementations in C require on average between 60 and 100
instructions per call (Detlefs, et al.; see Resources).«

I like how this misses one of the main reasons why memory allocation
can be slow: Cache behavior.

From experience I would estimate that the instructions executed when
allocating/deallocating account for maybe 10-20% of the total cost, and
the remaining 80-90% depends on how cache-friendly the allocation
system is. Cache misses are enormously expensive.

Good GC engines do indeed have the advantage that they can defragment
memory and make allocations more cache-friendly. This is harder (but not
impossible) to do in C/C++.
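
One common C++ technique for that is an arena (bump-pointer) allocator,
which keeps related allocations contiguous. A minimal sketch of my own,
not from the post (align must be a power of two):

#include <cstddef>
#include <vector>

// Bump-pointer arena: objects allocated together stay adjacent in
// memory, giving the cache locality discussed above without a GC.
class Arena {
    std::vector<char> buf_;
    std::size_t used_;
public:
    explicit Arena(std::size_t bytes) : buf_(bytes), used_(0) {}

    void* allocate(std::size_t n, std::size_t align = sizeof(void*)) {
        std::size_t p = (used_ + align - 1) & ~(align - 1);
        if (p + n > buf_.size()) return 0;  // out of space
        used_ = p + n;
        return &buf_[0] + p;
    }

    // Everything is released at once; no per-object bookkeeping.
    void reset() { used_ = 0; }
};
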
»Perhaps the most important realisation I had while developing
this critique is that high level languages are more important
to programming than object-orientation. That is, languages
which have the attribute that they remove the burden of
bookkeeping from the programmer to enhance maintainability and
flexibility are more significant than languages which just
add object-oriented features. While C++ adds object-orientation
to C, it fails in the more important attribute of being high
level. This greatly diminishes any benefits of the
object-oriented paradigm.«

On the other hand many "high-level languages" offer abstractions at
the expense of memory usage efficiency. There are "high-level languages"
where it's prohibitively difficult to create very memory-efficient
programs (which handle enormous amounts of data) while still maintaining
a fair amount of abstraction and maintainability.

This seems to have been the trend during the past 10-20 years:
Completely disregard memory usage efficiency in favor of higher level
abstractions. After all, everybody has a supercomputer on their desk and
everything they do with it is play tic-tac-toe. You don't need memory
usage efficiency, right?

C++ might not be all fancy-pancy with all the latest fads in
programming, but at least it offers the tools to write very
memory-efficient programs which are still very well designed, with a
high degree of modularity, abstraction and maintainability. That cannot
be said about all programming languages.
 

Juha Nieminen

James said:
Sorry, but it's a statement of fact.

It might be a fact in *your* case. I disagree that it's a fact for
everybody.

I suppose we'd have to just agree to disagree.
 

Matthias Buelow

Juha said:
abstractions. After all, everybody has a supercomputer on their desk and
everything they do with it is play tic-tac-toe. You don't need memory
usage efficiency, right?

And what is most of the desktop software written in?
C++ might not be all fancy-pancy with all the latest fads in
programming, but at least it offers the tools to write very
memory-efficient programs which are still very well designed, with a
high degree of modularity, abstraction and maintainability.

However, it seems to be exceedingly difficult to actually do so;
otherwise, why are most of today's applications so slow, memory hogging
and buggy?
 

Juha Nieminen

Matthias said:
Isn't that a good thing?

No, because it makes the code less abstract and less modular.
The best code is the one you don't have to
write. Down with bureaucracy. Creating objects just to keep track of
memory dependencies is... inane?

Then don't create objects whose sole purpose is to keep track of
memory dependencies. Create objects which have a purpose and a role in
the entire design.

This might not be the best possible example of what I'm talking about,
but I think it's at least close:

Don't do: SmartArray<char> myString = new char[100];

Instead, do: String myString(100);

The "String" is effectively performing the same bookkeeping job as the
"SmartArray<char>" would, but it's already at a much higher conceptual
and abstraction level, and thus better from a design point of view.

Garbage collection often tempts one to write code like the first example,
rather than code like the second example.
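
For concreteness, here is a minimal sketch of what such a String type
might look like (my illustration, not code from the thread):

#include <cstddef>
#include <vector>

// The same bookkeeping a SmartArray<char> would do, but behind a type
// that has an actual role in the design.
class String {
    std::vector<char> chars_;
public:
    explicit String(std::size_t n) : chars_(n, '\0') {}
    std::size_t size() const { return chars_.size(); }
    char& operator[](std::size_t i) { return chars_[i]; }
};
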
Imperative programming and modular programming are orthogonal concepts.

An imperative programming style can be detrimental if it outweighs the
modularity of the overall design.
 

Chris M. Thomasson

Matthias Buelow said:
And what is most of the desktop software written in?




However, it seems to be exceedingly difficult to actually do so;
otherwise, why are most of today's applications so slow, memory hogging
and buggy?

Because bloat-ware is cool! ;^D



Why do you think that most of Google is written in C++?
 

Juha Nieminen

Matthias said:
However, it seems to be exceedingly difficult to actually do so;
otherwise, why are most of today's applications so slow, memory hogging
and buggy?

Because the majority of programmers hired out there are incompetent?

A different programming language is not going to help that problem.
 

Juha Nieminen

Chris said:
»[A]llocation in modern JVMs is far faster than the best
performing malloc implementations.
[...]

This is hardcore MAJOR BULLSHI%! Whoever wrote that crap is __TOTALLY__
ignorant; WOW!

Have you actually measured for yourself, or are you basing your
opinion on assumptions?
 

Juha Nieminen

Matthias said:
I really don't understand what you mean by this.

char* myString = new char[100];

vs.

std::string myString(100, ' ');

(And don't nitpick on the details of the example. It's just an
example. Try to understand my *point*.)

If we had a GC engine, we might be tempted to write code like the first
line above because, after all, we don't have to care about the
management of that memory. Of course the first line is extremely
non-abstract: It hard-codes the type, size and geometry of the array,
and exposes that it's accessed through a char pointer.

The second line is the more modular approach: It has encapsulated the
details inside the object, rather than them being exposed. It doesn't
hard-code the array type, it doesn't even hard-code that it *is* a
(contiguous) array, it doesn't hard-code how the array is accessed or
how it's stored in memory, and it doesn't necessarily make the array
fixed in size.

However, if you didn't have a string class like that already handed
to you by a library, the lazy approach would be to use the first method
rather than doing the work of actually encapsulating the whole concept
of "string" into a class.

I feel that a GC engine just gives the incentive to do exactly that:
Skip the pesky encapsulation and do it in the imperative way. It's
less writing in the short run.
 

Matthias Buelow

Juha said:
Don't do: SmartArray<char> myString = new char[100];
Instead, do: String myString(100);

Garbage collection often tempts one to write code like the first example,
rather than code like the second example.

I don't see any truth to this statement at all.
An imperative programming style can be detrimental if it outweighs the
modularity of the overall design.

This sentence doesn't make any sense either. Are you using a gibberish
generator?
 

Matthias Buelow

Juha said:
char* myString = new char[100];
vs.
std::string myString(100, ' ');
If we had a GC engine, we might be tempted to write code like the first
line above because, after all, we don't have to care about the
management of that memory.

Maybe you would, I wouldn't.
The second line is the more modular approach: It has encapsulated the
details inside the object, rather than them being exposed. It doesn't
hard-code the array type, it doesn't even hard-code that it *is* a
(contiguous) array, it doesn't hard-code how the array is accessed or
how it's stored in memory, and it doesn't necessarily make the array
fixed in size.

And what's that got to do with GC vs. manual memory management?
However, if you didn't have a string class like that already handed
to you by a library, the lazy approach would be to use the first method
rather than doing the work of actually encapsulating the whole concept
of "string" into a class.

I feel that a GC engine just gives the incentive to do exactly that:
Skip the pesky encapsulation and do it in the imperative way. It's
less writing in the short run.

So you're saying the whole incentive for you to abstract is memory handling.
I'm sorry you don't see the more obvious advantages of abstraction.
Automatic memory management is an abstraction in the same way that using
a string datatype over a char array is: it is relieving the programmer
of uninteresting implementation details.
 

Juha Nieminen

Matthias said:
Juha said:
Don't do: SmartArray<char> myString = new char[100];
Instead, do: String myString(100);

Garbage collection often tempts one to write code like the first example,
rather than code like the second example.

I don't see any truth to this statement at all.

Then don't.
This sentence doesn't make any sense either. Are you using a gibberish
generator?

If you don't want to understand, then so be it.
 

James Kanze

Memory is not the only resource which needs managing.

And? Garbage collection is only concerned with memory. Other
resources need other tools.
(And even in some cases what you are managing *is* memory, but
not memory which a GC can collect. Eg. the "memory" being
managed might be, for example, an mmapped file or a bitmap in
the display hardware.)

Not sure I follow. There are certainly cases where what
superficially looks like memory isn't memory in the garbage
collector sense; in those cases, you use some other tool.
Also sometimes even when managing pure memory, you might still
want to do something special before deallocating. (Weak
pointers are a very good example.)

Well, most garbage collectors support finalization (which is NOT
the same thing as a destructor); I'll admit, though, that I've
never really found a use for it.
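
As an aside, the weak-pointer case mentioned above is easy to illustrate
with std::weak_ptr (a C++11/TR1-era sketch of mine, not code from the
thread): the cache entry below does not keep the object alive, and lock()
observes whether it has already been reclaimed:

#include <memory>

std::weak_ptr<int> g_cache;  // non-owning: does not prevent reclamation

std::shared_ptr<int> get_value() {
    if (std::shared_ptr<int> sp = g_cache.lock())
        return sp;                         // still alive, reuse it
    std::shared_ptr<int> sp(new int(42));
    g_cache = sp;                          // remember without owning
    return sp;
}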
 

Chris M. Thomasson

Juha Nieminen said:
Chris said:
»[A]llocation in modern JVMs is far faster than the best
performing malloc implementations.
[...]

This is hardcore MAJOR BULLSHI%! Whoever wrote that crap is __TOTALLY__
ignorant; WOW!

Have you actually measured for yourself, or are you basing your
opinion on assumptions?

I have measured for myself. I have created multi-threaded allocators that
are just as fast as, or faster than, recent JVMs.
 

Chris M. Thomasson

Matthias Buelow said:
Very good; I would think that this can (and is being) implemented for
automatic memory management, too.
Absolutely!




In the trivial case (no inter-thread
dependencies), it is obvious, and in other cases, you still have the
same synchronization problem with manual management, only it might've
moved from the alloc/dealloc routines into the application logic.

The internal allocator synchronization is separate from the application
synchronization. So, technically, it was not moved from the alloc/dealloc
routines to the application. The application always needs to ensure that the
memory it "explicitly" frees is in a persistent quiescent state. GC can
remove the need to follow this requirement.

This is why GC can make the creation of non-blocking algorithms easier.
However, it has the side-effect of making your algorithm totally GC
dependent. You cannot really port it to a platform that does not have a GC.
Luckily, there are other algorithms out there that can make non-blocking
algorithms GC independent.
 

Chris M. Thomasson

How can GC improve the performance of a multithreaded application? Can
you show a convincing example?

What about removing the need for atomic reference counting and/or expensive
memory barriers? Of course you don't need a heavy full-blown GC to achieve
that. One can instead make use of an efficient PDR algorithm; PDR stands for
`Partial Copy-On-Write Deferred Reclamation'. The term was coined by Joe
Seigh over on `comp.programming.threads':

http://groups.google.com/group/comp.programming.threads/browse_frm/thread/82376b7295e6ce1a


If you wish to learn more about the technique, please ask on that newsgroup.
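
To make the hazard concrete, here is a sketch of mine (using C++11 atomics
for brevity; not code from the thread) of the race that GC or a PDR scheme
removes:

#include <atomic>

struct Node { int value; };

std::atomic<Node*> g_head(0);

// Reader: under GC or deferred reclamation, the loaded node cannot be
// recycled while this thread still holds the raw pointer.
int read_value() {
    Node* n = g_head.load(std::memory_order_acquire);
    return n ? n->value : 0;
}

// Writer: with manual management, 'delete old' may run while a reader
// is still dereferencing the old node. That use-after-free is exactly
// what GC, or a PDR-style deferred-reclamation scheme, prevents.
void replace_head(Node* fresh) {
    Node* old = g_head.exchange(fresh, std::memory_order_acq_rel);
    delete old;  // unsafe without GC/deferred reclamation
}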
 
