Do you use a garbage collector?


Juha Nieminen

Razii said:
That is not the topic. The topic is how the keyword "new" behaves.

You just don't get it, do you? Your original claim was that "new" in
C++ is slow "because each new calls the OS". That's just false. Each
"new" does *not* call the OS: a typical allocator requests large blocks
from the OS only occasionally, and serves individual allocations from
those blocks entirely in user space.
 

Stefan Ram

Juha Nieminen said:
I don't see how this is so much different from what Java does.

»[A]llocation in modern JVMs is far faster than the best
performing malloc implementations. The common code path
for new Object() in HotSpot 1.4.2 and later is
approximately 10 machine instructions (data provided by
Sun; see Resources), whereas the best performing malloc
implementations in C require on average between 60 and 100
instructions per call (Detlefs, et. al.; see Resources).
And allocation performance is not a trivial component of
overall performance -- benchmarks show that many
real-world C and C++ programs, such as Perl and
Ghostscript, spend 20 to 30 percent of their total
execution time in malloc and free -- far more than the
allocation and garbage collection overhead of a healthy
Java application (Zorn; see Resources).«

http://www-128.ibm.com/developerworks/java/library/j-jtp09275.html?ca=dgr-jw22JavaUrbanLegends
 

James Kanze

Juha Nieminen said:
My point is that the style of C++ programming which produces
safe code tends to also produce good code in other respects as
well. Thus there's a positive side-effect to not having GC in
this case.

That's a non sequitur. The style of Java programming which
produces safe code also tends to produce good code in other
respects as well. Globally, it's probably more difficult to
produce safe code in Java than in C++, but this is despite
garbage collection, not because of it. (In really safe code,
for example, public functions are almost never virtual, unless
the language has some sort of built-in support for programming
by contract (PbC), a la Eiffel. And of course, safe code is
best served by strict type checking---none of this "everything
is an Object" business.)
 

Juha Nieminen

James said:
You seem to be misunderstanding the argument. There are
specific times when garbage collection might be chosen for
performance reasons, but they are fairly rare, and as you say,
you can also optimize the performance of manual schemes. The
main argument for garbage collection is greater programmer
efficiency.

I understood the argument, and I already said that I don't feel I'm
programming inefficiently. The style of safe modular C++ programming
tends to produce clean designs. In many cases this is *not* at the cost
of increased development time.
In fact, if you can reuse your previously created modular code, the
development time may even decrease considerably. In my experience Java
tends to lead to "reckless programming" ("why should I encapsulate when
there's no need? the GC is handling everything"), which often doesn't
produce reusable code, so in the long run it may even be
counter-productive with respect to development times.
 

Pascal J. Bourguignon

James Kanze said:
That's a non sequitur. The style of Java programming which
produces safe code also tends to produce good code in other
respects as well. Globally, it's probably more difficult to
produce safe code in Java than in C++, but this is despite
garbage collection, not because of it. (In really safe code,
for example, public functions are almost never virtual, unless
the language has some sort of built-in support for programming
by contract (PbC), a la Eiffel. And of course, safe code is
best served by strict type checking---none of this "everything
is an Object" business.)

Do you mean that in Java, once you've defined a method zorglub on a
class A, since everything is an object, you can send the message
zorglub to an instance of a class B that is not a subclass of A?
 

James Kanze

»[A]llocation in modern JVMs is far faster than the best
performing malloc implementations. The common code path
for new Object() in HotSpot 1.4.2 and later is
approximately 10 machine instructions (data provided by
Sun; see Resources), whereas the best performing malloc
implementations in C require on average between 60 and 100
instructions per call (Detlefs, et. al.; see Resources).
And allocation performance is not a trivial component of
overall performance -- benchmarks show that many
real-world C and C++ programs, such as Perl and
Ghostscript, spend 20 to 30 percent of their total
execution time in malloc and free -- far more than the
allocation and garbage collection overhead of a healthy
Java application (Zorn; see Resources).«

Just for the record, this is typical marketing-speak.

First, of course, it's well known that an allocation in a
compacting garbage collector can be significantly faster than in
any non-compacting scheme (for variable length allocations).
It's also well known that the actual garbage collection sweep in
a compacting collector is more expensive than in a
non-compacting one. You don't get something for nothing.
Whether the trade-off is advantageous depends on the application.
(I suspect that there are a lot of applications where the
trade-off does favor compacting, but there are certainly others
where it doesn't, and only talking about the number of
machine instructions in allocation, without mentioning other
costs, is disingenuous, at best.)
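
To make the trade-off concrete, here is a minimal sketch (my
illustration, not from the thread) of the two allocation styles. After
a compacting collection, free memory is one contiguous region, so
allocation is a pointer bump; a non-compacting allocator must instead
search a free list, which is where the extra instructions per call go:

#include <cstddef>

// Compacted heap: allocation is just a pointer bump.
static char* bumpPtr;
void* allocCompacted(std::size_t n) {
    void* p = bumpPtr;        // (heap-limit check and GC trigger omitted)
    bumpPtr += n;
    return p;
}

// Non-compacting heap: free memory is fragmented, so we walk a list
// of free blocks looking for one big enough.
struct FreeBlock { std::size_t size; FreeBlock* next; };
static FreeBlock* freeList;
void* allocFreeList(std::size_t n) {
    for (FreeBlock** b = &freeList; *b; b = &(*b)->next) {
        if ((*b)->size >= n) {
            FreeBlock* hit = *b;
            *b = hit->next;   // unlink the block (splitting omitted)
            return hit;
        }
    }
    return 0;                 // no fit: the heap would have to grow
}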

Secondly, I've written C++ programs which spend 0% of their
total execution time in malloc and free, and I've had to fix one
which spent well over 99.9% of its time there. With such
variance, "average" looses all meaning. And the choice of Perl
and Ghostscript as "typical" C++ programs is somewhat
disingenuous as well; both implement interpreters for languages
which use garbage collection, so almost by definition, both
would benefit from garbage collection. (In fact, both probably
implement it somehow internally, so the percentage of time spent
in malloc/free is that of a garbage collected program.)
 

James Kanze

You are incorrect. Each call into "new" will "most-likely" be
"fast-pathed" into a "local" cache.

Are you sure about the "most-likely" part? From what little
I've seen, most C runtime libraries simply threw in a few locks
to make their 30- or 40-year-old malloc thread-safe, and didn't
bother with more. Since most of the marketing benchmarks don't
use malloc, there's no point in optimizing it. I suspect that
you're describing best practice, and not most likely. (But I'd
be pleased to learn differently.)
 

Matthias Buelow

Juha said:
I understood the argument, and I already said that I don't feel I'm
programming inefficiently. The style of safe modular C++ programming
tends to produce clean designs.

"The style of safe modular XYZ programming tends to produce clean
designs". Valid for any language. The thing is, even if you(r
programmers) have the time and expertise to produce such a shiny clean
design, it won't stay that way. You have to plan for the worst case,
which is, your program will degenerate into a nasty mess, which is
usually the case, over time.
In many cases this is *not* at the cost of increased development time.

If you use a relatively low-level language such as C++, this certainly
_does_ affect your productivity. I can't understand where you get the
idea that it doesn't. A more expressive language, preferably one
designed for or adaptable to the problem domain, will in many cases
produce a dramatic improvement in productivity over the kind of manual
stone-breaking one is doing in C++. High-level languages and
domain-specific languages are usually implemented with automatic memory
management because of the more abstract programming models being used.
You can design and write the cleanest, shiniest modular C++ code; some
mediocre programmer using a language that is more expressive for the
task at hand will beat you hands down. Maybe the reason why, as you
allege, some kind of "strict discipline" yields better results in C++
is that C++ gets used for tasks the language isn't well suited to, and
one simply needs this discipline, because otherwise one cannot get
useful results.
 

lbonafide

And your point is? The simple fact is that bad programmers write bad
code, and good programmers write good code, no matter what the language
is or whether memory is managed automatically or manually.

But when Sun's marketing tells bad programmers they don't have to
worry about memory anymore, it's a disaster. They do have to worry,
just in a different way.
 

James Kanze

I understood the argument, and I already said that I don't
feel I'm programming inefficiently. The style of safe modular
C++ programming tends to produce clean designs. In many cases
this is *not* at the cost of increased development time.

But how is this relevant to garbage collection? Safe, modular
programming will reduce development times in all cases.
In fact, if you can reuse your previously created modular
code, the development time may even decrease considerably.

Certainly. The large standard library is a definite plus for
Java.
In my experience Java tends to lead to "reckless programming"
("why should I encapsulate when there's no need? the GC is
handling everything"), which often doesn't produce reusable
code, so in the long run it may even be counter-productive
with respect to development times.

What does encapsulation have to do with garbage collection?
That's what I don't understand. Encapsulation is encapsulation.
It's a must, period. If I were to judge solely by the code
I've actually had to work on, I'd have to say that encapsulation
was better in Java. In fact, I know that you can do even better
in C++, despite the way some people program in C++. And I know
that the one Java project I worked on was an exception (in
general, not just as a Java project) in that it was well
managed, and that encapsulation was an important issue up front,
and that some Java programmers tend to ignore it.

If there's a difference, I'd say that it is more related to the
size of the projects in each language. In small projects,
there's less organization, and more temptation to just ignore
the rules. And for reasons which have nothing to do with
garbage collection, Java doesn't scale well to large projects,
and tends to be used for small, short-lived projects. But that
has nothing to do with garbage collection.
 

Matthias Buelow

But when Sun's marketing tells bad programmers they don't have to
worry about memory anymore, it's a disaster. They do have to worry,
just in a different way.

Well... "marketing"... "bad programmers"...

Perhaps some of the more simple-minded specimens of "decision makers"
believe that Sun has solved the "memory problem". Maybe some of the less
experienced programmers believe that, too. However, that's not a problem
with automatic memory management or indeed any other technical issue.
You can't even blame Sun's marketing; it's their job, more or less.
 

Stefan Ram

Juha Nieminen said:
My point is that the style of C++ programming which produces safe code
tends to also produce good code in other respects as well. Thus there's
a positive side-effect to not having GC in this case.

»Your essay made me remember an interesting phenomenon I
saw in one system I worked on. There were two versions of
it, one in Lisp and one in C++. The display subsystem of
the Lisp version was faster. There were various reasons,
but an important one was GC: the C++ code copied a lot of
buffers because they got passed around in fairly complex
ways, so it could be quite difficult to know when one
could be deallocated. To avoid that problem, the C++
programmers just copied. The Lisp was GCed, so the Lisp
programmers never had to worry about it; they just passed
the buffers around, which reduced both memory use and CPU
cycles spent copying.«

<[email protected]>

A lot of us thought in the 1990s that the big battle would
be between procedural and object oriented programming, and
we thought that object oriented programming would provide
a big boost in programmer productivity. I thought that,
too. Some people still think that. It turns out we were
wrong. Object oriented programming is handy dandy, but
it's not really the productivity booster that was
promised. The real significant productivity advance we've
had in programming has been from languages which manage
memory for you automatically.

http://www.joelonsoftware.com/articles/APIWar.html
 

James Kanze

"The style of safe modular XYZ programming tends to produce clean
designs". Valid for any language. The thing is, even if you(r
programmers) have the time and expertise to produce such a shiny clean
design, it won't stay that way. You have to plan for the worst case,
which is, your program will degenerate into a nasty mess, which is
usually the case, over time.

That's a self-fulfilling prophecy. The moment you plan to
produce a nasty mess, you will. What you should do is organize
the projects so that they don't degenerate into a nasty mess.
I've worked on one or two C++ projects where the quality of the
code actually improved with each iteration.
If you use a relatively low-level language such as C++, this
certainly _does_ affect your productivity.

If you insist on using the low-level features where they aren't
appropriate, this will definitely have a negative effect on your
productivity. But so will programming in Java without using the
standard library components (except those in java.lang). Modern
languages tend to be more or less modular, with more and more
abstraction pushed off into the library. C++ is an extreme case
in this regard.
I can't understand where you get the idea that it doesn't. A
more expressive language, preferably one designed for or
adaptable to the problem domain, will in many cases produce a
dramatic improvement in productivity over the kind of manual
stone-breaking one is doing in C++.

Unless you're writing low level libraries, you're not doing
manual stone-breaking in C++. The actual coding I did in Java
was much lower level than what I currently do in C++ (but I've
also done very low level coding in C++).
High-level languages and domain-specific languages are usually
implemented with automatic memory management because of the
more abstract programming models being used.

Modern languages generally provide automatic memory management,
because it is something that can be automated efficiently. It
helps, but it's not a silver bullet. And of course, some of us
regularly use automatic memory management in C++. Because C++
has value semantics, it's not as essential as in Java, but it
helps.
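
A minimal sketch of what that looks like in practice (my example, not
James's code; std::shared_ptr here stands in for the Boost/TR1
reference-counted pointers of the time). Value semantics means most
objects need no explicit memory management at all, and smart pointers
automate the remaining shared-ownership cases:

#include <memory>
#include <string>
#include <vector>

struct Person {
    std::string name;          // value members clean themselves up
    std::vector<int> scores;
};

void example() {
    Person p;                  // a plain value: destroyed automatically
    p.name = "Ada";
    std::shared_ptr<Person> q = std::make_shared<Person>(p);  // reference counted
}                              // p and *q both reclaimed here; no delete anywhere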
You can design and write the cleanest, shiniest modular C++
code, some mediocre programmer using a language that is more
expressive for the task at hand will beat you hands down.

Care to bet? Say, write some nice, simple-to-use components
which use complex arithmetic. (In this case, of course, the
difference isn't garbage collection, but operator overloading.
A necessity in any modern language, but missing in Java, or at
least, it was missing when I used Java.)
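
For instance, a minimal sketch of such a component boundary, using the
standard library's complex type (my example, not from the thread).
Operator overloading lets the arithmetic read as it does on paper:

#include <complex>
#include <iostream>

int main() {
    std::complex<double> a(1.0, 2.0);
    std::complex<double> b(3.0, -1.0);
    std::complex<double> c = a * b + a / b;   // reads like built-in arithmetic
    std::cout << c << '\n';
    // Without operator overloading (as in the Java of that era), this
    // becomes something like: Complex c = a.multiply(b).add(a.divide(b));
}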
Maybe the reason why, as you allege, some kind of "strict
discipline" yields better results in C++ is that C++ gets used
for tasks the language isn't well suited to, and one simply
needs this discipline, because otherwise one cannot get useful
results.

In general, a strict discipline is necessary in all programming.
I suspect that Juha's argument is that without this strict
discipline, a C++ program will typically crash (which generally
doesn't go unnoticed), whereas a Java program will just give
wrong results. I disagree with his conclusions for several
reasons: most significantly, because with garbage collection in
C++, you can better guarantee the crashes, but also because I
don't see why good programmers should be deprived of a useful
tool simply because there are so many bad programmers around.
But the argument that strict discipline is good can't be
disputed.
 

Ian Collins

Razii said:
How about mobile and embedded devices that don't have sophisticated
memory management?

It's not uncommon for such devices to use a static design, so there's no
chance of a leak.
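
A minimal sketch of such a static design (my illustration, not Ian's
code; the Message type and pool size are invented). All storage is
reserved at build time, so there is nothing to leak:

#include <cstddef>

struct Message { int id; char payload[32]; };

static Message pool[16];   // every Message the system will ever use
static bool inUse[16];

Message* acquire() {
    for (std::size_t i = 0; i < 16; ++i)
        if (!inUse[i]) { inUse[i] = true; return &pool[i]; }
    return 0;              // pool exhausted: a handled condition, not a leak
}

void release(Message* m) {
    inUse[m - pool] = false;
}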
If a C++ application is leaking memory, the memory
might never be returned even after the application is terminated.

You'd be surprised how much effort goes into memory management on
embedded devices. Memory is a more precious commodity in the embedded
world.
 

Patricia Shanahan

Razii said:
#include <ctime>
#include <iostream>
using namespace std;

struct Test { int i; Test(int i) : i(i) {} }; // definition assumed; not shown in the post

int main() {
    clock_t start = clock();
    for (int i = 0; i <= 10000000; i++) {
        Test *test = new Test(i);   // allocated, never deleted
        if (i % 5000000 == 0)
            cout << test;
    }
    // timing output reconstructed; Razii reports the times in ms below
    cout << 1000.0 * (clock() - start) / CLOCKS_PER_SEC << " ms" << endl;
}

If I add "delete test;" to this loop it gets faster. Huh? What's the
explanation for this?

2156 ms

and after I add delete test; to the loop

1781 ms

why is that?

Due to caching at various levels of the memory hierarchy, accesses to
recently referenced virtual addresses are often a lot faster than
accesses to new ones. The original C++ code requested 10,000,000
distinct Test-sized memory allocations with no reuse. With "delete
test;" the memory allocator can reissue the same piece of memory for
each "new" operation.

In addition, the sheer amount of memory being allocated in the original
C++ program may have required some system calls to get additional
allocatable memory.

The JVM is free to reuse the virtual memory previously occupied by an
unreachable Test object, so the version with "delete test;" is a bit
more comparable to the Java program.
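
In loop form, the change Razii describes (my reconstruction) is just:

for (int i = 0; i <= 10000000; i++) {
    Test *test = new Test(i);
    if (i % 5000000 == 0)
        cout << test;
    delete test;   // frees the block, so the allocator can hand the
                   // same, cache-warm memory back on the next iteration
}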

This illustrates the basic problem with snippet benchmarks. In modern
computers the performance of small operations depends on their context.
Taking them out of context is not realistic.

Patricia
 

Juha Nieminen

Stefan said:
»Your essay made me remember an interesting phenomenon I
saw in one system I worked on. There were two versions of
it, one in Lisp and one in C++. The display subsystem of
the Lisp version was faster. There were various reasons,
but an important one was GC: the C++ code copied a lot of
buffers because they got passed around in fairly complex
ways, so it could be quite difficult to know when one
could be deallocated. To avoid that problem, the C++
programmers just copied.

I suppose you can compare incompetent C++ programmers with Lisp
programmers (in the sense that Lisp may lead, by its very nature, to
efficient code without the need to create complicated designs). However,
generalizing from this that Lisp produces faster code than C++ is a bit
unfair.
A lot of us thought in the 1990s that the big battle would
be between procedural and object oriented programming, and
we thought that object oriented programming would provide
a big boost in programmer productivity. I thought that,
too. Some people still think that. It turns out we were
wrong. Object oriented programming is handy dandy, but
it's not really the productivity booster that was
promised. The real significant productivity advance we've
had in programming has been from languages which manage
memory for you automatically.

Claiming that OOP has not improved productivity significantly is quite
far-fetched.

(Personally I feel there has been a counter-movement against OOP: in
the late '80s and early '90s OOP was the fad and the extreme hype. While
it did indeed improve productivity a lot, it was not, however, the final
silver bullet. In other words, in some ways it was a bit of a
disappointment after all that hype. This produced an odd
counter-reaction in some circles which, for some reason, can only see
what OOP did *not* deliver and close their eyes to all that it did. It's
a kind of anti-hype as a post-reaction to the hype. IMO this kind of
counter-movement is stupid and misguided.)
 

Roedy Green

I am not sure what you mean by that ... can you post the code?

You can download the source of the JVM if you sign some sort of
agreement. That may be waived now.

All Java "new" has to do is something like this in assembler:

addressOfNewObject = nextFreeSlot;               // bump the allocation pointer
nextFreeSlot += sizeOfObject;
if (nextFreeSlot > limit) { garbageCollect(); }  // out of room: collect, then retry
fill(addressOfNewObject, sizeOfObject, 0);       // zero the new object's fields
 

Roedy Green

If thread A allocates 256 bytes, and thread B races in and concurrently
attempts to allocate 128 bytes... Which thread is going to win?

If all threads share a common heap, the code for "new" would have to
be synchronised. If you have that synchronisation, you could use a
single global counter to generate a hashCode. Keep in mind we
are talking assembler here. This is the very guts of the JVM. You can
take advantage of an atomic memory-increment instruction for
very low overhead synchronisation.
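
A minimal sketch of that fast path in C++ (my illustration; the names
and the 1 MB region size are invented): a shared bump pointer advanced
with a single atomic fetch-and-add, so concurrent allocations never
take a lock:

#include <atomic>
#include <cstddef>

static char heap[1 << 20];                     // the shared allocation region
static std::atomic<std::size_t> nextFree(0);   // the global allocation cursor

// Returns size bytes, or null when the region is full (a real JVM
// would trigger a garbage collection here rather than give up).
void* allocate(std::size_t size) {
    std::size_t offset = nextFree.fetch_add(size);  // the one atomic increment
    if (offset + size > sizeof heap)
        return 0;
    return heap + offset;
}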

You could invent a JVM where each thread has its own mini-heap for
freshly created objects. Then it would not need to synchronise to
allocate an object. Long lived objects could be moved to a common
synchronised heap.

JVMs have extreme latitude to do things any way they please so long
as the virtual machine behaves in a consistent way.
 

Stefan Ram

Juha Nieminen said:
I suppose you can compare incompetent C++ programmers with Lisp

(JFTR: I did not write the paragraphs in my previous post,
but quoted them from the sources given.)
Claiming that OOP has not improved productivity significantly is quite
far-fetched.

If anyone has a proof that OOP has improved productivity,
I would be happy to hear it.

First, one needs to state which other scenario(s) it is being
compared to. One can expect that today's productivity is
higher than the productivity of 20 years ago even without OOP,
because of other improvements in the realm of software
engineering and hardware. So one needs to show that OOP has
improved productivity even beyond the improvement that would
have happened on average anyway.

OOP might have distracted minds from other approaches
that might have been even more beneficial. But we will never
learn about such alternative histories, so we cannot compare
history to them.

Some features attributed to OOP are actually also part of
other, non-OOP approaches. For example, encapsulation and the
bundling of related operations with data are features of an
ADT (abstract data type). So one also has to give a specific
definition of OOP and non-OOP before discussing its effects on
productivity.
 
