Queue cleanup

Aahz

I've repeatedly asked, both here and elsewhere, why reference counting
isn't "real" garbage collection. Nobody has been able to give me a
satisfactory answer. As far as I can tell, it's a bit of pretentiousness
with no basis in objective fact.

You'll notice that I was very careful to qualify my statement with "as
most people think of it". Also, because CPython has two different memory
management mechanisms, refcounting and cycle detection, and the module
that controls cycle detection is called "gc", I think it's simpler to
follow along with the Python docs -- and critically important to remind
people that there are in fact two different systems.
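
You can watch the two systems divide the work from pure Python (CPython
assumed):

import gc

class Node(object):
    pass

a = Node()
b = Node()
a.partner = b
b.partner = a          # a reference cycle: refcounts can never reach zero
del a, b
print(gc.collect())    # the cycle detector finds and frees the pair

c = Node()
del c                  # acyclic: refcounting reclaims it immediately,
                       # no collector pass required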
 
Aahz

On the other hand, the reason that CPython still has reference counting
is that the alternatives tried so far are unacceptably slow for non-threaded
code.

No, it's *a* reason, the other main reason being that refcounting is much
easier for a random C library.
 
Hans Mulder

Steven said:
I could be wrong, but how can they not be subject to the same performance
issue? If you have twenty thousand components that all have to be freed,
they all have to be freed whether you do it when the last reference is
cleared, or six seconds later when the gc does a sweep.

Parallelizable garbage collectors have performance issues, but they're
not the same issues as mark&sweep collectors have. Parallelizable GCs
break up their work in a zillion little pieces and allow the VM to do
some real work after each piece. They won't free your twenty thousand
components all in one go and you won't have that embarrassing pause.

Parallelizable garbage collectors require some careful coordination
between the GC and the VM. This takes CPU time, so on the whole they're
slower than traditional garbage collectors. So instead of unpredictable
embarrassing pauses, you have a VM that's consistently slow.
For some applications consistency is more important than raw speed and
for these applications parallelizable GCs are an improvement.
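
A toy sketch of the idea (not any real collector's code): keep an explicit
worklist for the mark phase and process at most a fixed budget of objects
per step, handing control back to the VM in between.

def mark_step(worklist, marked, budget=100):
    # One bounded slice of marking; the VM does real work between calls.
    for _ in range(budget):
        if not worklist:
            return True                # marking done; reclamation can follow
        obj = worklist.pop()
        for child in obj.children:     # assumes objects list their references
            if child not in marked:
                marked.add(child)
                worklist.append(child)
    return False                       # budget spent; resume on the next step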

HTH,

-- HansM
 
Paul Rubin

Hans Mulder said:
Parallelizable garbage collectors have performance issues, but they're
not the same issues as mark&sweep collectors have. Parallelizable GCs
break up their work in a zillion little pieces and allow the VM to do
some real work after each piece. They won't free your twenty thousand
components all in one go and you won't have that embarrassing pause.

Quibble: parallel GC just means any GC that runs in multiple threads
simultaneously. A GC that guarantees no pauses (more technically,
specifies a constant such that any GC pause is guaranteed to be shorter
than the constant) is called a "real time" GC, and real-time GC's are
usually single threaded. Parallel real-time GC's were once sort of a
holy grail, though workable ones have been implemented. GHC for example
currently uses a GC that is parallel (runs on multiple cores for speed)
but is not real-time (there can be a pause), and I think the Sun JVM is
the same way.

These days I think the GC pause issue is overrated except for real-time
control applications. GC's no longer frequently make the program freeze
up for seconds at a time or anything like that. It was worse in the old
days when CPU's were slower and memory was scarcer. Serious GC's are
usually generational, with "minor" GC's taking microseconds and less
frequent "major" GC's taking fractions of a second. So in an
interactive program or animation on your desktop you might notice a
little hiccup once in a while. For something like a web server an
occasional slight variation in response time could easily be random
network delays and so forth.
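
CPython's own cycle collector is generational as well, and the minor/major
split is visible from the interpreter:

import gc

print(gc.get_threshold())   # e.g. (700, 10, 10): allocations before a
                            # generation-0 pass, gen-0 passes per gen-1
                            # pass, gen-1 passes per gen-2 pass
print(gc.get_count())       # current allocation counts per generation
gc.collect(0)               # cheap "minor" pass over the youngest objects
gc.collect(2)               # full "major" pass over all three generations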
 
Paul Rubin

Steven D'Aprano said:
You can add cycle detection to a reference count gc, at the cost of more
complexity.

But then it's not purely a refcount gc. ;)
If you read the Wikipedia article I linked to, tracing algorithms can
also be unsound: [describes "conservative" gc]

Yeah, whether that counts as a real GC is subjective too.
Yes. And?

I thought I made it clear: the purpose of gc is to improve programmer
productivity and program reliability by relieving the programmer from
the burden of memory allocation bookkeeping. If the programmer has to
painstakingly manage refcounts by hand and the resulting programs are
still prone to refcount bugs (both of which are the case with CPython),
it's not really gc in a way that lives up to the name.
That's a problem with the CPython API, not reference counting. The
problem is that the CPython API is written at too low a level, beneath
that at which the garbage collector exists, so naturally you have to
manually manage memory.

Care to give an example of a reference counted system that's written any
other way?
On the other hand, tracing gcs have their own set of problems too, mostly
due to the use of finalizers and attempts to make garbage collection run
more predictably. See here:

I think it's fairly common wisdom that finalizers and gc don't interact
very well. Finalizers in general aren't in the gc spirit, which says
the system should give the illusion that every object stays around
forever. As for stuff like hash tables, a usual way to help out the gc
is by populating the hash table with weak references, rather than by
clearing it out manually when you're done with it. It's also possible
for fancy compilers to use a mixture of gc and stack- or region-based
allocation.
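
In Python the weak-reference trick looks something like this (a minimal
sketch; the Image class and the expensive load it stands in for are made up):

import weakref

class Image(object):
    def __init__(self, name):
        self.name = name

_cache = weakref.WeakValueDictionary()

def load(name):
    img = _cache.get(name)
    if img is None:
        img = Image(name)      # stands in for an expensive load
        _cache[name] = img
    return img

Each entry disappears on its own once the rest of the program drops its last
strong reference to the value; nobody ever has to clear the table by hand.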
Tracing garbage collectors aren't a panacea. They're software themselves,
and complex software, which means they're subject to bugs like the one ...

You could say the same thing about compilers instead of gc's. The idea
in both cases is yes, they're complex, but they put the complexity in
one place, so that the application program can rely on the complicated
gc or compiler features while itself staying simple.
 
Paul Rubin

Steven D'Aprano said:
I could be wrong, but how can they not be subject to the same performance
issue? If you have twenty thousand components that all have to be freed,
they all have to be freed whether you do it when the last reference is
cleared, or six seconds later when the gc does a sweep.

GC's on large machines these days do not sweep at all. They copy the
live data to a new heap, then release the old heap. Because there is
usually more garbage than live data, copying GC's that never touch the
garbage are usually faster than any scheme involving freeing unused
objects one by one.
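
The idea in miniature (a toy model, with dicts standing in for heap objects
and names standing in for pointers):

def copy_collect(roots, heap):
    # heap maps names to objects; each object lists the names it points to
    # under "refs".  Copy everything reachable from `roots` into a fresh
    # to-space, then drop the old heap wholesale.
    new_heap = {}
    worklist = []

    def forward(name):
        if name not in new_heap:
            new_heap[name] = dict(heap[name])   # evacuate into to-space
            worklist.append(name)
        return name

    for name in roots:
        forward(name)
    while worklist:                 # the classic Cheney scan
        obj = new_heap[worklist.pop()]
        obj["refs"] = [forward(n) for n in obj.get("refs", [])]
    return new_heap                 # the old heap is released in one go

Twenty thousand dead objects cost nothing here: the collector never visits
them, it only copies what is still alive.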
 
Steven D'Aprano

Care to give an example of a reference counted system that's written any
other way?

The complexity of the ref counter is invisible when writing pure Python
code, and I believe it is also invisible when writing code in Cython. The
difficulty of dealing with ref counts is abstracted away.

The quip that "it will work if we always do this" really means that it won't
work is a nice one, but it doesn't stand up. It's possible to defeat
even Pascal compilers' type checking and do unsafe things by dropping
down into assembly. So don't do it! It's not hard to not do something,
although sometimes you might choose to do it anyway, because otherwise
there is no way to get the code you need/want. But such code should be a
rare exception.

I'm not saying that ref counting systems can avoid incrementing and
decrementing the ref counts. That would be silly. But I'm saying that it
is an accident of implementation that writing C extensions requires you
to manually manage the counts yourself. That's a side-effect of
permitting coders to write C extensions in pure C, rather than in some
intermediate language which handles the ref counts for you but otherwise
compiles like C. If people cared enough, they could (probably) make the C
compiler handle it for them, just as it currently handles incrementing
and decrementing the return stack.

There's nothing *fundamental* to the idea of ref counting that says that
you must handle the counts manually -- it depends on how well insulated
you are from the underlying machine.
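
For what it's worth, the GetBufferInfo function posted elsewhere in this
thread shrinks to a couple of lines once the counts are handled for you. The
snippet below is ordinary Python, and unchanged it is also valid Cython,
where the generated C contains every Py_INCREF/Py_DECREF the author never
wrote:

def get_buffer_info(from_array):
    # all reference counting happens in machinery the author never sees
    addr, length = from_array.buffer_info()
    return addr, length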
 
Lawrence D'Oliveiro

I'd say [reference-counting is] not real gc because 1) it's unsound
(misses reference cycles), and 2) it requires constant attention from the
mutator to incr and decr the reference counts. So developing modules for
the CPython API means endlessly finding and fixing refcount bugs that lead
to either crashes/security failures, or memory leaks.

I don’t see why that should be so. It seems a very simple discipline to
follow: initialize, allocate, free. Here’s an example snippet from my DVD
Menu Animator <http://github.com/ldo/dvd_menu_animator>:

static void GetBufferInfo
  (
    PyObject * FromArray,
    unsigned long * addr,
    unsigned long * len
  )
  /* returns the address and length of the data in a Python array object. */
  {
    PyObject * TheBufferInfo = 0;
    PyObject * AddrObj = 0;
    PyObject * LenObj = 0;
    do /*once*/
      {
        TheBufferInfo = PyObject_CallMethod(FromArray, "buffer_info", "");
        if (TheBufferInfo == 0)
            break;
        AddrObj = PyTuple_GetItem(TheBufferInfo, 0);
        LenObj = PyTuple_GetItem(TheBufferInfo, 1);
        if (PyErr_Occurred())
            break;
        Py_INCREF(AddrObj);
        Py_INCREF(LenObj);
        *addr = PyInt_AsUnsignedLongMask(AddrObj);
        *len = PyInt_AsUnsignedLongMask(LenObj);
        if (PyErr_Occurred())
            break;
      }
      while (false);
    Py_XDECREF(AddrObj);
    Py_XDECREF(LenObj);
    Py_XDECREF(TheBufferInfo);
  } /*GetBufferInfo*/

It’s quite easy to assure yourself that this is never going to leak memory.
More complicated examples can simply nest constructs like these one within
the other to arbitrary depth, while still giving the same assurance at every
level. In short, this technique scales well.
If you program the Java JNI or a typical Lisp FFI, you'll find that real
gc is a lot simpler to use since you avoid all the refcount maintenance
hassles. You allocate memory and shut your eyes, and the gc takes care of
freeing it when it figures out that you are done.

And how do you run such an application? You have to limit it to a
predetermined amount of memory to begin with, otherwise it would easily
gobble up everything you have.

In the old days of “classic” MacOS, every application had to run in a
fixed-size application heap. I have no wish to return to those days.
 
Paul Rubin

Lawrence D'Oliveiro said:
I don’t see why that should be so. It seems a very simple discipline to
follow: initialize, allocate, free. Here’s an example snippet from my DVD
Menu Animator <http://github.com/ldo/dvd_menu_animator>:

In practice it has been a problem. If it hasn't happened to you yet,
you're either burning a bunch of effort that programmers of more
automatic systems can put to more productive uses, or else you just
haven't written enough such code to have gotten bitten yet.
And how do you run such an application? You have to limit it to a
predetermined amount of memory to begin with, otherwise it would easily
gobble up everything you have.

No, that's usually not a problem -- the runtime system (generational gc)
can figure out enough from your allocation pattern to prevent the heap
from getting overlarge.
 
Paul Rubin

Lawrence D'Oliveiro said:
static void GetBufferInfo
  ( ...
    do /*once*/
      {
        TheBufferInfo = PyObject_CallMethod(FromArray, "buffer_info", "");
        if (TheBufferInfo == 0)
            break;
        AddrObj = PyTuple_GetItem(TheBufferInfo, 0);
        LenObj = PyTuple_GetItem(TheBufferInfo, 1);
        if (PyErr_Occurred())
            break;
        ...
        Py_INCREF(AddrObj);
        Py_INCREF(LenObj);
      }
      while (false);
    Py_XDECREF(AddrObj);
    Py_XDECREF(LenObj);
    Py_XDECREF(TheBufferInfo);
  } /*GetBufferInfo*/

It’s quite easy to assure yourself that this is never going to leak memory.

Actually that code looks suspicious. Suppose in

AddrObj = PyTuple_GetItem(TheBufferInfo, 0);
LenObj = PyTuple_GetItem(TheBufferInfo, 1);

the first PyTuple_GetItem succeeds and the second one fails. Then
AddrObj is a borrowed reference to the first tuple element and LenObj is
null, the error flag is set, so you break out of the do/while. You then
decrement the refcount of AddrObj even though you didn't increment it.
Maybe there's an explanation that makes it ok somehow, but it still
looks messy. This is the kind of problem I'm referring to in general.
 
Paul Rubin

Steven D'Aprano said:
I'm not saying that ref counting systems can avoid incrementing and
decrementing the ref counts. That would be silly. But I'm saying that it
is an accident of implementation that writing C extensions requires you
to manually manage the counts yourself. That's a side-effect of
permitting coders to write C extensions in pure C, rather than in some
intermediate language which handles the ref counts for you but otherwise
compiles like C. If people cared enough, they could (probably) make the C
compiler handle it for them, just as it currently handles incrementing
and decrementing the return stack.

I guess that is how the so-called smart pointers in the Boost C++
template library work. I haven't used them so I don't have personal
experience with how convenient or reliable they are, or what kinds of
constraints they imposed on programming style. I've always felt a bit
suspicious of them though, and I seem to remember Alex Martelli (I hope
he shows up here again someday) advising against using them.

I don't think a C compiler could really manage automatic decrementing
while still being C. Think especially of the common style of exception
handling in C using longjmp.
 
Steven D'Aprano

I don't think a C compiler could really manage automatic decrementing
while still being C. Think especially of the common style of exception
handling in C using longjmp.

You might very well be right. But that's the problem with C -- it's too
low-level a language to expect the compiler to protect you much. Or at
all. There will always be some use cases for managing memory yourself, or
even managing the return stack (as you can do in Forth, for example), and
so there will always need to be some sort of high-level assembler like C.
But it astounds me that in 2010 people still routinely use C for normal,
everyday application programming.
 
Gregory Ewing

Paul said:
These days I think the GC pause issue is overrated except for real-time
control applications.

Also for games, which are a fairly common application
these days. Even a few milliseconds can be too long when
you're trying to achieve smooth animation.

I'd be disappointed if CPython ditched refcounting and
then became unsuitable for real-time games as a result.
 
Lawrence D'Oliveiro

Actually that code looks suspicious. Suppose in

AddrObj = PyTuple_GetItem(TheBufferInfo, 0);
LenObj = PyTuple_GetItem(TheBufferInfo, 1);

the first PyTuple_GetItem succeeds and the second one fails.

Admittedly, I did take a shortcut here: array.buffer_info returns a tuple of
two items, so I’m not expecting one GetItem to succeed and the other to
fail.
 
Lawrence D'Oliveiro

In practice it has been a problem.

Maybe people need to learn to write code in a more structured fashion.
If it hasn't happened to you yet, you're either burning a bunch of effort
that programmers of more automatic systems can put to more productive
uses ...

What makes you say that? Avoiding bugs is not a “productive use”?
No that's usually not a problem-- the runtime system (generational gc)
can figure out enough from your allocation pattern to prevent the heap
from getting overlarge.

And yet Java apps, for example, are (in)famous for excessive memory usage
compared to those written in non-GC-dependent languages.
 
Paul Rubin

Lawrence D'Oliveiro said:
Admittedly, I did take a shortcut here: array.buffer_info returns a tuple of
two items, so I’m not expecting one GetItem to succeed and the other to
fail.

FromArray is a parameter to the function, with no type check to make
sure it's really an array. In fact your code allows for the possibility
that it doesn't support the buffer_info operation (if I understand the
purpose of the null return check after the PyObject_CallMethod) which
means it's prepared for the argument to -not- be an array. In that case
maybe it's some other object with a "buffer_info" operation that returns
a 1-element tuple. If the function is callable from Python code, then
that arg type is completely out of the C code's control. Even if it's
only callable from C, you're still depending on not one but two
different invariants (that the arg is an array, and that
array.buffer_info returns a 2-tuple) that are undocumented and unchecked
in the function. I cannot agree with your claim that the approach
scales.
 
Paul Rubin

Lawrence D'Oliveiro said:
What makes you say that? Avoiding bugs is not a “productive use”?

Avoiding any particular bug through constant and pervasive vigilance is
far less productive than using a system where causing that particular
type of bug is impossible to begin with. IMO the code you posted has
latent bugs as discussed in the other post. It might work at the moment
you checked it in but it is brittle. I wouldn't have signed off on it
in a code review.
And yet Java apps, for example, are (in)famous for excessive memory
usage compared to those written in non-GC-dependent languages.

I think that may mostly be an issue with the bloated nature of most Java
apps. Certainly Lisp systems have run in production for decades on
machines with much less memory than we would consider acceptable these
days for any substantial Python app.

It's probably true that copying-style gc is more memory hungry than
Python's refcount system. Mark-sweep gc should have comparable memory
consumption and better speed than refcounting, though less speed than
copying.

JHC (experimental Haskell compiler) recently started using a mixture of
gc and region inference. It will be interesting to see how that works
out.
 
Paul Rubin

Gregory Ewing said:
Also for games, which are a fairly common application
these days. Even a few milliseconds can be too long when
you're trying to achieve smooth animation.

The usual hack with games is you do a major gc when the user advances
between game levels. You can do minor gc's during the screen refresh
interval.
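
In CPython terms that hack might look like this (a sketch; refcounting still
reclaims acyclic garbage immediately while the cycle collector is off):

import gc

gc.disable()            # no cycle-collection pauses during gameplay

def on_level_complete():
    gc.collect()        # pay for the "major" pass where a pause is harmless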
I'd be disappointed if CPython ditched refcounting and
then became unsuitable for real-time games as a result.

Refcounting is susceptible to the same pauses for reasons already
discussed.
 
Aahz

The complexity of the ref counter is invisible when writing pure Python
code, and I believe it is also invisible when writing code in Cython. The
difficulty of dealing with ref counts is abstracted away.

That's not completely true. You know perfectly well that it's almost
trivially easy to leak memory with refcounting, and there are certain
ways in which Python leaks memory invisibly if you don't know how it
works. One recent example at work was when someone was arguing with me
about whether we were leaking file handles and I had to prove that you
could leak file handles if you didn't clean up exceptions -- but that
cleaning up the exception *did* close the file handles.

(This code was originally written in Python 2.4, and partly because of
this we are making more of a push to use "with"....)
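
A minimal sketch of that situation (Python 2 semantics, invented file name):
the saved traceback keeps the failed call's frame alive, and the frame keeps
its locals -- including the open file -- alive with it:

import sys

def parse(path):
    f = open(path)                # assume the file exists but holds junk
    return int(f.read())         # ValueError; `f` lives on in the dead frame

try:
    parse("data.txt")
except ValueError:
    pass                          # the traceback, and the handle, linger

sys.exc_clear()                   # Python 2: dropping the exception state
                                  # frees the frame, the refcount hits zero,
                                  # and the file finally closes

def parse_with(path):
    with open(path) as f:         # `with` closes it deterministically instead
        return int(f.read())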

If you're restricting your claim just to the actual management of the
reference counter, you're correct, but it's especially not clear that
your second sentence is so restricted.
 
Lawrence D'Oliveiro

JHC (experimental Haskell compiler) recently started using a mixture of
gc and region inference. It will be interesting to see how that works
out.

That’s what people have been saying about garbage collection for about half
a century: “this new experimental technique will solve all the problems,
just you wait and see”.

Meanwhile, real-world programmers get on to writing real-world code that is
productive and efficient on real-world systems.
 
