How to optimize and monitor garbage collection?

kj

I'm designing a system that will be very memory hungry unless it
is "garbage-collected" very aggressively.

In the past I have had disappointing results with the gc module:
I noticed practically no difference in memory usage with and without
it. It is possible, however, that I was not measuring memory
consumption adequately.

What's the most accurate way to monitor memory consumption in a
Python program, and thereby ensure that gc is working properly?

Also, are there programming techniques that will result in better
garbage collection? For example, would it help to explicitly call
del on objects that should be gc'd?

TIA!

~kj
 
Steve Holden

> What's the most accurate way to monitor memory consumption in a
> Python program, and thereby ensure that gc is working properly?

Trust me, it is. But don't forget that CPython doesn't actually *use*
the garbage collector until you start to create cyclic data structures.
In their absence reference counting alone is sufficient to ensure that
unused object memory is reclaimed.
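
To make the distinction concrete, here is a minimal sketch, assuming CPython 3 and using a hypothetical Node class that is not from the thread. It checks with weak references whether an object is still alive: the acyclic object is gone as soon as its name is deleted, while the two objects that refer to each other survive until gc.collect() runs.

```python
import gc
import weakref


class Node:
    """Hypothetical plain object that can refer to another object."""

    def __init__(self):
        self.other = None


def demo():
    gc.disable()  # keep automatic cycle collection from running mid-demo
    try:
        # No cycle: reference counting frees the object when the name goes away.
        a = Node()
        a_ref = weakref.ref(a)
        del a
        freed_by_refcount = a_ref() is None

        # Cycle: each node holds the other, so the counts never reach zero.
        b, c = Node(), Node()
        b.other, c.other = c, b
        b_ref = weakref.ref(b)
        del b, c
        survives_del = b_ref() is not None

        # An explicit collection finds the unreachable cycle and frees it.
        gc.collect()
        freed_by_gc = b_ref() is None
    finally:
        gc.enable()
    return freed_by_refcount, survives_del, freed_by_gc


print(demo())
```

On CPython all three values should come out True. Other implementations (PyPy, Jython) do not use reference counting, so the first result is not guaranteed there.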

regards
Steve

[aside: in general, if you think your program is not working because of
a bug in Python, look harder at your program].
 
Terry Reedy

> I'm designing a system that will be very memory hungry unless it
> is "garbage-collected" very aggressively.
>
> In the past I have had disappointing results with the gc module:
> I noticed practically no difference in memory usage with and without
> it. It is possible, however, that I was not measuring memory
> consumption adequately.
>
> What's the most accurate way to monitor memory consumption in a
> Python program, and thereby ensure that gc is working properly?
>
> Also, are there programming techniques that will result in better
> garbage collection? For example, would it help to explicitly call
> del on objects that should be gc'd?

Python the language is not much concerned with memory. For an
interpreter running on a computer, there are four memory sizes to be
considered: the virtual memory assigned to the process; the physical
memory assigned to the process; the physical memory used by Python
objects; and the memory used by 'active' objects accessible from
program code. As far as I know, the OS can only see and report on the
first and/or second.

If the gap between the second and third (assigned and used physical
memory) is large and includes 'blocks' that are totally unused, the
interpreter *may* be able to return such blocks. But do not count on it.
When this gap expands because the program deletes objects without
returning blocks, people get fooled by OS reports of assigned memory not
shrinking (even though used memory is).
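
One way to look at the third number (memory used by Python objects) instead of the OS figures is the standard-library tracemalloc module. This is only a sketch, assuming Python 3.4 or later, which is newer than this thread. The traced_sizes helper and its default size are made up for illustration. tracemalloc counts only allocations made through Python's allocators after start() is called.

```python
import tracemalloc


def traced_sizes(n=200_000):
    """Return traced bytes before, during, and after holding a large list."""
    tracemalloc.start()
    try:
        baseline, _ = tracemalloc.get_traced_memory()

        data = [str(i) for i in range(n)]  # allocate many small objects
        held, _ = tracemalloc.get_traced_memory()

        del data  # reference counting frees the list and its strings
        released, peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return baseline, held, released, peak


baseline, held, released, peak = traced_sizes()
print(f"baseline={baseline} held={held} released={released} peak={peak}")
```

The traced figure should fall back toward the baseline after the del, even if the process size reported by the OS stays where it was.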

CPython tries to minimize the gap between all objects and active objects
with both reference counting and cyclic garbage collection (gc). Yes,
you can help this along by judicious use of del and gc.collect. The goal
should be to minimize the maximum active memory size. Reusing large
arrays (rather than deleting and creating a new one) can sometimes help
by avoiding fragmentation of allocated memory.
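
As a rough illustration of the reuse idea, the sketch below allocates one bytearray up front and overwrites it for each chunk rather than creating a new buffer per iteration. The sum_chunks function, its parameters, and the byte-summing "work" are hypothetical stand-ins. Whether this helps in a real program depends on the allocator and the workload, so measure before relying on it.

```python
def sum_chunks(chunks, size):
    """Sum the bytes of each chunk while reusing a single buffer."""
    buf = bytearray(size)  # allocated once, reused for every chunk
    totals = []
    for chunk in chunks:
        n = len(chunk)
        if n > size:
            raise ValueError("chunk does not fit in the reusable buffer")
        buf[:n] = chunk  # same-length slice assignment: no resize, no new buffer
        totals.append(sum(memoryview(buf)[:n]))  # view avoids copying the slice
    del buf  # drop the only reference once the buffer is no longer needed
    return totals


print(sum_chunks([b"\x01\x02", b"\x03"], size=4))
```

The explicit del at the end matters little here because the function returns right away. It is more useful in a long-running function where the large object would otherwise stay alive while later allocations happen.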

Terry Jan Reedy
 
Aahz

> [aside: in general, if you think your program is not working because of
> a bug in Python, look harder at your program].

Good advice, I certainly agree with you. But sometimes it's not so
simple. Right now, my company is running up against a problem with
CherryPy because cPickle.dumps() doesn't release the GIL. You could
argue that it's our fault for using threads, and you could also argue
that we're having problems because our server has heavy memory
contention (a dumps() that takes a couple of seconds on a laptop takes
more than thirty seconds on the server).

Nevertheless, I think it's also arguably a bug that dumps() doesn't
release the GIL. (cPickle.dump() *does* release the GIL.)

(Fortunately, we're savvy enough that it's easy for us to just make a
local copy of cPickle that releases the GIL. Much easier than finding
the problem in the first place...)
 
