best way to discover this process's current memory usage, cross-platform?


Alex Martelli

Having fixed a memory leak (not the leak of a Python reference, some
other stuff I wasn't properly freeing in certain cases) in a C-coded
extension I maintain, I need a way to test that the leak is indeed
fixed. Being in a hurry, I originally used a q&d hack...:


import gc
import os
import sys

# startswith() also matches the plain 'linux' reported by newer Pythons
if sys.platform.startswith('linux') or sys.platform == 'darwin':
    def _memsize():
        """ this function tries to return a measurement of how much memory
        this process is consuming, in some arbitrary unit (if it doesn't
        manage to, it returns 0).
        """
        gc.collect()
        try:
            x = int(os.popen('ps -p %d -o vsz|tail -1' % os.getpid()).read())
        except Exception:
            x = 0
        return x
else:
    def _memsize():
        return 0

Having a _memsize() function available, the test then does:

    before = _memsize()
    # a lot of repeated executions of code that should not consume
    # any net memory, but used to when the leak was there
    after = _memsize()

and checks that after==before.

However, that _memsize is just too much of a hack, and I really want to
clean it up. It's also not cross-platform enough. Besides, I got a bug
report from a user on a Linux platform different from those I had tested
myself, and it boils down to the fact that once in a while on his
machine it turns out that after is before+4 (for any large number of
repetitions of the code in the above comment) -- I'm not sure what the
unit of measure is supposed to be (maybe blocks of 512 bytes, with a page
size of 2048? whatever...), but clearly an extra page is getting used
somewhere.
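
A minimal sketch of the before/after check with a small tolerance, so that a one-off extra page (like the before+4 report above) need not fail the test while steady growth still does. The names net_growth and leaks, the measure callback, and the warmup and slack parameters are all hypothetical, introduced only for illustration; measure stands in for _memsize().

```python
def net_growth(workload, measure, repetitions=1000, warmup=10):
    """Return measure() after minus measure() before running workload.

    workload and measure are caller-supplied callables; measure stands in
    for something like _memsize(). The warmup runs let one-time
    allocations (caches, lazily touched pages) happen before the baseline.
    """
    for _ in range(warmup):
        workload()
    before = measure()
    for _ in range(repetitions):
        workload()
    return measure() - before


def leaks(workload, measure, repetitions=1000, slack=0):
    """True if growth over `repetitions` runs exceeds `slack` units."""
    return net_growth(workload, measure, repetitions) > slack
```

A real leak should scale with the number of repetitions, so comparing the growth at two different repetition counts is one way to tell it apart from a constant offset.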

So, I thought I'd turn to the "wisdom of crowds"... how would YOU guys
go about adding to your automated regression tests one that checks that
a certain memory leak has not recurred, as cross-platform as feasible?
In particular, how would you code _memsize() "cross-platformly"? (I can
easily use C rather than Python if needed, adding it as an auxiliary
function for testing purposes to my existing extension).


TIA,

Alex
 

Steven D'Aprano

Not sure if I should start a new thread or not, but
since this is closely related, I'll just leave it as is.

Alex said:
Having fixed a memory leak (not the leak of a Python reference, some
other stuff I wasn't properly freeing in certain cases) in a C-coded
extension I maintain, I need a way to test that the leak is indeed
fixed.

I would like to investigate how much memory is used by
Python objects. My motive is 98% pure intellectual
curiosity and 2% optimization.

I wonder whether I can do something like this:

obj = something()
bytes_used = sizeof(obj)

(obviously there is no built-in function sizeof...
wait, let me check... nope, not a built-in)

I've read the docs for gc and pdb and nothing stands
out to me as doing anything like this.
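
For readers on later versions: Python 2.6 and up provide sys.getsizeof, which reports the size in bytes of a single object without following its references. A rough sketch of a recursive variant follows; deep_sizeof is a hypothetical helper, it only descends into the common built-in containers, and the numbers it gives are approximate.

```python
import sys


def deep_sizeof(obj, _seen=None):
    """Approximate total size in bytes of obj plus the objects it contains."""
    seen = set() if _seen is None else _seen
    if id(obj) in seen:          # count shared or recursive objects once
        return 0
    seen.add(id(obj))
    size = sys.getsizeof(obj)    # shallow size of this object alone
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += deep_sizeof(key, seen) + deep_sizeof(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += deep_sizeof(item, seen)
    return size
```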
 

Alex Martelli

Steven D'Aprano said:
Not sure if I should start a new thread or not, but
since this is closely related, I'll just leave it as is.



I would like to investigate how much memory is used by
Python objects. My motive is 98% pure intellectual
curiosity and 2% optimization.

I believe that's the purpose of the PySizer project (one of the "Google
Summer of Code" projects), which was recently announced on this group
(I'm sure any search engine will be able to direct you to it, anyway).

I have not checked it out, because my purpose is different -- mine is
not a Python-related leak at all, just a leak within C code (which
happens coincidentally to be a Python extension module).


Alex
 

Neal Norwitz

Alex said:
So, I thought I'd turn to the "wisdom of crowds"... how would YOU guys
go about adding to your automated regression tests one that checks that
a certain memory leak has not recurred, as cross-platform as feasible?
In particular, how would you code _memsize() "cross-platformly"? (I can
easily use C rather than Python if needed, adding it as an auxiliary
function for testing purposes to my existing extension).

If you are doing Unix, can you use getrusage(2)?

    import resource
    r = resource.getrusage(resource.RUSAGE_SELF)
    print(r[2:5])

I get zeroes on my gentoo amd64 box. Not sure why. I thought maybe it
was Python, but C gives the same results.
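
A note for later readers, hedged accordingly: the Linux getrusage manpage lists ru_maxrss as maintained since kernel 2.6.32, so on newer systems at least that one field is non-zero. It is the peak resident set size, not the current size, and its unit differs by platform (kilobytes on Linux, bytes on macOS). A small sketch, with peak_rss_bytes as a hypothetical helper name:

```python
import resource
import sys


def peak_rss_bytes():
    """Peak resident set size of this process in bytes (0 if not reported)."""
    usage = resource.getrusage(resource.RUSAGE_SELF)
    # ru_maxrss is in kilobytes on Linux but in bytes on macOS.
    scale = 1 if sys.platform == 'darwin' else 1024
    return usage.ru_maxrss * scale
```

Because this is a high-water mark it can never go down, so it can show that memory grew during a test but not that it was released again.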

Another possibility is to call sbrk(0) which should return the top of
the heap. You could then return this value and check it. It requires
a tiny C module, but should be easy and work on most unixes. You can
determine the direction the heap grows by comparing it with id(0) which
should have been allocated early in the interpreter's life.

I realize this isn't perfect as memory becomes fragmented, but might
work. Since 2.3 and beyond use pymalloc, fragmentation may not be much
of an issue, as memory is allocated in a big hunk, then doled out as
necessary.
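
For quick experiments the sbrk(0) probe can also be tried without building anything. The sketch below swaps in ctypes (in the standard library since Python 2.5) for the tiny C module mentioned above; program_break is a hypothetical name, and the assumption is a Unix-like system whose C library still exports sbrk.

```python
import ctypes


def program_break():
    """Return sbrk(0), the current top of the heap, as an int (None if unavailable)."""
    try:
        libc = ctypes.CDLL(None)        # symbols already loaded in this process
        sbrk = libc.sbrk
    except (OSError, AttributeError, TypeError):
        return None                     # no sbrk here (e.g. Windows)
    sbrk.restype = ctypes.c_void_p
    sbrk.argtypes = [ctypes.c_ssize_t]  # stands in for intptr_t
    address = sbrk(0)
    if address is None or address == ctypes.c_void_p(-1).value:
        return None                     # sbrk reported failure
    return address
```

Calling it before and after a workload and comparing the two addresses is the check described above; whether the value moves at all depends on how the platform's allocator obtains memory.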

These techniques could apply to Windows with some caveats. If you are
interested in Windows, see:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnucmg/html/UCMGch09.asp

Can't think of anything fool-proof though.

HTH,
n
 

MrJean1

My suggestion would also be to use sbrk() as it provides a high-water
mark for the memory usage of the process.

Below is the function hiwm() I used on Linux (RedHat). MacOS X and
Unix versions are straightforward. Not sure about Windows.

/Jean Brouwers

#include <stddef.h>   /* size_t, needed by every branch */

#if _LINUX
#include <malloc.h>

size_t hiwm (void) {
    /* info.arena - number of bytes allocated
     * info.hblkhd - size of the mmap'ed space
     * info.uordblks - number of bytes used (?)
     */
    struct mallinfo info = mallinfo();
    size_t s = (size_t) info.arena + (size_t) info.hblkhd;
    return (s);
}

#elif _MACOSX || _UNIX
#include <unistd.h>

size_t hiwm (void) {
    size_t s = (size_t) sbrk(0);
    return (s);
}

#elif _WINDOWS
size_t hiwm (void) {
    size_t s = (size_t) 0; /* ??? */
    return (s);
}

#endif
 

Alex Martelli

Neal Norwitz said:
If you are doing Unix, can you use getrusage(2)?

On Unix, I could; on Linux, nope. According to man getrusage on Linux,

"""
The above struct was taken from BSD 4.3 Reno. Not all fields are
meaningful under Linux. Right now (Linux 2.4, 2.6) only the
fields ru_utime, ru_stime, ru_minflt, ru_majflt, and ru_nswap are
maintained.
"""
and indeed the memory-usage parts are zero.

Neal Norwitz said:
    import resource
    r = resource.getrusage(resource.RUSAGE_SELF)
    print(r[2:5])

I get zeroes on my gentoo amd64 box. Not sure why. I thought maybe it
was Python, but C gives the same results.

Yep -- at least, on Linux, this misbehavior is clearly documented in the
manpage; on Darwin, aka MacOSX, you _also_ get zeros but there is no
indication in the manpage leading you to expect that.

Unfortunately I don't have any "real Unix" box around -- only Linux and
Darwin... I could try booting up OpenBSD again to check that it works
there, but given that I know it doesn't work under the most widespread
unixoid systems, it wouldn't be much use anyway, sigh.

Neal Norwitz said:
Another possibility is to call sbrk(0) which should return the top of
the heap. You could then return this value and check it. It requires
a tiny C module, but should be easy and work on most unixes.

As I said, I'm looking for leaks in a C-coded module, so it's no problem
to add some auxiliary C code to that module to help test it --
unfortunately, this approach doesn't work, see below...

Neal Norwitz said:
You can determine the direction the heap grows by comparing it with id(0)
which should have been allocated early in the interpreter's life.

I realize this isn't perfect as memory becomes fragmented, but might
work. Since 2.3 and beyond use pymalloc, fragmentation may not be much
of an issue, as memory is allocated in a big hunk, then doled out as
necessary.

But exactly because of that, sbrk(0) doesn't mean much. Consider the
tiny extension which I've just uploaded to
http://www.aleax.it/Python/memtry.c -- it essentially exposes a type
that does malloc when constructed and free when freed, and a function
sbrk0 which returns sbrk(0). What I see on my MacOSX 10.4, Python
2.4.1, gcc 4.1, is (with a little auxiliary memi.py module that does

    from memtry import *
    import os
    def memsiz():
        return int(os.popen('ps -p %d -o vsz|tail -1' % os.getpid()).read())

)...:

Helen:~/memtry alex$ python -ic 'import memi'
40900

See? While the process's memory size grows as expected (by 500+ "units"
when allocating one meg, confirming the hypothesis that a unit is
2Kbyte), sbrk(0) just doesn't budge.

As the MacOSX "man sbrk" says,
"""
The brk and sbrk functions are historical curiosities left over from
earlier days before the advent of virtual memory management.
"""
and apparently it's now quite hard to make any USE of those quaint
oddities, in presence of any attempt, anywhere in any library linked
with the process, to do some "smart" memory allocation &c.

These techniques could apply to Windows with some caveats. If you are
interested in Windows, see:
http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnucmg/html/UCMGch09.asp

Can't think of anything fool-proof though.

Fool-proof is way beyond what I'm looking for now -- I'd settle for
"reasonably clean, works in Linux, Mac and Windows over 90% of the time,
and I can detect somehow when it isn't working";-)


Since people DO need to keep an eye on their code's memory consumption,
I'm getting convinced that the major functional lack in today's Python
standard library is some minimal set of tools to help with that task.
PySizer appears to be a start in the right direction (although it may be
at too early a stage to make sense for the standard library of Python
2.5), but (unless I'm missing something about it) it won't help with
memory leaks not directly related to Python. Maybe we SHOULD have some
function in sys to return the best guess at current memory consumption
of the whole process, implemented by appropriate techniques on each
platform -- right now, though, I'm trying to find out which these
appropriate techniques are on today's most widespread unixoid systems,
Linux and MacOSX. (As I used to be a Win32 API guru in a previous life,
I'm confident that I can find out about _that_ platform by sweating
enough blood on MSDN -- problem here is I don't have any Windows machine
with the appropriate development system to build Python, so testing
would be pretty hard, but maybe I can interest somebody who DOES have
such a setup...;-)


Alex
 

Alex Martelli

MrJean1 said:
My suggestion would also be to use sbrk() as it provides a high-water
mark for the memory usage of the process.

That's definitely what I would have used in the '70s -- nowadays, alas,
it ain't that easy.

MrJean1 said:
Below is the function hiwm() I used on Linux (RedHat). MacOS X and
Unix versions are straightforward. Not sure about Windows.

The MacOSX version using sbrk is indeed straightforward, it just doesn't
work. See my response to Neal's post and my little Python extension
module at http://www.aleax.it/Python/memtry.c -- on a Mac (OSX 10.4,
Python 2.4.1, gcc 4.1) sbrk(0) returns the same value as the process's
virtual memory consumption goes up and down (as revealed by ps). As the
MacOSX's manpage says, "The brk and sbrk functions are historical
curiosities left over from earlier days before the advent of virtual
memory management."

Guess I'll now try the linux version you suggest, with mallinfo:
#if _LINUX
#include <malloc.h>

size_t hiwm (void) {
    /* info.arena - number of bytes allocated
     * info.hblkhd - size of the mmap'ed space
     * info.uordblks - number of bytes used (?)
     */
    struct mallinfo info = mallinfo();
    size_t s = (size_t) info.arena + (size_t) info.hblkhd;
    return (s);
}

and see if and how it works.

I do wonder why both Linux and MacOSX "implemented" getrusage, which
would be the obviously right way to do it, as such a useless empty husk
(as far as memory consumption is concerned). Ah well!-(


Alex
 

MrJean1

For some more details on Linux' mallinfo, see
<ftp://gee.cs.oswego.edu/pub/misc/malloc.h> and maybe function mSTATs()
in glibc/malloc/malloc.c (RedHat).

/Jean Brouwers
 
 

Alex Martelli

matt said:
Perhaps you could extend Valgrind (http://www.valgrind.org) so it works
with python C extensions? (x86 only)

Alas, if it's x86 only I won't even look into the task (which does sound
quite daunting as the way to solve the apparently-elementary question
"how much virtual memory is this process using right now?"...!), since I
definitely cannot drop support for all PPC-based Macs (nor would I WANT
to, since they're my favourite platform anyway).


Alex
 

MrJean1

This may work on MacOS X. An initial, simple test does yield credible
values.

However, I am not a MacOS X expert. It is unclear which field of the
malloc_statistics_t struct to use and how malloc_zone_statistics with
zone NULL accumulates the stats for all zones.

/Jean Brouwers

#if _MACOSX
#include <malloc/malloc.h>
/* typedef struct malloc_statistics_t {
       unsigned blocks_in_use;
       size_t size_in_use;
       size_t max_size_in_use; -- high water mark of touched memory
       size_t size_allocated; -- reserved in memory
   } malloc_statistics_t;
*/
size_t hiwm (size_t since)
{
    size_t s;
    malloc_statistics_t t;
    /* get cumulative (?) stats for all zones */
    malloc_zone_statistics(NULL, &t);
    s = t.size_allocated; /* or t.max_size_in_use? */
    return (s - since);
}
#endif
 

Alex Martelli

MrJean1 said:
This may work on MacOS X. An initial, simple test does yield credible
values.

Definitely looks promising, thanks for the pointer.

MrJean1 said:
However, I am not a MacOS X expert. It is unclear which field of the
malloc_statistics_t struct to use and how malloc_zone_statistics with
zone NULL accumulates the stats for all zones.

It appears that all of this stuff is barely documented (if at all), not
just online but also in books on advanced MacOS X programming. Still, I
can research it further, since, after all, the opendarwin sources ARE
online. Thanks again!


Alex
 

Neal Norwitz

Alex said:
Alas, if it's x86 only I won't even look into the task (which does sound
quite daunting as the way to solve the apparently-elementary question
"how much virtual memory is this process using right now?"...!), since I
definitely cannot drop support for all PPC-based Macs (nor would I WANT
to, since they're my favourite platform anyway).

Valgrind actually runs on PPC (32 only?) and amd64, but I don't think
that's the way to go for this problem.

Here's a really screwy thought that I think should be portable to all
Unixes which have dynamic linking: LD_PRELOAD.

You can create your own version of malloc (and friends) and free. You
intercept each call to malloc and free (by making use of LD_PRELOAD),
keep track of the info (pointers and size) and pass the call along to
the real malloc/free. You then have all information you should need.
It increases the scope of the problem, but I think it makes it soluble
and somewhat cross-platform. Using LD_PRELOAD requires the app be
dynamically linked, which shouldn't be too big of a deal. If you are
using C++, you can hook into new/delete directly.
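
The bookkeeping half of this idea can be modelled in a few lines. The sketch below is only a Python model of the ledger, not an interposer: a real LD_PRELOAD shim would be written in C and would call hooks like these from its malloc and free wrappers. AllocationLedger and its method names are hypothetical.

```python
class AllocationLedger:
    """Toy model of the bookkeeping an interposed malloc/free would keep."""

    def __init__(self):
        self.live = {}            # pointer -> size of each outstanding block

    def on_malloc(self, pointer, size):
        """Record a block handed out by the real malloc."""
        self.live[pointer] = size

    def on_free(self, pointer):
        """Forget a block; free(NULL) and unknown pointers are ignored."""
        self.live.pop(pointer, None)

    def bytes_outstanding(self):
        """Total size of blocks allocated but not yet freed."""
        return sum(self.live.values())
```

A leak test would compare bytes_outstanding() before and after the workload; anything still in live afterwards points at the blocks that were never freed.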

n
 

Paul Boddie

Neal said:
Valgrind actually runs on PPC (32 only?) and amd64, but I don't think
that's the way to go for this problem.

+1 for understatement of the week.

Neal said:
Here's a really screwy thought that I think should be portable to all
Unixes which have dynamic linking. LD_PRELOAD.

Similar work is described here:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/leak.html

On the subject of memory statistics, I'm surprised no-one has mentioned
"top" in this thread (as far as I'm aware): I would have thought such
statistics would have been available to "top" and presented by that
program.

Paul
 

Nicola Larosa

Paul Boddie said:
On the subject of memory statistics, I'm surprised no-one has mentioned
"top" in this thread (as far as I'm aware): I would have thought such
statistics would have been available to "top" and presented by that
program.

Talking about "top", this article may be useful:

On measuring memory usage
http://www.kdedevelopers.org/node/1445

--
Nicola Larosa - (e-mail address removed)

....Linux security has been better than many rivals. However, even
the best systems today are totally inadequate. Saying Linux is
more secure than Windows isn't really addressing the bigger issue
- neither is good enough. -- Alan Cox, September 2005
 

Jack Diederich

Neal Norwitz said:
Valgrind actually runs on PPC (32 only?) and amd64, but I don't think
that's the way to go for this problem.

Here's a really screwy thought that I think should be portable to all
Unixes which have dynamic linking: LD_PRELOAD.

You can create your own version of malloc (and friends) and free. You
intercept each call to malloc and free (by making use of LD_PRELOAD),
keep track of the info (pointers and size) and pass the call along to
the real malloc/free. You then have all information you should need.
It increases the scope of the problem, but I think it makes it soluble
and somewhat cross-platform. Using LD_PRELOAD requires the app be
dynamically linked, which shouldn't be too big of a deal. If you are
using C++, you can hook into new/delete directly.

Electric Fence[1] uses the LD_PRELOAD method. I've successfully used it to
track down leaks in a python C extension. If you look at the setup.py in
probstat[2] you'll see
    #libraries = ["efence"] # uncomment to use ElectricFence
which is a holdover from developing.

-Jack

[1] http://perens.com/FreeSoftware/ElectricFence/
[2] http://probstat.sourceforge.net/
 

Alex Martelli

Paul Boddie said:
+1 for understatement of the week.


Similar work is described here:

http://www.hpl.hp.com/personal/Hans_Boehm/gc/leak.html

Interesting considerations. Taking a step back, it does feel a bit as
if the amount of infrastructure needed for a process to ask about its
resource consumption is out of whack, though -- I don't understand why
Unix-like systems such as Linux and Darwin can't just fully support some
call such as getrusage. Ah well...

Paul Boddie said:
On the subject of memory statistics, I'm surprised no-one has mentioned
"top" in this thread (as far as I'm aware): I would have thought such
statistics would have been available to "top" and presented by that
program.

It seems to me that top, like ps and other platform-dependent programs
such as vmmap on Darwin (MacOSX), tend to be at the very least owned by
group kmem and setgid, if not simply setuid root, because the way they
do their job is rooting through /dev/kmem and that requires privileges.
To let a Python extension module know how much VM the process is
currently using, we'd have to have the executable itself for Python be
setgid kmem or setuid root, which somehow doesn't seem appealing;-)

On MacOSX specifically, I've been pointed to an open-source third-party
utility named MemoryCell which does manage to learn about VM use for any
process w/o needing to be setuid or setgid. It does so in a module
that's 300+ lines of ObjectiveC, so it would require quite a bit of
reverse engineering to integrate into a pure-C Python extension, but at
least it serves as proof of existence;-)


Alex
 

sjdevnull

Neal said:
Here's a really screwy thought that I think should be portable to all
Unixes which have dynamic linking. LD_PRELOAD.

You can create your own version of malloc (and friends) and free. You
intercept each call to malloc and free (by making use of LD_PRELOAD),
keep track of the info (pointers and size) and pass the call along to
the real malloc/free. You then have all information you should need.

That'll only get you memory usage from malloc/free calls, which could
be vastly less than the process' memory usage in plausible scenarios
(e.g. a media player that uses mmap() to read the file, or anything
that uses large shared memory segments generated with mmap() or SysV
IPC, etc).

In the real world, malloc() and mmap() are probably sufficient to get a
good picture of process usage for most processes. But I guess defining
exactly what counts as the process' current memory would be a starting
place (specifically how to deal with shared memory).
 

Alex Martelli

sjdevnull said:
That'll only get you memory usage from malloc/free calls, which could
be vastly less than the process' memory usage in plausible scenarios
(e.g. a media player that uses mmap() to read the file, or anything
that uses large shared memory segments generated with mmap() or SysV
IPC, etc).

True. But hopefully a cross-platform program's memory leaks will mostly
be based on malloc (it couldn't use SysV's IPC and still be
cross-platform, for example; and while mmap might be a possibility,
perhaps it might be tracked by a similar trick as malloc might).

sjdevnull said:
In the real world, malloc() and mmap() are probably sufficient to get a
good picture of process usage for most processes. But I guess defining
exactly what counts as the process' current memory would be a starting
place (specifically how to deal with shared memory).

Considering that the main purpose is adding regression tests to confirm
that a hopefully-fixed memory leak does not recur, I'm not sure why
shared memory should be a problem. What scenarios would "leak shared
memory"? If some shared library gets loaded once and stays in memory
that doesn't appear to me as something that would normally be called "a
memory leak" -- unless I'm failing to see some cross-platform scenario
that would erroneously re-load the same library over and over again,
taking up growing amounts of shared memory with time?


Alex
 
