Why so many references to global variables?

J-P

Hi,

I have a Python script interacting with a specialized
C numerical library. The whole program does quite a lot
of number crunching and should be running for a couple
of hours. However, it always seems to run out of memory
after maybe 40-45 minutes (I get a MemoryError from
the Python interpreter). I wanted to see what took
that much memory, so I printed a reference count
(using sys.getrefcount()) and I was surprised to
see that a global variable called 'myflag' has literally
millions of references to it. myflag appears maybe
8 or 10 times in the script, always something like

if myflag:
    # do something
else:
    # do something else

It appears that every time the interpreter tests for
the value of 'myflag', it keeps a reference to it.
I don't know whether this has something to do with
the garbage collector not doing its job correctly
or me doing something wrong in the code, but I'd really
like to fix this thing.

Any ideas, suggestions or comments greatly appreciated,
as always.

Thanks in advance,
J-P
 
J-P

Alexander said:
What makes you think so? This seems rather unlikely to me (not that the
references themselves should eat your memory anyway!).

Well, sys.getrefcount() does tell me there are 3 million references
to it and other globals. Even though this doesn't eat up all my memory
(how large is a reference object in Python?), I definitely think there's
something fishy with keeping that many references to global variables
that appear here and there in the script.


Alexander said:
Chances are, the C extension code doesn't work correctly (C extensions to
Python code have to do memory management by hand, increasing and decreasing
reference counts for the Python objects they deal with as appropriate; so if a
bit of code forgets to decrease the refcount, the object will stay alive
forever; my guess would be that this is what happens here).

Might be, but the ref counts for the objects interacting with the C
library are pretty much what I expect them to be, i.e. a few dozen.
I don't think there are other memory leaks in the bindings to the
library. I've passed it through Purify a couple of times and everything
seems clean.


J-P
 
Erik Max Francis

J-P said:
Well, sys.getrefcount() does tell me there are 3 million references
to it and other globals. Even though this doesn't eat up all my memory
(how large is a reference object in Python?), ...

Reference counts are just maintained internally with a single number
that's incremented or decremented.

J-P said:
I definitely think there's something fishy with keeping that many
references to global variables that appear here and there in the script.

Yes, it does. It strongly suggests that the fishiness is in your C
extension.
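
To illustrate the point that a reference count is only a number stored on the object, here is a minimal sketch (CPython only; the names flag and holders are made up for the example). It creates 1000 extra references to one object and shows that the count goes up and comes back down while the object's own size stays the same.

```python
import sys

flag = object()                   # a fresh object, so nothing else refers to it
base = sys.getrefcount(flag)      # includes the temporary reference held by the call
size_before = sys.getsizeof(flag)

holders = [flag] * 1000           # 1000 more references to the very same object
grown = sys.getrefcount(flag)
size_after = sys.getsizeof(flag)  # the object itself does not get any bigger

del holders                       # dropping the list releases all 1000 references
back = sys.getrefcount(flag)

print("extra references while the list exists:", grown - base)
print("extra references after del:", back - base)
print("object size unchanged:", size_before == size_after)
```

So a count in the millions does not cost memory by itself; what it tells you is that something took references and never gave them back.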
 
Alexander Schmolck

J-P said:
Well, sys.getrefcount() does tell me there are 3 million references
to it and other globals. Even though this doesn't eat up all my memory
(how large is a reference object in Python?), I definitely think there's
something fishy with keeping that many references to global variables that
appear here and there in the script.


Erik Max Francis has hopefully already sorted your reference count confusion
out (if not, maybe a look under "reference counts" in the C API/extending bits
of the Python documentation might clarify matters), so I'll just give you a simple
practical tip:

Take one of the suspicious C extension functions, and call it repeatedly from
python (passing and returning data structures that are as large as possible
and that you discard immediately afterwards). Then using 'top' or something
equivalent, look at how the memory consumption of your program changes: if you
find that with each couple of calls python swallows a few megabytes, you can
be pretty sure that something is going wrong (at least if you force gc with
gc.collect()).
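
The test described above could be sketched roughly as follows. This is only an illustration under some assumptions: leaky_call and clean_call are hypothetical stand-ins for the real extension functions, and tracemalloc is swapped in for 'top' so the script runs anywhere. tracemalloc only sees memory obtained through Python's own allocators, so memory that the C library mallocs directly would still have to be watched with 'top' or a similar tool.

```python
import gc
import tracemalloc

_never_released = []  # stands in for references a buggy extension never drops


def leaky_call(n):
    """Hypothetical stand-in for a suspicious extension function."""
    data = [float(i) for i in range(n)]
    _never_released.append(data)  # simulates a forgotten decref: data stays alive
    return len(data)


def clean_call(n):
    """Same work, but the result is discarded properly."""
    data = [float(i) for i in range(n)]
    return len(data)


def memory_growth(func, calls=50, size=10_000):
    """Return the bytes of traced memory still held after repeated calls."""
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(calls):
        func(size)  # the return value is discarded immediately
    gc.collect()    # force a collection so that only real leaks remain
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


if __name__ == "__main__":
    print("clean_call growth (bytes):", memory_growth(clean_call))
    print("leaky_call growth (bytes):", memory_growth(leaky_call))
```

A function whose growth keeps climbing with the number of calls, even after gc.collect(), is the one to look at in the C source.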

Once you've isolated the function(s), you're in for some fun debugging the
corresponding C code, paying particular attention to Py_DECREFs and
Py_INCREFs.

J-P said:
Might be, but the ref counts for the objects interacting with the C library are
pretty much what I expect them to be, i.e. a few dozen.

I don't think there are other memory leaks in the bindings to the
library. I've passed it through Purify a couple of times and everything
seems clean.

Purify is unlikely to have a deep understanding of python's internal reference
counting (memory management) mechanism, right? So while it will be helpful for
finding memory *violations* (and leaks not due to refcounts) it's quite
unlikely to find problems due to the C extension not *decreasing reference
counts*, which is what I bet is happening in your case.
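
To make the missing-decrement scenario concrete without writing any C, here is a small simulation (CPython only). It uses ctypes.pythonapi to call Py_IncRef, the function form of Py_INCREF, from Python. buggy_extension_call is a hypothetical stand-in for extension code that takes a reference and forgets to release it; it is not taken from your bindings.

```python
import ctypes
import sys

# Py_IncRef / Py_DecRef are the function forms of the Py_INCREF / Py_DECREF macros.
_incref = ctypes.pythonapi.Py_IncRef
_incref.argtypes = [ctypes.py_object]
_incref.restype = None
_decref = ctypes.pythonapi.Py_DecRef
_decref.argtypes = [ctypes.py_object]
_decref.restype = None


def buggy_extension_call(obj):
    """Hypothetical stand-in for C code that never drops its reference."""
    _incref(obj)  # the matching Py_DECREF is "forgotten"


def leaked_refs(obj, calls):
    """Call the buggy function repeatedly and report the refcount growth."""
    before = sys.getrefcount(obj)
    for _ in range(calls):
        buggy_extension_call(obj)
    return sys.getrefcount(obj) - before


def release(obj, count):
    """Undo the simulated leak so the demo leaves the object as it found it."""
    for _ in range(count):
        _decref(obj)


myflag = ["on"]  # an ordinary object playing the role of the global
leaked = leaked_refs(myflag, 100_000)
print("references leaked:", leaked)
release(myflag, leaked)
```

Nothing here is an invalid memory access or an unfreed malloc, which is why a tool like Purify has nothing to report; the object simply can never be freed while the count stays above zero.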

'as
 
