Hunting a memory leak

Discussion in 'Python' started by Debian User, Aug 29, 2003.

  1. Debian User

    Debian User Guest

    Hi,

    I'm trying to track down a memory leak in a program of mine. I've tried
    several approaches, but the leak still refuses to show itself.

    First of all, I've tried to use the garbage collector to look for
    uncollectable objects. I used the following:

    # at the beginning of code
    import gc
    gc.enable()
    gc.set_debug(gc.DEBUG_LEAK)

    <snip>

    # at the end of code:
    print "\nGARBAGE:"
    gc.collect()

    print "\nGARBAGE OBJECTS:"
    for x in gc.garbage:
        s = str(x)
        print type(x),"\n ", s

    With that, I get no garbage objects.
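
    For comparison, here is a minimal sketch of the same check written for
    current Python 3. The Node class, the cache list and the helper names are
    hypothetical, used only for illustration. It shows what this technique
    can and cannot see: a dropped reference cycle ends up in gc.garbage,
    while an object that is still reachable from a container never does,
    even though it keeps consuming memory.

```python
import gc

class Node:
    """Hypothetical class used only to build a reference cycle."""
    def __init__(self):
        self.partner = None

def find_cyclic_garbage(workload):
    """Run workload() and return the objects the collector found unreachable."""
    gc.collect()                      # start from a clean slate
    gc.set_debug(gc.DEBUG_SAVEALL)    # keep unreachable objects in gc.garbage
    try:
        workload()
        gc.collect()
        found = list(gc.garbage)
    finally:
        gc.set_debug(0)
        del gc.garbage[:]
    return found

def make_cycle():
    a, b = Node(), Node()
    a.partner, b.partner = b, a       # a <-> b cycle, dropped on return

cache = []                            # hypothetical container that only grows

def grow_cache():
    cache.append(Node())              # still reachable: a leak, but not garbage

cycle_garbage = find_cyclic_garbage(make_cycle)
cache_garbage = find_cyclic_garbage(grow_cache)
print(sum(isinstance(o, Node) for o in cycle_garbage))
print(sum(isinstance(o, Node) for o in cache_garbage))
```

    An empty gc.garbage therefore only rules out unreachable cycles. It says
    nothing about objects that something still refers to, nor about memory
    allocated outside the interpreter.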

    Then I tried an approach that Walter Dörwald posted on the python-dev
    list. It basically consists of building a debug version of Python,
    writing a unittest around the leaking code, and modifying unittest.py
    to report the increase in the total reference count across that code
    (see http://aspn.activestate.com/ASPN/Mail/Message/python-dev/1770868).

    With that, I see that the total reference count grows by one each time
    the test executes. But the problem is: is there some way to look at
    the object that is leaking (or to make a memory dump)?

    I've used valgrind (http://developer.kde.org/~sewardj/) to see if it
    could detect the leak. In fact, it detects a bunch of them, but I am
    afraid that they are not related to the leak I'm looking for. I say
    that because, when I loop over my leaky code, valgrind always reports
    the same amount of leaked memory, independently of the number of
    iterations (while top is telling me that memory use is growing!).
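
    The observation from top can be made repeatable from inside the process.
    The following is only a sketch for Python 3 on Unix (the resource module
    is not available on Windows); peak_rss_per_round, hoard and leaky_call
    are hypothetical names. It samples the peak resident set size after each
    batch of calls. Note that ru_maxrss is reported in kilobytes on Linux and
    in bytes on macOS, and that it is a peak value, so it can only stay flat
    or rise.

```python
import resource

def peak_rss_per_round(test, rounds=5, calls_per_round=10):
    """Call test() in batches and record the process peak RSS after each batch."""
    readings = []
    for _ in range(rounds):
        for _ in range(calls_per_round):
            test()
        usage = resource.getrusage(resource.RUSAGE_SELF)
        readings.append(usage.ru_maxrss)
    return readings

hoard = []                                # hypothetical container, never emptied

def leaky_call():
    hoard.append(bytearray(1_000_000))    # retains about 1 MB per call

readings = peak_rss_per_round(leaky_call)
print(readings)
```

    Because this measures the whole process, it also reflects memory that a
    C library allocates on its own, which reference counts cannot show.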

    My code uses extension modules written in C, which I am afraid does not
    make the problem any easier. I think every malloc is correctly freed,
    but I can't be sure (however, valgrind does not detect anything wrong
    in the extension).

    I am sorry, but I cannot be more explicit about the code because it
    is quite complex (it is the PyTables package, http://pytables.sf.net),
    and I was unable to make a simple example to be published
    here. However, if anyone is tempted to have a look at the code, you
    can download it from
    (http://sourceforge.net/project/showfiles.php?group_id=63486). I am
    attaching a unittest that exposes the leak.

    I am a bit desperate. Any hint?

    Francesc Alted

    --

    # Unittest to expose the memory leak
    import sys
    import unittest
    import os
    import tempfile

    from tables import *
    # Next imports are only necessary for this test suite
    #from tables import Group, Leaf, Table, Array

    verbose = 0

    class WideTreeTestCase(unittest.TestCase):

        def test00_Leafs(self):

            import time
            maxchilds = 2
            if verbose:
                print '\n', '-=' * 30
                print "Running %s.test00_wideTree..." % \
                      self.__class__.__name__
                print "Maximum number of childs tested :", maxchilds
            # Open a new empty HDF5 file
            file = tempfile.mktemp(".h5")
            #file = "test_widetree.h5"

            fileh = openFile(file, mode = "w")
            if verbose:
                print "Children writing progress: ",
            for child in range(maxchilds):
                if verbose:
                    print "%3d," % (child),
                a = [1, 1]
                fileh.createGroup(fileh.root, 'group' + str(child),
                                  "child: %d" % child)
                # Comment the createArray call to see the leak disappear
                fileh.createArray("/group" + str(child), 'array' + str(child),
                                  a, "child: %d" % child)
            if verbose:
                print
            # Close the file
            fileh.close()


    #----------------------------------------------------------------------

    def suite():
        theSuite = unittest.TestSuite()
        theSuite.addTest(unittest.makeSuite(WideTreeTestCase))

        return theSuite


    if __name__ == '__main__':
        unittest.main(defaultTest='suite')
     
    Debian User, Aug 29, 2003
    #1

  2. Debian User writes:

    > I'm trying to discover a memory leak on a program of mine. I've taken
    > several approaches, but the leak still resists to appear.
    >
    > First of all, I've tried to use the garbage collector to look for
    > uncollectable objects.


    [snip]

    > Then I've taken an approach that I've seen in python developers list
    > contributed by Walter Dörwald, that basically consists in creating a
    > debug version of python, create a unitest with the leaking code, and
    > modify the unittest.py to extract the increment of total reference
    > counting in that code (see
    > http://aspn.activestate.com/ASPN/Mail/Message/python-dev/1770868).


    Well, somewhere in that same thread are various references to a
    TrackRefs class. Have you tried using that? It should tell you what
    type of object is leaking, which is a good start.

    > With that, I see that my reference count grows by one each time the
    > test execute. But the problem is: is there some way to look at the
    > object (or make a memory dump) that is leaking?.


    See above :)

    > I've used valgrind (http://developer.kde.org/~sewardj/) to see if it
    > could detect the leak. In fact, it detects a bunch of them, but I am
    > afraid that they are not related with the leak I'm looking for. I am
    > saying that because, when I loop over my leaky code, valgrind always
    > report the same amount of leaky memory, independently of the number of
    > iterations (while top is telling me that memory use is growing!).


    There are various things (interned strings, f'ex) that always tend to
    be alive at the end of a Python program: these are only leaks in a
    very warped sense.

    I don't know if there's a way to get valgrind to tell you what's
    allocated but not deallocated between two arbitrary points of program
    execution.
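
    For allocations made through Python's own allocators, current Python 3
    can answer the "between two points" question with the standard tracemalloc
    module (in the standard library since Python 3.4, so not an option for
    Python 2.3). A minimal sketch, in which allocation_growth, retained and
    workload are hypothetical names:

```python
import tracemalloc

def allocation_growth(workload, top=3):
    """Run workload() and return the source lines whose live allocations grew most."""
    tracemalloc.start()
    try:
        before = tracemalloc.take_snapshot()
        workload()
        after = tracemalloc.take_snapshot()
    finally:
        tracemalloc.stop()
    # StatisticDiff entries, largest absolute size change first
    return after.compare_to(before, "lineno")[:top]

retained = []                                # hypothetical container, never emptied

def workload():
    retained.append(bytearray(1_000_000))    # about 1 MB that stays alive

top_stats = allocation_growth(workload)
for stat in top_stats:
    print(stat)
```

    tracemalloc only sees memory requested through the interpreter's
    allocator hooks, so memory that a C library obtains by other means will
    not appear in these statistics.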

    > My code uses extension modules in C, so I am afraid this does not
    > contribute to alleviate the problem.


    Well, in all likelihood, the bug is IN the C extension module. Have
    you tried stepping through the code in a debugger? Sometimes that's
    a good way of spotting a logic error.

    > I am sorry, but I cannot be more explicit about the code because it
    > is quite complex (it is the PyTables package, http://pytables.sf.net),
    > and I was unable to make a simple example to be published
    > here. However, if anyone is tempted to have a look at the code, you
    > can download it from
    > (http://sourceforge.net/project/showfiles.php?group_id=63486). I am
    > attaching a unittest that exposes the leak.
    >
    > I am a bit desperate. Any hint?


    Not really. Try using TrackRefs.

    Cheers,
    mwh

    --
    I'm about to search Google for contract assassins to go to Iomega
    and HP's programming groups and kill everyone there with some kind
    of electrically charged rusty barbed thing.
    -- http://bofhcam.org/journal/journal.html, 2002-01-08
     
    Michael Hudson, Aug 29, 2003
    #2

  3. > I'm trying to discover a memory leak on a program of mine. I've taken
    > several approaches, but the leak still resists to appear.


    First, single-stepping through C code is surprisingly effective. I heartily
    recommend it.

    Here are some ideas you might use if you are truly desperate. You will have
    to do some work to make them useful in your situation.

    1. Keep track of all newly-created objects. Warning: the id trick used in
    this code is not proper because newly allocated objects can have the same
    address as old objects, so you should devise a better way by creating a more
    unique hash. Or just use the code as is and see whether the "invalid" code
    tells you something ;-)

    global lastObjectsDict
    objects = gc.get_objects()

    newObjects = [o for o in objects if not lastObjectsDict.has_key(id(o))]

    lastObjectsDict = {}
    for o in objects:
        lastObjectsDict[id(o)]=o

    2. Keep track of the number of objects.

    import gc, traceback

    # debugGC and lastObjectCount are module-level globals of the app.
    def printGc(message=None,onlyPrintChanges=False):

        if not debugGC: return None

        if not message:
            message = callerName(n=2) # Left as an exercise for the reader.

        global lastObjectCount

        try:
            n = len(gc.garbage)
            n2 = len(gc.get_objects())
            delta = n2-lastObjectCount
            if not onlyPrintChanges or delta:
                if n:
                    print "garbage: %d, objects: %+6d =%7d %s" % (
                        n,delta,n2,message)
                else:
                    print "objects: %+6d =%7d %s" % (
                        n2-lastObjectCount,n2,message)

            lastObjectCount = n2
            return delta
        except:
            traceback.print_exc()
            return None

    3. Print lots and lots of info...

    def printGcRefs (verbose=True):

        refs = gc.get_referrers(app().windowList[0])
        print '-' * 30

        if verbose:
            print "refs of", app().windowList[0]
            for ref in refs:
                print type(ref)
                if 0: # very verbose
                    if type(ref) == type({}):
                        keys = ref.keys()
                        keys.sort()
                        for key in keys:
                            val = ref[key]
                            # change the class below as needed
                            if isinstance(val,leoFrame.LeoFrame):
                                print key,ref[key]
        else:
            print "%d referrers" % len(refs)

    Here app().windowList is a key data structure of my app. Substitute your
    own as a new argument.

    Basically, Python will give you all the information you need. The problem
    is that there is way too much info, so you must experiment with filtering
    it. Don't panic: you can do it.

    4. A totally different approach. Consider this function:

    def clearAllIvars (o):

        """Clear all ivars of o, a member of some class."""

        o.__dict__.clear()

    This function will grind concrete walls into grains of sand. The GC will
    then recover each grain separately.

    My app contains several classes that refer to each other. Rather than
    tracking all the interlocking references, when it comes time to delete the
    main data structure my app simply calls clearAllIvars for the various
    classes. Naturally, some care is needed to ensure that calls are made in
    the proper order.
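
    Ideas 3 and 4 above can be tried together on a toy example. This is only
    a sketch for current Python 3, relying on CPython's reference counting;
    the Frame and Tree classes and the helper names are hypothetical. It
    first lists the types of the objects that refer to a suspect, then
    clears one side of a reference cycle and checks, through a weak
    reference, that the objects are freed without any help from the cycle
    collector.

```python
import gc
import types
import weakref

class Frame:
    """Hypothetical class; Frame and Tree refer to each other."""

class Tree:
    """Hypothetical class holding a back-reference to its Frame."""

def referrer_types(target):
    """Return the sorted type names of the objects that refer to target."""
    return sorted(type(r).__name__
                  for r in gc.get_referrers(target)
                  if not isinstance(r, types.FrameType))  # skip our own frames

def clear_all_ivars(o):
    """Drop every instance attribute of o, breaking any cycle that runs through it."""
    o.__dict__.clear()

gc.disable()                          # keep the cycle collector out of the demo
frame, tree = Frame(), Tree()
frame.tree, tree.frame = tree, frame  # frame <-> tree reference cycle
windows = [frame]                     # an extra container that refers to frame

referrers_before = referrer_types(frame)
probe = weakref.ref(tree)             # observes when tree is actually freed

clear_all_ivars(frame)                # frame no longer refers to tree
del windows, frame, tree
freed_without_gc = probe() is None
gc.enable()

print(referrers_before)
print(freed_without_gc)
```

    The exact list of referrer types depends on the Python version, which is
    one more reason to filter the output down to the types you care about.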

    HTH.

    Edward
    --------------------------------------------------------------------
    Edward K. Ream
    Leo: Literate Editor with Outlines
    Leo: http://webpages.charter.net/edreamleo/front.html
    --------------------------------------------------------------------
     
    Edward K. Ream, Aug 29, 2003
    #3
  4. On 2003-08-29, Michael Hudson wrote:
    > [snip]
     
    Francesc Alted, Aug 29, 2003
    #4
  5. [Ooops. Something went wrong with my newsreader config ;-)]

    Thanks for the responses! I started by fetching the TrackRefs() class
    from http://cvs.zope.org/Zope3/test.py and pasting it into my local copy
    of unittest.py. Then I modified the try: block of TestCase.__call__
    from the original:


    try:
        testMethod()
        ok = 1

    to read:

    try:
        rc1 = rc2 = None
        #Pre-heating
        for i in xrange(10):
            testMethod()
        gc.collect()
        rc1 = sys.gettotalrefcount()
        track = TrackRefs()
        # Second (first "valid") loop
        for i in xrange(10):
            testMethod()
        gc.collect()
        rc2 = sys.gettotalrefcount()
        print "First output of TrackRefs:"
        track.update()
        print >>sys.stderr, "%5d %s.%s.%s()" % (rc2-rc1,
            testMethod.__module__, testMethod.im_class.__name__,
            testMethod.im_func.__name__)
        # Third loop
        for i in xrange(10):
            testMethod()
        gc.collect()
        rc3 = sys.gettotalrefcount()
        print "Second output of TrackRefs:"
        track.update()
        print >>sys.stderr, "%5d %s.%s.%s()" % (rc3-rc2,
            testMethod.__module__, testMethod.im_class.__name__,
            testMethod.im_func.__name__)
        ok = 1

    However, I'm not sure my implementation is right. My understanding is
    that the first loop is a warm-up (to avoid spurious reference counts
    due to caches and the like). The second loop should already give
    reliable reference counts, which is why I call track.update() after
    it. Finally, I wanted to re-check the results of the second loop with
    a third one, so I expected more or less the same results from the
    second and third loops.

    But... the results are different! Here are the results of this run:

    $ python2.3 widetree3.py
    First output of TrackRefs:
    <type 'str'> 13032 85335
    <type 'tuple'> 8969 38402
    <type 'Cfunc'> 1761 11931
    <type 'code'> 1215 4871
    <type 'function'> 1180 5189
    <type 'dict'> 841 4897
    <type 'builtin_function_or_method'> 516 2781
    <type 'int'> 331 3597
    <type 'wrapper_descriptor'> 295 1180
    <type 'method_descriptor'> 236 944
    <type 'classobj'> 145 1092
    <type 'module'> 107 734
    <type 'list'> 94 440
    <type 'type'> 86 1967
    <type 'getset_descriptor'> 84 336
    <type 'weakref'> 75 306
    <type 'float'> 73 312
    <type 'member_descriptor'> 70 280
    <type 'ufunc'> 52 364
    <type 'instance'> 42 435
    <type 'instancemethod'> 41 164
    <class 'numarray.ufunc._BinaryUFunc'> 25 187
    <class 'numarray.ufunc._UnaryUFunc'> 24 173
    <type 'frame'> 9 44
    <type 'long'> 7 28
    <type 'property'> 6 25
    <type 'PyCObject'> 4 20
    <class 'unittest.TestSuite'> 3 31
    <type 'file'> 3 23
    <type 'listiterator'> 3 12
    <type 'bool'> 2 41
    <class 'random.Random'> 2 30
    <type '_sre.SRE_Pattern'> 2 9
    <type 'complex'> 2 8
    <type 'thread.lock'> 2 8
    <type 'NoneType'> 1 2371
    <class 'unittest._TextTestResult'> 1 16
    <type 'ellipsis'> 1 12
    <class '__main__.WideTreeTestCase'> 1 11
    <class 'tables.IsDescription.metaIsDescription'> 1 10
    <class 'unittest.TestProgram'> 1 9
    <class 'numarray.ufunc._ChooseUFunc'> 1 8
    <class 'unittest.TestLoader'> 1 7
    <class 'unittest.TrackRefs'> 1 6
    <class 'unittest.TextTestRunner'> 1 6
    <type 'NotImplementedType'> 1 6
    <class 'numarray.ufunc._PutUFunc'> 1 5
    <class 'numarray.ufunc._TakeUFunc'> 1 5
    <class 'unittest._WritelnDecorator'> 1 5
    <type 'staticmethod'> 1 4
    <type 'classmethod'> 1 4
    <type 'classmethod_descriptor'> 1 4
    <type 'unicode'> 1 4
    7 __main__.WideTreeTestCase.test00_Leafs()
    Second output of TrackRefs:
    <type 'int'> 37 218
    <type 'type'> 0 74
    212 __main__.WideTreeTestCase.test00_Leafs()
    ..
    ----------------------------------------------------------------------
    Ran 1 test in 0.689s

    OK
    [21397 refs]
    $

    As you can see, for the second loop (first output of TrackRefs), a lot
    of objects appear, but after the third loop (second output of
    TrackRefs), far fewer appear (only objects of type "int" and
    "type"). Besides, the increase in total references for the second
    loop is only 7, while for the third loop it is 212. Finally, to add
    even more confusion, these numbers are *totally* independent of the
    number of iterations I put in the loops. You see 10 in the code, but
    you can try with 100 (in one or all the loops) and you get exactly the
    same figures.

    I'm fairly sure that my implementation of the try: code block is
    wrong, but I can't figure out what's going wrong.

    I would appreciate some ideas.

    Francesc Alted
     
    Francesc Alted, Aug 29, 2003
    #5
  6. > I would appreciate some ideas.

    I doubt many people will be willing to rummage through your app's code to do
    your debugging for you. Here are two general ideas:

    1. Try to simplify the problem. Pick something, no matter how small (and
    the smaller the better) that doesn't seem to be correct and do what it takes
    to find out why it isn't correct. If TrackRefs is Python code you can hack
    that code to give you more (or less!) info. Once you discover the answer to
    one mystery, the larger mysteries may become clearer. For example, you can
    concentrate on one particular data structure, one particular data type or
    one iteration of your test suite.

    2. Try to enjoy the problem. The late great Earl Nightingale had roughly
    this advice: Don't worry. Simply consider the problem calmly, and have
    confidence that the solution will eventually come to you, probably when you
    are least expecting it. I have found that this advice really works, and
    it works for almost any problem. Finding "worthy" bugs is a creative
    process, and creativity can be and should be highly enjoyable.

    In this case, your problem is: "how to start finding my memory leaks".
    Possible answers to this problem might be various strategies for getting
    more (or more focused!) information. Then you have new problems: how to
    implement the various strategies. In all cases, the advice to be calm and
    patient applies. Solving this problem will be highly valuable to you, no
    matter how long it takes :)

    Edward

    P.S. And don't hesitate to ask more questions, especially once you have more
    concrete data or mysteries.

    EKR
    --------------------------------------------------------------------
    Edward K. Ream
    Leo: Literate Editor with Outlines
    Leo: http://webpages.charter.net/edreamleo/front.html
    --------------------------------------------------------------------
     
    Edward K. Ream, Aug 30, 2003
    #6
  7. On 2003-08-30, Edward K. Ream wrote:
    >> I would appreciate some ideas.

    >
    > I doubt many people will be willing to rummage through your app's code to do
    > your debugging for you. Here are two general ideas:


    Thanks for the words of encouragement. After the weekend I'm fresher and
    will try to follow your suggestions (and those of Earl Nightingale ;-).

    Cheers,

    Francesc Alted
     
    Francesc Alted, Sep 1, 2003
    #7
  8. Francesc Alted writes:

    >
    > As you can see, for the second loop (first output of TrackRefs), a lot
    > of objects appear, but after the third loop (second output of
    > TrackRefs), much less appear (only objects of type "int" and
    > "type"). Besides, the increment of the total references for the second
    > loop is only 7 while for the third loop is 212. Finally, to add even
    > more confusion, these numbers are *totally* independent of the number
    > of iterations I put in the loops. You see 10 in the code, but you can
    > try with 100 (in one or all the loops) and you get exactly the same
    > figures.
    >
    > I definitely think that I have made a bad implementation of the try:
    > code block, but I can't figure out what's going wrong.
    >
    > I would appreciate some ideas.


    In my experience of hunting these you want to call gc.collect() and
    track.update() *inside* the loops. Other functions you might want to
    call are things like sre.purge(), _strptime.clear_cache(),
    linecache.clearcache()... there's a seemingly unbounded number of
    caches around that can interfere.
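
    The "collect and update inside the loop" advice can be sketched as
    follows for current Python 3. This is not the TrackRefs class, only a
    simplified stand-in built on gc.get_objects(); the names snapshot,
    persistent_growth, Handle and leaky_test are hypothetical. Measuring
    after every call and keeping only the types that grow in every round
    filters out one-off growth caused by caches and by the bookkeeping
    itself.

```python
import gc
from collections import Counter

def snapshot():
    """Collect garbage, then count live gc-tracked objects by type name."""
    gc.collect()
    return Counter(type(o).__name__ for o in gc.get_objects())

def persistent_growth(test, warmup=3, rounds=5):
    """Return the type names whose object count grew in every measured round."""
    for _ in range(warmup):          # let caches fill before measuring
        test()
    previous = snapshot()
    suspects = None
    for _ in range(rounds):
        test()
        current = snapshot()         # collect and measure inside the loop
        grew = {name for name in current if current[name] > previous[name]}
        suspects = grew if suspects is None else suspects & grew
        previous = current
    return sorted(suspects)

class Handle:
    """Hypothetical object that the test below never releases."""

_open_handles = []

def leaky_test():
    _open_handles.append(Handle())   # simulated leak: one Handle per call

suspect_types = persistent_growth(leaky_test)
print(suspect_types)
```

    Like any count of Python objects, this cannot see memory that a C
    library allocates without creating Python objects.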

    Cheers,
    mwh

    --
    A difference which makes no difference is no difference at all.
    -- William James (I think. Reference anyone?)
     
    Michael Hudson, Sep 1, 2003
    #8
  9. Debian User

    Will Ware Guest

    Debian User wrote:
    > I'm trying to discover a memory leak on a program of mine...


    Several years ago, I came up with a memory leak detector that I used for
    C extensions with Python 1.5.2. This was before there were gc.* methods
    available, and I'm guessing they probably do roughly the same things.
    Still, in the unlikely event it's helpful:
    http://www.faqts.com/knowledge_base/view.phtml/aid/6006

    Now that I think of it, this might be helpful after all. With this
    approach, you're checking the total refcount at various points in the
    loop in your C code, rather than only in the Python code. Take a look
    anyway.

    Good luck
    Will Ware
     
    Will Ware, Sep 1, 2003
    #9
  10. Edward K. Ream wrote:

    > Here are two general ideas:
    >
    > 1. Try to simplify the problem. Pick something, no matter how small (and
    > the smaller the better) that doesn't seem to be correct and do what it
    > takes to find out why it isn't correct.


    Yeah... using this approach I was finally able to hunt down the leak!!!

    The problem was hidden in C code that is used to access a C library. I'm
    afraid that valgrind was unable to detect it because the underlying C
    library does not call the standard malloc to create the leaking objects.

    Of course, the Python reference counters were unable to detect it either
    (some dark spots still remain, but they are not very important).

    Anyway, thanks very much for the advice and encouragement!

    Francesc Alted
     
    Francesc Alted, Sep 2, 2003
    #10
  11. > Yeah... using this approach I was finally able to hunt the leak!!!.
    ....
    > Anyway, thanks very much for the advices and encouragement!


    You are welcome. IMO, if you can track down memory problems in C you can
    debug just about anything, with the notable exception of numeric programs.
    Debugging numeric calculations is hard, and will always remain so.

    Edward
    --------------------------------------------------------------------
    Edward K. Ream
    Leo: Literate Editor with Outlines
    Leo: http://webpages.charter.net/edreamleo/front.html
    --------------------------------------------------------------------
     
    Edward K. Ream, Sep 3, 2003
    #11