Tracking memory usage and object lifetime.

Discussion in 'Python' started by Berteun Damman, Sep 26, 2007.

  1. Hello,

    I have written a Python script that loads a graph (the
    mathematical kind, with vertices and edges) into memory, does some
    transformations on it, and then tries to find shortest paths in this
    graph, typically several tens of thousands. This works fine.

    Then I made a test for this, so I could time it, run it several times
    and take a look at the best time, et cetera. But it so happens that
    the first run of the test is always the fastest. If I track the
    memory usage of Python in top, I see it starts out at around 80 MB
    and slowly grows to 500 MB. This might cause the slowdown (which is
    about a factor of 5 for large graphs).

    When I run a test, I disable garbage collection during the test
    run (as is advised), but just before starting a test I instruct the
    garbage collector to collect. Running the test without disabling the
    garbage collector doesn't show any difference though.

    Where possible I explicitly 'del' some of the larger data structures
    that have been created after I don't need them anymore. I furthermore
    don't really see why there would be references to these larger objects
    left. (I can be mistaken of course).

    I understand this might be a bit of a vague problem, but does someone
    have any idea why the memory usage keeps growing? And whether there is
    some tool that assists me in keeping track of the objects currently
    alive and the amount of memory they occupy?

    The best I now can do is run the whole script several times (from a
    shell script) -- but this also forces Python to reparse the graph
    input again, and do some other stuff it only has to do once. And it's
    also more difficult to examine values and results this way.

    Berteun
    Berteun Damman, Sep 26, 2007
    #1

  2. Berteun Damman wrote:

    > When I run a test, I disable garbage collection during the
    > test run (as is advised), but just before starting a test I
    > instruct the garbage collector to collect. Running the test
    > without disabling the garbage collector doesn't show any difference
    > though.


    Did you check the return value of gc.collect? Also, try using
    other "insight" facilities provided by the gc module.
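
    A minimal sketch of the checks suggested here, using only functions
    from the standard gc module. The `kept` list is a hypothetical
    stand-in for whatever a test run leaves reachable; it is not taken
    from the original script.

```python
import gc

gc.collect()                       # clear out any pending garbage first
before = len(gc.get_objects())     # containers currently tracked by the collector

# Hypothetical workload: 1000 small lists that stay reachable through 'kept'.
kept = [[i] for i in range(1000)]

growth = len(gc.get_objects()) - before
found = gc.collect()               # return value: number of unreachable objects found

print("tracked objects grew by:", growth)
print("unreachable objects found:", found)
print("collection counts per generation:", gc.get_count())
print("uncollectable objects in gc.garbage:", len(gc.garbage))
```

    If the tracked-object count keeps growing while gc.collect() reports
    nothing unreachable, the objects are still reachable from somewhere,
    so the search shifts to finding what holds them.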

    > Where possible I explicitly 'del' some of the larger data
    > structures that have been created after I don't need them anymore.


    You cannot "del" structures, you only "del" names. Objects are
    deleted when they are not bound to any names when and if the
    garbage collector "wants" to delete them.
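
    The distinction between names and objects can be shown with a small
    sketch. The `Graph` class is a hypothetical placeholder. Note that in
    CPython an object without cycles is freed as soon as its reference
    count drops to zero; other implementations may free it later, which
    is why the sketch also calls gc.collect().

```python
import gc
import weakref

class Graph:
    """Hypothetical stand-in for a large data structure."""

g = Graph()
alias = g                  # a second name bound to the same object
probe = weakref.ref(g)     # observes the object without keeping it alive

del g                      # removes the name 'g' only
still_alive = probe() is not None   # 'alias' still refers to the object

del alias                  # the last strong reference is gone
gc.collect()               # not needed in CPython, harmless elsewhere
freed = probe() is None

print("alive after 'del g':", still_alive)
print("freed after 'del alias':", freed)
```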

    > I furthermore don't really see why there would be references to
    > these larger objects left. (I can be mistaken of course).


    Be sure to check for cyclic references, they can be a problem for
    the GC.
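
    A sketch of why cycles matter when automatic collection is disabled,
    as it is during the test runs described above. The `Vertex` class is
    a hypothetical example; the point is that reference counting alone
    never frees objects that refer to each other.

```python
import gc
import weakref

class Vertex:
    """Hypothetical graph vertex holding a list of neighbours."""
    def __init__(self):
        self.edges = []

gc.disable()               # mimic a timing run with automatic GC switched off
gc.collect()               # explicit collection still works while disabled

u, v = Vertex(), Vertex()
u.edges.append(v)          # u -> v
v.edges.append(u)          # v -> u, closing the cycle
probe = weakref.ref(u)

del u, v                   # both names are gone, but the cycle keeps the objects alive
leaked = probe() is not None

found = gc.collect()       # the cycle collector finds and frees them
reclaimed = probe() is None
gc.enable()

print("alive after del (GC disabled):", leaked)
print("unreachable objects found:", found)
print("reclaimed after gc.collect():", reclaimed)
```

    With the collector disabled, such garbage accumulates until the next
    explicit gc.collect().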

    Regards,


    Björn

    --
    BOFH excuse #343:

    The ATM board has run out of 10 pound notes. We are having a whip
    round to refill it, care to contribute ?
    Bjoern Schliessmann, Sep 26, 2007
    #2

  3. On Sep 26, 2:31 pm, Bjoern Schliessmann wrote:

    > Did you check the return value of gc.collect? Also, try using
    > other "insight" facilities provided by the gc module.

    gc.collect states it cannot find any unreachable objects. Meanwhile
    the number of objects the garbage collector has to keep track of keeps
    increasing.
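
    One way to see which kinds of objects are accumulating is to compare
    per-type counts of GC-tracked objects before and after a run. This is
    a sketch only: `snapshot` is a hypothetical helper, the `leak` list
    stands in for one test run, and gc.get_objects() only reports
    container objects, so plain strings and ints do not show up.

```python
import gc
from collections import Counter

def snapshot():
    """Count the objects tracked by the collector, grouped by type name."""
    return Counter(type(obj).__name__ for obj in gc.get_objects())

gc.collect()
before = snapshot()

# Hypothetical stand-in for one test run that leaves data reachable.
leak = [[i, str(i)] for i in range(500)]

after = snapshot()
growth = after - before    # Counter subtraction keeps only positive differences

for type_name, extra in growth.most_common(5):
    print(type_name, "+%d" % extra)
```

    On Python 3.4 and later, the standard tracemalloc module can
    additionally attribute allocated memory to source lines, which
    addresses the question of how much memory the live objects occupy.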

    > You cannot "del" structures, you only "del" names. Objects are
    > deleted when they are not bound to any names when and if the
    > garbage collector "wants" to delete them.

    I understand, but just before I del the name, I ask for the referrers
    to the object the name indicates, and there's only one object. Since
    it is a local variable, I think this is logical.

    This object is a dictionary which contains strings as keys, and heaps
    as values. This heap consists of tuples. Every string is referenced
    more than once (that's logical), the heaps are only referenced once.
    So I would expect those to be destroyed if I destroy the dictionary. I
    furthermore assume that if I call gc.collect() I force the garbage
    collector to collect? Even if it wouldn't "want" to collect
    otherwise?

    > Be sure to check for cyclic references, they can be a problem for
    > the GC.

    I don't see how these could occur. It's basically something like a list
    (possibly of lists) of ints/strings. No list containing itself. I'll
    see whether I can make a stripped down version which exhibits the same
    memory growth.

    Berteun
    Berteun Damman, Sep 26, 2007
    #3
  4. On Sep 26, 8:06 am, Berteun Damman wrote:

    > that have been created after I don't need them anymore. I furthermore
    > don't really see why there would be references to these larger objects
    > left. (I can be mistaken of course).


    This could be tricky because you have a graph that (probably) allows
    you to walk its nodes, thus even having a single other reference to
    any of the nodes could keep the entire graph "alive".
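
    A small sketch of this effect, assuming nodes that hold references to
    their neighbours. The `Node` class and the `build` helper are
    hypothetical; the original graph representation is not shown in the
    thread.

```python
import gc
import weakref

class Node:
    """Hypothetical graph node that knows its neighbours."""
    def __init__(self, name):
        self.name = name
        self.neighbours = []

def build():
    a, b, c = Node("a"), Node("b"), Node("c")
    a.neighbours.append(b); b.neighbours.append(a)   # a <-> b
    b.neighbours.append(c); c.neighbours.append(b)   # b <-> c
    return a, weakref.ref(c)   # keep one strong reference, plus a probe on c

stray, probe_c = build()       # 'stray' is the single leftover reference, to node a
gc.collect()
alive_with_stray = probe_c() is not None   # c is still reachable via a -> b -> c

del stray
gc.collect()                   # the nodes form cycles, so the collector must free them
alive_after = probe_c() is not None

print("c alive while one node is referenced:", alive_with_stray)
print("c alive after dropping that reference:", alive_after)
```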

    > The best I now can do is run the whole script several times (from a
    > shell script) -- but this also forces Python to reparse the graph
    > input again, and do some other stuff it only has to do once.


    You could pickle and save the graph once the initial processing is
    done. That way subsequent runs will load substantially faster.
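
    A minimal sketch of that approach with the standard pickle module.
    The adjacency-list `graph` and the file name are assumptions made for
    illustration; the real structure would be whatever the parsing step
    produces.

```python
import os
import pickle
import tempfile

# Hypothetical processed graph: vertex name -> list of neighbour names.
graph = {"a": ["b", "c"], "b": ["c"], "c": []}

path = os.path.join(tempfile.mkdtemp(), "graph.pickle")

# Save once, after the expensive parsing and transformation step.
with open(path, "wb") as fh:
    pickle.dump(graph, fh, protocol=pickle.HIGHEST_PROTOCOL)

# Later runs load the processed structure instead of reparsing the input.
with open(path, "rb") as fh:
    restored = pickle.load(fh)

print(restored == graph)
```

    Pickling recurses through nested objects, so a graph built from
    deeply linked node objects may hit the recursion limit; a flat
    mapping like the one above avoids that.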

    i.
    Istvan Albert, Sep 26, 2007
    #4
