Python Memory Manager

Discussion in 'Python' started by Pie Squared, Feb 17, 2008.

  1. Pie Squared

    Pie Squared Guest

    I've been looking at the Python source code recently, more
    specifically trying to figure out how its garbage collector works.

    I've gathered that it uses refcounting as well as some cycle-detection
    algorithms, but I haven't been able to figure out some other things.

    Does Python actually have a single 'heap' where all the data is
    stored? Because PyObject_HEAD seemed to imply to me it was just a
    linked list of objects, although perhaps I didn't understand this
    correctly.

    Also, if it does, how does it deal with memory segmentation? This
    question bothers me because I've been trying to implement a moving
    garbage collector, and am not sure how to deal with updating all
    program pointers to objects on the heap, and thought perhaps an answer
    to this question would give me some ideas. (Also, if you know any
    resources for things like this, I'd be grateful for links/names)

    Thanks in advance,

    Pie Squared
    Pie Squared, Feb 17, 2008
    #1

  2. Pie Squared

    Paul Rubin Guest

    Pie Squared <> writes:
    > Also, if it does, how does it deal with memory segmentation? This
    > question bothers me because I've been trying to implement a moving
    > garbage collector, and am not sure how to deal with updating all
    > program pointers to objects on the heap, and thought perhaps an answer
    > to this question would give me some ideas.


    As I understand it, Python primarily uses reference counting, with a
    mark and sweep scheme for cycle breaking tacked on as an afterthought.
    It doesn't move objects in memory during GC so you can get
    fragmentation. It's probably not feasible to change this in CPython
    without extensive rewriting of CPython and maybe a lot of external C
    modules.
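
    For concreteness, a minimal sketch of both mechanisms using only the
    standard library (gc, sys, weakref); the exact counts and timing are
    CPython implementation details:

    import gc
    import sys
    import weakref

    gc.disable()                # keep automatic collection out of the way

    class Node:
        pass

    a = Node()
    print(sys.getrefcount(a))   # at least 2: 'a' plus the call's argument

    # Build a reference cycle: refcounts alone can never reach zero here.
    b = Node()
    a.partner = b
    b.partner = a
    watcher = weakref.ref(a)    # lets us observe when 'a' is truly gone

    del a, b
    print(watcher() is None)    # False: the cycle keeps both objects alive
    gc.collect()                # the cycle detector finds and frees them
    print(watcher() is None)    # True: the cycle has been reclaimed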

    > (Also, if you know any
    > resources for things like this, I'd be grateful for links/names)


    If you mean about GC in general, the old book by Jones and Lins is
    still standard, I think.
    Paul Rubin, Feb 17, 2008
    #2

  3. Pie Squared

    Pie Squared Guest

    On Feb 17, 1:57 pm, Paul Rubin <http://> wrote:
    > Pie Squared <> writes:
    > > Also, if it does, how does it deal with memory segmentation? This
    > > question bothers me because I've been trying to implement a moving
    > > garbage collector, and am not sure how to deal with updating all
    > > program pointers to objects on the heap, and thought perhaps an answer
    > > to this question would give me some ideas.

    >
    > As I understand it, Python primarily uses reference counting, with a
    > mark and sweep scheme for cycle breaking tacked on as an afterthought.
    > It doesn't move objects in memory during GC so you can get
    > fragmentation. It's probably not feasible to change this in CPython
    > without extensive rewriting of CPython and maybe a lot of external C
    > modules.
    >
    > > (Also, if you know any
    > > resources for things like this, I'd be grateful for links/names)

    >
    > If you mean about GC in general, the old book by Jones and Lins is
    > still standard, I think.


    Thanks for the quick reply!

    That answered my question, and I'll check out the book you're
    referring to - it's exactly what I need, I think. Again, thanks!
    Pie Squared, Feb 17, 2008
    #3
  4. Pie Squared

    Guest

    I researched this for some Java I wrote. Try to avoid shuffling
    physical memory - you'll write a lot less code and it will be faster,
    too.

    Use an "allocated" list and an "available" list. Keep them in address
    order. Inserting (moving list elements from insertion point to end)
    and deleting (vice-versa) are near-zero cost operations on Intel
    boxes. (Two millis to move a million ints at 1GHz 5 years ago when I
    wrote http://www.martinrinehart.com/articles/repz.html - probably half
    that today.)

    The worst choice is the "best fit" allocation algorithm. (Grabbing
    most of a free block leaves a probably useless small fragment. Grab
    from the first big piece you find.) Circular first-fit is probably best.
    (Testing was for compiler-type applications.)

    When space is freed, insert link in ordered chain of free space
    blocks. Then combine with prev/next blocks if they are free.
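
    As a toy sketch of that allocated/available-list scheme (plain Python,
    first fit plus coalescing; block "addresses" are just integer offsets
    into an imaginary arena, nothing like a real allocator's internals):

    HEAP_SIZE = 1024

    free_list = [(0, HEAP_SIZE)]    # free blocks as (offset, size), address order
    allocated = {}                  # offset -> size

    def alloc(size):
        """First fit: carve the request out of the first block big enough."""
        for i, (off, blk) in enumerate(free_list):
            if blk >= size:
                if blk == size:
                    free_list.pop(i)
                else:
                    free_list[i] = (off + size, blk - size)
                allocated[off] = size
                return off
        raise MemoryError("no free block of %d bytes" % size)

    def free(off):
        """Return a block and coalesce it with adjacent free neighbours."""
        size = allocated.pop(off)
        i = 0
        while i < len(free_list) and free_list[i][0] < off:
            i += 1                  # find the insertion point, keep address order
        free_list.insert(i, (off, size))
        if i + 1 < len(free_list) and off + size == free_list[i + 1][0]:
            _, nxt = free_list.pop(i + 1)               # merge with next block
            free_list[i] = (off, size + nxt)
            size += nxt
        if i > 0 and sum(free_list[i - 1]) == off:
            prev_off, prev_size = free_list.pop(i - 1)  # merge with previous block
            free_list[i - 1] = (prev_off, prev_size + size)

    a, b, c = alloc(100), alloc(200), alloc(50)
    free(b); free(a)                # the two freed blocks coalesce
    print(free_list)                # [(0, 300), (350, 674)]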

    And when you find the function in Python that matches Java's
    System.arraycopy(), please tell me what it's called. I'm sure there
    must be one.
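
    For what it's worth, the closest everyday Python analogue is probably
    slice assignment - at the C level CPython uses memcpy/memmove for it.
    A tiny sketch of the correspondence:

    # roughly what System.arraycopy(src, src_pos, dst, dst_pos, n) does
    src = list(range(10))
    dst = [0] * 10
    src_pos, dst_pos, n = 2, 5, 4
    dst[dst_pos:dst_pos + n] = src[src_pos:src_pos + n]
    print(dst)    # [0, 0, 0, 0, 0, 2, 3, 4, 5, 0]
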
    , Feb 17, 2008
    #4
  5. Pie Squared

    Steve Holden Guest

    Paul Rubin wrote:
    > Pie Squared <> writes:
    >> Also, if it does, how does it deal with memory segmentation? This
    >> question bothers me because I've been trying to implement a moving
    >> garbage collector, and am not sure how to deal with updating all
    >> program pointers to objects on the heap, and thought perhaps an answer
    >> to this question would give me some ideas.

    >
    > As I understand it, Python primarily uses reference counting, with a
    > mark and sweep scheme for cycle breaking tacked on as an afterthought.
    > It doesn't move objects in memory during GC so you can get
    > fragmentation. It's probably not feasible to change this in CPython
    > without extensive rewriting of CPython and maybe a lot of external C
    > modules.
    >
    >> (Also, if you know any
    >> resources for things like this, I'd be grateful for links/names)

    >
    > If you mean about GC in general, the old book by Jones and Lins is
    > still standard, I think.


    You also need to be aware that there are certain allocation classes
    which use arenas, areas of storage dedicated to specific object types.
    When those objects are destroyed their space is returned to the arena,
    but the arena is either never returned to the free pool or only returned
    to it when the last allocated item is collected.

    There seem to be more of those kinds of tricks coming up in 2.6 and 3.0.
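
    A tiny (and deliberately unscientific) way to watch that kind of reuse
    from Python code - this relies on CPython's per-type free lists, so it
    is an implementation detail and nothing should ever depend on it:

    x = 3.14159
    addr = id(x)          # in CPython, the address of the float's slot
    del x
    y = 2.71828           # often dropped into the slot just vacated
    print(id(y) == addr)  # frequently True on CPython, never guaranteed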

    regards
    Steve
    --
    Steve Holden +1 571 484 6266 +1 800 494 3119
    Holden Web LLC http://www.holdenweb.com/
    Steve Holden, Feb 17, 2008
    #5
  6. Pie Squared

    Pie Squared Guest

    On Feb 17, 3:05 pm, wrote:
    > I researched this for some Java I wrote. Try to avoid shuffling
    > physical memory - you'll write a lot less code and it will be faster,
    > too.
    >
    > Use an "allocated" list and an "available" list. Keep them in address
    > order. Inserting (moving list elements from insertion point to end)
    > and deleting (vice-versa) are near-zero cost operations on Intel
    > boxes. ( Two millis to move a million ints at 1GHz 5 years ago when I
    > wrote http://www.martinrinehart.com/articles/repz.html - probably half
    > that today.)


    It seems to me that another, perhaps better strategy, would be to
    allocate a large heap space, then store a pointer to the base of the
    heap, the current heap size, and the beginning of the free memory.
    When you need to 'allocate' more room, just return a pointer to some
    location in the heap and increment the start-of-free-memory pointer.
    That way, allocation really IS free, more or less. Wouldn't that be
    more efficient? Perhaps I'm missing something.
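
    A toy sketch of that bump-pointer idea (the catch, as later replies
    point out, is deallocation - you can only reclaim everything at once,
    or bring in a moving/copying collector to evacuate the live data):

    class BumpAllocator:
        def __init__(self, size):
            self.heap = bytearray(size)   # one big pre-allocated block
            self.top = 0                  # start of free memory

        def alloc(self, nbytes):
            if self.top + nbytes > len(self.heap):
                raise MemoryError("arena exhausted - time to collect")
            offset = self.top
            self.top += nbytes            # "allocation" is one addition
            return offset

        def reset(self):
            self.top = 0                  # frees everything in one go

    arena = BumpAllocator(1024)
    print(arena.alloc(100), arena.alloc(42))   # 0 100
    arena.reset()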

    As a side note, I'm new to Usenet, so I'm not exactly sure... are
    'tangents' like this - since this IS a Python newsgroup, after all -
    okay?

    Anyway, thanks for the suggestion.
    Pie Squared, Feb 17, 2008
    #6
  7. Pie Squared

    Christian Heimes Guest

    Pie Squared wrote:
    > I've been looking at the Python source code recently, more
    > specifically trying to figure out how its garbage collector works.
    >
    > I've gathered that it uses refcounting as well as some cycle-detection
    > algorithms, but I haven't been able to figure out some other things.


    Python uses ref counting for all objects and an additional GC for
    container objects like lists and dicts. The cyclic GC depends on ref
    counting.
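
    A short illustration of that division of labour (standard library gc
    module; the exact number reported can vary between CPython versions):

    import gc

    lst = []
    lst.append(lst)      # a list that contains itself: a one-object cycle
    del lst              # its refcount can never fall to zero on its own

    print(gc.collect())  # the cycle detector reports >= 1 unreachable object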

    > Does Python actually have a single 'heap' where all the data is
    > stored? Because PyObject_HEAD seemed to imply to me it was just a
    > linked list of objects, although perhaps I didn't understand this
    > correctly.


    In release builds PyObject_HEAD only contains the ref count and a link
    to the object type. In Py_DEBUG builds it also contains a doubly linked
    list of all allocated objects to debug reference counting bugs.

    Python uses its own optimized memory allocator. Be sure you have read
    Objects/obmalloc.c!

    > Also, if it does, how does it deal with memory segmentation? This
    > question bothers me because I've been trying to implement a moving
    > garbage collector, and am not sure how to deal with updating all
    > program pointers to objects on the heap, and thought perhaps an answer
    > to this question would give me some ideas. (Also, if you know any
    > resources for things like this, I'd be grateful for links/names)


    I don't think it's possible to implement a moving GC. You'd have to
    change some fundamental parts of Python.

    Christian
    Christian Heimes, Feb 17, 2008
    #7
  8. Pie Squared

    Paul Rubin Guest

    Pie Squared <> writes:
    > It seems to me that another, perhaps better strategy, would be to
    > allocate a large heap space, then store a pointer to the base of the
    > heap, the current heap size, and the beginning of the free memory.
    > When you need to 'allocate' more room, just return a pointer to some
    > location in the heap and increment the start-of-free-memory pointer.
    > That way, allocation really IS free, more or less. Wouldn't that be
    > more efficient? Perhaps I'm missing something.


    The problem here is that with a high allocation rate, you have to GC a lot
    more often, which typically involves copying live data. So now you
    have to figure out how to reduce the amount of copying using
    generational schemes and (these days) come up with ways to make all this
    work in the presence of parallel threads, etc. It sounds like you're
    new to the subject, so for now I'll summarize the situation by saying
    Python uses a fairly simple scheme that's a reasonable match for its
    implementation as a small, medium performance interpreter; however,
    higher performance GC'd language implementations end up doing much
    more complicated things. See the Jones and Lins book that I
    mentioned; there's also an even older one by Appel, and the Wikipedia
    article on garbage collection may have some links you can look at.
    Also, Structure and Interpretation of Computer Programs (full text
    online) includes a simple implementation of a copying GC, that might
    be more tutorial.
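
    To make the copying idea concrete, here is a toy semispace (Cheney-style)
    collector over a simulated heap of cons cells, in the spirit of the SICP
    chapter mentioned above - values are ('int', n) or ('ptr', heap index),
    and it has nothing to do with how CPython itself works:

    HEAP_CELLS = 8

    class Heap:
        def __init__(self):
            self.space = [None] * HEAP_CELLS
            self.free = 0                         # bump pointer

        def cons(self, car, cdr, roots):
            if self.free == len(self.space):
                self.collect(roots)               # run the copying GC
                if self.free == len(self.space):
                    raise MemoryError("heap full even after GC")
            self.space[self.free] = [car, cdr]
            self.free += 1
            return ('ptr', self.free - 1)

        def collect(self, roots):
            """Copy everything reachable from roots into a fresh to-space."""
            to_space = [None] * HEAP_CELLS
            free = 0
            forward = {}                          # old index -> new index

            def copy(value):
                nonlocal free
                if value is None or value[0] != 'ptr':
                    return value                  # immediates are not moved
                old = value[1]
                if old not in forward:            # not evacuated yet
                    forward[old] = free
                    to_space[free] = list(self.space[old])
                    free += 1
                return ('ptr', forward[old])

            for i, root in enumerate(roots):      # evacuate the roots first
                roots[i] = copy(root)
            scan = 0                              # Cheney's scan pointer
            while scan < free:                    # then fix up copied cells
                cell = to_space[scan]
                cell[0], cell[1] = copy(cell[0]), copy(cell[1])
                scan += 1
            self.space, self.free = to_space, free

    heap = Heap()
    roots = [None]
    roots[0] = heap.cons(('int', 1), None, roots)   # the only live cell
    for i in range(20):                             # allocate lots of garbage
        heap.cons(('int', i), None, roots)
    print(heap.free)   # 7: of 21 cells allocated, the dead ones were reclaimed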

    > As a side note, I'm new to Usenet, so I'm not exactly sure... are
    > 'tangents' like this - since this IS a Python newsgroup, after all - okay?


    It's reasonably on topic, compared with last week's long digression
    about the dimensional analysis of the Kessel Run.
    Paul Rubin, Feb 17, 2008
    #8
  9. Pie Squared

    thebjorn Guest

    On Feb 17, 10:01 pm, Pie Squared <> wrote:
    [...]
    > It seems to me that another, perhaps better strategy, would be to
    > allocate a large heap space, then store a pointer to the base of the
    > heap, the current heap size, and the beginning of the free memory.
    > When you need to 'allocate' more room, just return a pointer to some
    > location in the heap and increment the start-of-free-memory pointer.
    > That way, allocation really IS free, more or less. Wouldn't that be
    > more efficient? Perhaps I'm missing something.


    Deallocation?

    > As a side note, I'm new to Usenet, so I'm not exactly sure... are
    > 'tangents' like this - since this IS a Python newsgroup, after all -
    > okay?


    It varies depending on the group, c.l.py is pretty tolerant as long as
    it's interesting ;-)

    To bring it back to Python, I was under the impression that the GC was
    a generational collector and not a simple mark-sweep, but it's been a
    while since I read about it...
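
    For reference, CPython's cycle collector is generational - three
    generations, with thresholds that can be inspected and tuned from the
    gc module:

    import gc

    print(gc.get_threshold())   # (700, 10, 10) by default
    print(gc.get_count())       # current collection counts per generation
    gc.collect(0)               # collect only the youngest generation
    gc.collect()                # full collection across all generations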

    -- bjorn
    thebjorn, Feb 17, 2008
    #9
  10. Pie Squared

    Martin v. Löwis Guest

    >> Also, if it does, how does it deal with memory segmentation? This
    >> question bothers me because I've been trying to implement a moving
    >> garbage collector, and am not sure how to deal with updating all
    >> program pointers to objects on the heap, and thought perhaps an answer
    >> to this question would give me some ideas.

    >
    > As I understand it, Python primarily uses reference counting, with a
    > mark and sweep scheme for cycle breaking tacked on as an afterthought.


    That's not exactly true, i.e. it isn't mark-and-sweep, but some similar
    scheme that allows incremental collection without write barriers. This
    particular scheme heavily relies on refcounting itself (specifically,
    an object is garbage in a certain generation when all references to
    it come from the same generation).

    As for the consequences of the scheme (i.e. no compaction), you are
    right.

    Regards,
    Martin
    Martin v. Löwis, Feb 17, 2008
    #10
  11. Pie Squared

    Paul Rubin Guest

    "Martin v. Löwis" <> writes:
    > That's not exactly true, i.e. it isn't mark-and-sweep, but some similar
    > scheme that allows incremental collection without write barriers. This
    > particular scheme heavily relies on refcounting itself (specifically,
    > an object is garbage in a certain generation when all references to
    > it come from the same generation).


    Ah, thanks. I made another post which I guess is also somewhat wrong.
    Paul Rubin, Feb 17, 2008
    #11
  12. Pie Squared

    greg Guest

    Paul Rubin wrote:

    > As I understand it, Python primarily uses reference counting, with a
    > mark and sweep scheme for cycle breaking tacked on as an afterthought.


    It's not mark-and-sweep, it's a cycle detector. It goes through
    all allocated objects of certain types, and all objects reachable
    from those objects, trying to find sets of objects whose reference
    counts are all accounted for by references within the set. Then
    it picks an arbitrary object in each set and clears all the
    references from that object. This breaks the cycle, and allows all
    the objects in it to be reclaimed by their reference counts
    dropping to zero.
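
    A toy simulation of that bookkeeping in plain Python dictionaries (the
    real code lives in Modules/gcmodule.c; object names here stand in for
    PyObject pointers):

    # object -> the objects it holds references to
    refs = {
        'a': ['b'],     # a <-> b is a cycle referenced only from within itself
        'b': ['a'],
        'c': ['d'],     # c is also referenced from outside (say, a variable)
        'd': [],
    }
    external_refs = {'c': 1}

    # start from the real refcounts: external plus internal references
    gc_refs = {obj: external_refs.get(obj, 0) for obj in refs}
    for obj, targets in refs.items():
        for t in targets:
            gc_refs[t] += 1

    # subtract every reference that comes from inside the examined set
    for obj, targets in refs.items():
        for t in targets:
            gc_refs[t] -= 1

    # whatever still has a positive count is referenced from outside; all
    # objects reachable from those survivors are alive, the rest is cyclic garbage
    alive, stack = set(), [obj for obj, n in gc_refs.items() if n > 0]
    while stack:
        obj = stack.pop()
        if obj not in alive:
            alive.add(obj)
            stack.extend(refs[obj])

    print(sorted(set(refs) - alive))   # ['a', 'b'] - the unreachable cycle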

    (There's some kind of generational scheme in there as well,
    I think, but I don't know the details.)

    In a sense, this is the opposite of mark-and-sweep. A mark-and-
    sweep collector assumes that everything is garbage unless it
    can be proven that it's not. Python's cycle detector, on the other
    hand, assumes that nothing is garbage unless it can be proven
    that it is.

    An advantage of this refcount/cycle detection scheme is that
    there is no "root set" that must be maintained at all costs.
    This makes it fairly easy to interface Python with external
    libraries that have their own notions of how to manage memory.

    > It doesn't move objects in memory during GC so you can get
    > fragmentation.


    It never moves PyObject structs, but variable-sized objects
    such as lists have their contents stored in a separate block
    of memory, which can be moved when the list is resized.
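
    That is easy to watch from Python (sys.getsizeof is new in 2.6): the
    list object's id() never changes, but the reported size jumps each
    time the separate item buffer is reallocated to a bigger block:

    import sys

    lst = []
    ident, last_size = id(lst), sys.getsizeof(lst)
    for i in range(1000):
        lst.append(i)
        size = sys.getsizeof(lst)
        if size != last_size:   # the item buffer just grew (and possibly moved)
            print(i, size)
            last_size = size
    print(id(lst) == ident)     # True: the PyObject itself never moved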

    --
    Greg
    greg, Feb 18, 2008
    #12
  13. Pie Squared

    greg Guest

    Christian Heimes wrote:
    > In release builds PyObject_HEAD only contains the ref count and a link
    > to the object type. In Py_DEBUG builds it also contains a doubly linked
    > list of all allocated objects to debug reference counting bugs.


    There's also a doubly-linked list used by the cycle detector,
    but it doesn't hold all objects, only those that need to
    participate in cyclic GC. Objects such as numbers and strings,
    which don't contain references to other objects, don't need
    to be on this list, since they can never be part of a cycle.

    Some other immutable types such as tuples also don't need to
    be on it, even though they contain references, because it's
    impossible to create a cycle consisting entirely of such
    objects. There has to be at least one mutable object in the
    cycle, and the GC will be able to find the cycle via that
    object.
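
    Later CPython versions (2.7/3.1 and up) expose exactly this distinction
    as gc.is_tracked(), so it can be checked directly:

    import gc

    print(gc.is_tracked(42))        # False: an int can never be in a cycle
    print(gc.is_tracked("spam"))    # False: neither can a str
    print(gc.is_tracked([]))        # True: a list is a tracked container
    # Tuples start out tracked; the collector untracks them later if they
    # turn out to hold only atomic, untrackable objects.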

    --
    Greg
    greg, Feb 18, 2008
    #13
  14. Pie Squared

    Guest

    Paul Rubin wrote:
    > The problem here is that with a high allocation rate, you have to GC a lot
    > more often, which typically involves copying live data.


    This is last century's issue. Copying data, RAM to RAM, is nearly free
    using the Intel architecture.

    This short article, http://www.martinrinehart.com/articles/repz.html
    explains why.

    I'd use one int per clock as a rule of thumb for the current copy rate
    in a single-core CPU.
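
    For anyone who would rather measure than argue, a rough check can be
    done from Python itself (modern Python; the copy below is essentially
    one big memcpy, so it approximates raw RAM-to-RAM copy cost):

    import time

    data = bytearray(100 * 1024 * 1024)    # a 100 MB buffer of zeros
    start = time.perf_counter()
    clone = bytes(data)                    # forces a full copy of the buffer
    elapsed = time.perf_counter() - start
    print("copied 100 MB in %.1f ms" % (elapsed * 1000.0))
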
    , Feb 18, 2008
    #14
  15. Pie Squared

    Steve Holden Guest

    wrote:
    >
    > Paul Rubin wrote:
    >> The problem here is that with a high allocation rate, you have to GC a lot
    >> more often, which typically involves copying live data.

    >
    > This is last century's issue. Copying data, RAM to RAM, is nearly free
    > using the Intel architecture.
    >
    > This short article, http://www.martinrinehart.com/articles/repz.html
    > explains why.
    >
    > I'd use one int per clock as a rule of thumb for the current copy rate
    > in a single-core CPU.


    You have a strange idea of "nearly free", and I find your analysis a
    little simplistic. You may take advantage of direct memory access, but
    cycles are still consumed. I presume also (though here my knowledge of
    present-day Intel and AMD architectures gets a little sketchy) that many
    data cache lines could still be invalidated by these moves, and that
    also has a significant effect on performance.

    Extending an integer array from 100 to 150 items is a pretty puny
    operation when you compare it with the amount of data that might need to
    be moved during a compactifying garbage collection of a 20MB Python
    program image.

    Not only that, but all pointers to an object have to be updated when it
    is relocated.

    regards
    Steve
    --
    Steve Holden +1 571 484 6266 +1 800 494 3119
    Holden Web LLC http://www.holdenweb.com/
    Steve Holden, Feb 18, 2008
    #15
  16. Pie Squared

    Guest

    Quoting Steve Holden <>:

    > [...]
    > Not only that, but all pointers to an object have to be updated when it
    > is relocated.


    "Any problem in computer science can be solved by another level of indirection"
    -- David John Wheeler

    ;-)
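
    A sketch of what that indirection looks like for a moving collector:
    keep objects behind a handle table and make every reference a handle,
    so compaction only ever has to update one table slot per object rather
    than the scattered references:

    class HandleTable:
        def __init__(self):
            self._slots = []                  # handle -> current storage

        def new(self, obj):
            self._slots.append(obj)
            return len(self._slots) - 1       # a handle is just a small int

        def get(self, handle):
            return self._slots[handle]

        def relocate(self, handle, new_obj):
            self._slots[handle] = new_obj     # the only "pointer" to update

    table = HandleTable()
    h = table.new({"x": 1})
    ref1, ref2 = h, h                         # many references, all handles
    table.relocate(h, dict(table.get(h)))     # "move" the object elsewhere
    print(table.get(ref1), table.get(ref2))   # both still reach it

    The price, of course, is an extra dereference on every object access.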

    RB
    , Feb 18, 2008
    #16
  17. Pie Squared

    Jeff Schwab Guest

    wrote:
    >
    > Paul Rubin wrote:
    >> The problem here is that with a high allocation rate, you have to GC a lot
    >> more often, which typically involves copying live data.

    >
    > This is last century's issue. Copying data, RAM to RAM, is nearly free
    > using the Intel architecture.


    What's "the Intel architecture?" Do you mean the x86_64 architecture
    that was actually developed by AMD, or x86 for x > some number, or do
    you actually mean IA64?

    >
    > This short article, http://www.martinrinehart.com/articles/repz.html
    > explains why.
    >
    > I'd use one int per clock as a rule of thumb for the current copy rate
    > in a single-core CPU.
    Jeff Schwab, Feb 18, 2008
    #17
  18. Pie Squared

    Guest

    Jeff Schwab wrote:
    > What's "the Intel architecture?" Do you mean the x86_64 architecture
    > that was actually developed by AMD, or x86 for x > some number, or do
    > you actually mean IA64?


    I mean chips of the family that goes back to the 8086 and 8088 chips,
    chips that support the REPZ prefix to the MOVSW instruction.
    , Feb 20, 2008
    #18
  19. Pie Squared

    Paul Rubin Guest

    writes:
    > I mean chips of the family that goes back to the 8086 and 8088 chips,
    > chips that support the REPZ prefix to the MOVSW instruction.


    repz movsw is a pretty lame way to copy data on current x86's.
    Use XMM instead.
    Paul Rubin, Feb 20, 2008
    #19
  20. Pie Squared

    Guest

    Steve Holden wrote:
    > You have a strange idea of "nearly free" ...
    >
    > Extending an integer array from 100 to 150 items is a pretty puny
    > operation when you compare it with the amount of data that might need to
    > be moved during a compactifying garbage collection of a 20MB Python
    > program image.


    20 MB = 5M 32-bit words = 1.25 millis to move half of them on a 2GHz
    machine. Don't know how much a milli costs where you live.
    , Feb 20, 2008
    #20
