Trying to understand the memory occupation of big lists

Discussion in 'Python' started by Michele Simionato, May 3, 2013.

  1. I have a memory leak in a program using big arrays. While trying to debug it I came across the memory_profiler module, and then I discovered something that surprises me. Please consider the following script:

    $ cat memtest.py
    import gc
    from memory_profiler import profile


    @profile
    def test1():
        a = [0] * 1024 * 1024
        del a
        gc.collect() # nothing change if I comment this


    @profile
    def test2():
        for i in range(10):
            a = [0] * 1024 * 1024
            del a
        gc.collect() # nothing change if I comment this


    test1()
    test2()

    Here is its output on a 64-bit Linux machine:

    $ python memtest.py
    Filename: memtest.py

    Line #    Mem usage    Increment   Line Contents
    ================================================
         5                             @profile
         6     9.250 MB     0.000 MB   def test1():
         7    17.246 MB     7.996 MB       a = [0] * 1024 * 1024
         8     9.258 MB    -7.988 MB       del a
         9     9.258 MB     0.000 MB       gc.collect() # nothing change if I comment this


    Filename: memtest.py

    Line #    Mem usage    Increment   Line Contents
    ================================================
        12                             @profile
        13     9.262 MB     0.000 MB   def test2():
        14    17.270 MB     8.008 MB       for i in range(10):
        15    17.270 MB     0.000 MB           a = [0] * 1024 * 1024
        16    17.270 MB     0.000 MB           del a
        17    17.270 MB     0.000 MB       gc.collect() # nothing change if I comment this

    In the first case the memory is released (although, strangely, not
    completely: 7.996 != 7.988); in the second case it is not. Why is that?
    I expected gc.collect() to free the memory, but it has no effect at all.
    In the second case there are 10 lists of 8 MB each, so 80 MB are
    allocated and 72 released, but 8 MB apparently remain. It does not look
    like a problem with memory_profiler, since I observe the same thing with
    top.

    Any ideas?
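
    The 8 MB per list quoted above can be checked roughly. This is a minimal sketch, assuming CPython on a 64-bit build, where a list stores one 8-byte pointer per element, so 1024 * 1024 elements need about 8 MiB for the pointer array alone. The helper name list_pointer_bytes is hypothetical and not part of the original script.

```python
import struct
import sys


def list_pointer_bytes(n):
    """Return (measured, expected) sizes in bytes for a list of n zeros.

    measured comes from sys.getsizeof, which counts the list object and
    its pointer array but not the elements; expected is n pointers.
    """
    pointer_size = struct.calcsize("P")  # 8 on a 64-bit build
    measured = sys.getsizeof([0] * n)
    expected = n * pointer_size
    return measured, expected


if __name__ == "__main__":
    measured, expected = list_pointer_bytes(1024 * 1024)
    print("measured: %.3f MiB" % (measured / 2.0 ** 20))
    print("expected: %.3f MiB" % (expected / 2.0 ** 20))
```

    The small difference between the two numbers is the fixed header of the list object itself.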
     
    Michele Simionato, May 3, 2013

  2. Dave Angel

    On 05/03/2013 07:24 AM, Michele Simionato wrote:
    > I have a memory leak in a program using big arrays.


    Actually, big lists. Python also has arrays, and they're entirely
    different.



    I haven't played with profile, so my comments are limited to the code
    itself.

    gc.collect() has nothing to do in either of these functions, since the
    memory has already been released by the ref-count logic. gc.collect() is
    useful only when there is a circular reference. If you want to see
    gc.collect() in action, create two large objects that reference each
    other and a small one that references one of them. del the first two
    and then the third, and the memory cannot be released since the ref
    counts are nonzero. Then do a gc.collect(), which will realize that you
    have no way to reference either of the two large objects.
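
    The circular-reference case described above can be sketched as follows. This is a simplified illustration, not the original poster's code: it uses only the two mutually referencing objects (no third small one), and the names Big and demo_cycle are hypothetical. A weak reference is used as a probe so that the check itself does not keep the object alive.

```python
import gc
import weakref


class Big(object):
    """Hypothetical object holding a large list and a link to a peer."""

    def __init__(self, n):
        self.payload = [0] * n
        self.peer = None


def demo_cycle(n=1024 * 1024):
    """Return (alive_before, alive_after, found) around a gc.collect()."""
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector out of the demonstration
    try:
        x = Big(n)
        y = Big(n)
        x.peer = y  # x and y now reference each other
        y.peer = x
        probe = weakref.ref(x)  # observes x without keeping it alive
        del x, y  # ref counts stay nonzero because of the cycle
        alive_before = probe() is not None
        found = gc.collect()  # the cycle detector finds the unreachable pair
        alive_after = probe() is not None
    finally:
        if was_enabled:
            gc.enable()
    return alive_before, alive_after, found


if __name__ == "__main__":
    print(demo_cycle())
```

    Without the cycle (for example, if the two peer assignments are removed), the objects are freed as soon as the names are deleted and gc.collect() has nothing left to find, which matches the behaviour of the lists in the original script.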

    I suspect that profile is only looking at the memory from the point of
    view of the OS. No block of memory can be released to the OS unless
    it's entirely freed. My guess is that in the second case the variable i
    (or some other internal one relating to the loop) is in the same block
    with one of those lists. The point is that CPython uses the C malloc()
    and free() functions, and they have their own limitations. Most of the
    time when free() is called, the memory is NOT released to the OS, but is
    still made available within Python for future use.
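
    To watch the OS-level view mentioned above without a profiler, one option is to sample the resident set size of the process directly. This is a rough sketch that assumes a Linux-style /proc/self/statm (second field is the resident size in pages); on other systems it simply returns None. The function names current_rss_mb and rss_around_allocation are hypothetical.

```python
import os


def current_rss_mb():
    """Return this process's resident set size in MB, or None if unknown.

    Assumes a Linux-style /proc/self/statm whose second field is the
    resident set size in pages.
    """
    try:
        with open("/proc/self/statm") as f:
            resident_pages = int(f.read().split()[1])
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (OSError, IOError, ValueError, IndexError, AttributeError):
        return None
    return resident_pages * page_size / (1024.0 * 1024.0)


def rss_around_allocation(n=1024 * 1024):
    """Sample RSS before, during and after the lifetime of one big list."""
    before = current_rss_mb()
    a = [0] * n
    during = current_rss_mb()
    del a  # the ref count drops to zero, so the list is freed immediately
    after = current_rss_mb()
    return before, during, after


if __name__ == "__main__":
    print(rss_around_allocation())
```

    Whether the last sample drops back to the first depends on what the C library's allocator decides to return to the OS, so the numbers can differ between runs, platforms and Python versions.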


    --
    DaveA
     
    Dave Angel, May 3, 2013

  3. Maarten

    I made a few changes:

    import gc
    from memory_profiler import profile

    @profile
    def test1():
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        a = [0] * 1024**2
        del a
        gc.collect() # nothing change if I comment this


    @profile
    def test2():
        for i in range(10):
            a = [0] * 1024**2
            del a
        del i
        gc.collect() # nothing change if I comment this


    test1()
    test2()

    # end of code

    Output:

    Filename: profile.py

    Line #    Mem usage    Increment   Line Contents
    ================================================
         5                             @profile
         6     8.688 MB     0.000 MB   def test1():
         7    16.691 MB     8.004 MB       a = [0] * 1024**2
         8     8.688 MB    -8.004 MB       del a
         9    16.680 MB     7.992 MB       a = [0] * 1024**2
        10    16.680 MB     0.000 MB       del a
        11    16.680 MB     0.000 MB       a = [0] * 1024**2
        12    16.680 MB     0.000 MB       del a
        13    16.680 MB     0.000 MB       a = [0] * 1024**2
        14    16.680 MB     0.000 MB       del a
        15    16.680 MB     0.000 MB       a = [0] * 1024**2
        16    16.680 MB     0.000 MB       del a
        17    16.680 MB     0.000 MB       a = [0] * 1024**2
        18    16.680 MB     0.000 MB       del a
        19    16.680 MB     0.000 MB       a = [0] * 1024**2
        20    16.680 MB     0.000 MB       del a
        21    16.680 MB     0.000 MB       a = [0] * 1024**2
        22    16.680 MB     0.000 MB       del a
        23    16.680 MB     0.000 MB       a = [0] * 1024**2
        24    16.680 MB     0.000 MB       del a
        25    16.680 MB     0.000 MB       a = [0] * 1024**2
        26    16.680 MB     0.000 MB       del a
        27    16.680 MB     0.000 MB       gc.collect() # nothing change if I comment this


    Filename: profile.py

    Line #    Mem usage    Increment   Line Contents
    ================================================
        30                             @profile
        31    16.691 MB     0.000 MB   def test2():
        32    16.691 MB     0.000 MB       for i in range(10):
        33    16.691 MB     0.000 MB           a = [0] * 1024**2
        34    16.691 MB     0.000 MB           del a
        35    16.691 MB     0.000 MB       del i
        36    16.691 MB     0.000 MB       gc.collect() # nothing change if I comment this

    If I make the two functions identical, they behave the same.

    Maarten
     
    Maarten, May 3, 2013
