RE: Python garbage collector/memory manager behaving strangely

Discussion in 'Python' started by Jadhav, Alok, Sep 17, 2012.

  1. Jadhav, Alok

    I am thinking of running the memory-hungry job in a new subprocess, so
    that its memory is released when the subprocess exits, as described in
    the link below:

    http://stackoverflow.com/questions/1316767/how-can-i-explicitly-free-memory-in-python/1316799#1316799
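
A minimal sketch of that idea, assuming a modern Python 3 (`subprocess.run` with `capture_output` did not exist in 2012). The child script in `CHILD_CODE` and the helper `run_day_in_child` are hypothetical stand-ins for the real per-day parsing job, not code from this thread:

```python
import subprocess
import sys

# Hypothetical child script: stands in for the real memory-hungry parsing
# of one day's file. Everything it allocates lives in the child process.
CHILD_CODE = """
import sys
day = sys.argv[1]
rows = [float(i) for i in range(100000)]  # placeholder workload
print(day, len(rows))
"""


def run_day_in_child(day):
    """Run one day's job in a fresh interpreter and return its output."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD_CODE, day],
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if the child fails
    )
    # When the child exits, the OS reclaims all of its memory, so the
    # parent's address space is not fragmented by the day's work.
    return result.stdout.strip()


if __name__ == "__main__":
    for day in ["2012.08.07", "2012.08.08"]:
        print(run_day_in_child(day))
```

Each day starts in a clean process, so a large allocation on the second day does not depend on what the first day left behind.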

    Regards,
    Alok



    -----Original Message-----
    From: Dave Angel
    Sent: Monday, September 17, 2012 10:13 AM
    To: Jadhav, Alok
    Cc:
    Subject: Re: Python garbage collector/memory manager behaving strangely

    On 09/16/2012 09:07 PM, Jadhav, Alok wrote:
    > Hi Everyone,
    >
    >
    >
    > I have a simple program which reads a large file containing a few
    > million rows, parses each row (`numpy array`), converts it into an
    > array of doubles (`python array`), and later writes it into an
    > `hdf5 file`. I repeat this loop for multiple days. After reading each
    > file, I delete all the objects and call the garbage collector. When I
    > run the program, the first day is parsed without any error, but on the
    > second day I get a `MemoryError`. I monitored the memory usage of my
    > program: during the first day of parsing, memory usage is around
    > **1.5 GB**. When the first day's parsing is finished, memory usage
    > goes down to **50 MB**. Now when the 2nd day starts and I try to read
    > the lines from the file, I get a `MemoryError`. Following is the
    > output of the program.
    >
    > source file extracted at C:\rfadump\au\2012.08.07.txt
    > parsing started
    > current time: 2012-09-16 22:40:16.829000
    > 500000 lines parsed
    > 1000000 lines parsed
    > 1500000 lines parsed
    > 2000000 lines parsed
    > 2500000 lines parsed
    > 3000000 lines parsed
    > 3500000 lines parsed
    > 4000000 lines parsed
    > 4500000 lines parsed
    > 5000000 lines parsed
    > parsing done.
    > end time is 2012-09-16 23:34:19.931000
    > total time elapsed 0:54:03.102000
    > repacking file
    > done
    > > s:\users\aaj\projects\pythonhf\rfadumptohdf.py(132)generateFiles()
    > -> while single_date <= self.end_date:
    > (Pdb) c
    > *** 2012-08-08 ***
    > source file extracted at C:\rfadump\au\2012.08.08.txt
    > cought an exception while generating file for day 2012-08-08.
    > Traceback (most recent call last):
    >   File "rfaDumpToHDF.py", line 175, in generateFile
    >     lines = self.rawfile.read().split('|\n')
    > MemoryError
    >
    > I am quite sure that the Windows Task Manager shows the memory usage
    > as **50 MB** for this process. It looks like the garbage collector or
    > memory manager for Python is not calculating the free memory
    > correctly. There should be a lot of free memory, but it thinks there
    > is not enough.
    >
    >
    >
    > Any idea?
    >
    >
    >
    > Thanks.
    >
    >
    >
    >
    >
    > Alok Jadhav


    Don't blame CPython. You're doing a read() of a large file, which
    produces a single large string, and then splitting it into lines. Why
    not just read it in as lines, in which case the large string isn't
    necessary? Take a look at the readlines() method. Chances are that
    even that is unnecessary, but I can't tell without seeing more of the
    code.

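    That is, replace

    lines = self.rawfile.read().split('|\n')

    with

    lines = self.rawfile.readlines()

    (Note that readlines() keeps the trailing '|\n' on each line, so you
    will need to strip it yourself.)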
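
If even the full list of lines is unnecessary, iterating over the file object directly avoids building it. The following is only a sketch: `iter_records` is a hypothetical helper, and it assumes every record ends with `|` followed by a newline and contains no embedded newlines, which the thread does not confirm.

```python
import io


def iter_records(fileobj):
    """Yield one record per line, without the trailing '|' and newline."""
    for line in fileobj:  # the file object hands out lines lazily
        line = line.rstrip("\r\n")
        if line.endswith("|"):
            line = line[:-1]  # drop the single '|' record terminator
        yield line


if __name__ == "__main__":
    # io.StringIO stands in for the real open file (self.rawfile).
    sample = io.StringIO("a,1|\nb,2|\n")
    for record in iter_records(sample):
        print(record)
```

Unlike `read().split('|\n')`, this does not produce a trailing empty string after the last terminator, and only one line is held in memory at a time.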

    When a single large item is being allocated, it's not enough to have
    sufficient free space; the space also has to be contiguous. After a
    program runs for a while, its space naturally gets more and more
    fragmented. It's the nature of the C runtime, and CPython is stuck
    with it.
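
One way to sidestep the need for a large contiguous block is to bound the size of every allocation. The sketch below reads fixed-size chunks and splits on the exact `'|\n'` delimiter from the original code; `iter_delimited` and its `chunk_size` default are hypothetical choices for illustration, and it assumes no single record is itself very large.

```python
import io


def iter_delimited(fileobj, delimiter="|\n", chunk_size=1 << 20):
    """Yield delimiter-separated records, reading chunk_size characters at a time."""
    pending = ""  # text after the last delimiter seen so far
    while True:
        chunk = fileobj.read(chunk_size)
        if not chunk:
            break
        pending += chunk
        # Everything before the last delimiter is complete; the remainder
        # (possibly ending in half a delimiter) waits for the next chunk.
        *complete, pending = pending.split(delimiter)
        for record in complete:
            yield record
    if pending:
        yield pending  # final record without a trailing delimiter


if __name__ == "__main__":
    # io.StringIO stands in for the real file; a tiny chunk_size is used
    # only to exercise delimiters that straddle chunk boundaries.
    sample = io.StringIO("ab|\ncd|\nef")
    for record in iter_delimited(sample, chunk_size=3):
        print(record)
```

The largest string held at any moment is roughly one chunk plus one record, so the allocator never has to find room for the whole file at once.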



    --

    DaveA

