memory use with regard to large pickle files

Discussion in 'Python' started by Catherine Moroney, Oct 15, 2008.

  1. I'm writing a Python program that reads in a very large
    "pickled" file (consisting of one large dictionary and one
    small one), and parses the results out to several binary and HDF
    files.

    The program works fine, but the memory load is huge. The size of
    the pickle file on disk is about 900 Meg so I would theoretically
    expect my program to consume about twice that (the dictionary
    contained in the pickle file plus its repackaging into other formats),
    but instead my program needs almost 5 Gig of memory to run.
    Am I being unrealistic in my memory expectations?

    I'm running Python 2.5 on a Linux box (Fedora release 7).

    Is there a way to see how much memory is being consumed
    by a single data structure or variable? How can I go about
    debugging this problem?

    Catherine
     
    Catherine Moroney, Oct 15, 2008
    #1

  2. > The program works fine, but the memory load is huge. The size of
    > the pickle file on disk is about 900 Meg so I would theoretically
    > expect my program to consume about twice that (the dictionary
    > contained in the pickle file plus its repackaging into other formats),
    > but instead my program needs almost 5 Gig of memory to run.
    > Am I being unrealistic in my memory expectations?


    I would say so, yes. As you use 5 GiB of memory, it seems you are
    running a 64-bit system.

    On such a system, each pointer takes 8 bytes. In addition,
    each object takes at least 16 bytes (a reference count plus a
    pointer to its type); if it's variable-sized, it takes at least
    24 bytes, plus the actual data in the object.

    On the other hand, in a pickle a pointer takes no space unless it's
    a shared pointer (i.e. a backward reference), which takes as many
    digits as are needed to encode the "object number" in the pickle.
    Each primitive object carries only a single byte of overhead (as
    opposed to 24), which shrinks the data drastically. Of course,
    non-primitive objects take more, as they need to encode the class
    they are instances of.

    > Is there a way to see how much memory is being consumed
    > by a single data structure or variable? How can I go about
    > debugging this problem?


    In Python 2.6, there is the sys.getsizeof function, which reports
    the size of a single object (not of the objects it refers to). For
    earlier versions, the asizeof package gives similar results.
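
To size a whole data structure with sys.getsizeof, the container has to be walked by hand. The following is a minimal sketch, assuming Python 2.6 or later; total_size is a hypothetical helper (not a standard library function), and it only descends into dicts, lists, tuples and sets, so instance attributes and other container types are not counted.

```python
import sys

def total_size(obj, seen=None):
    """Approximate bytes used by obj and everything it contains."""
    if seen is None:
        seen = set()
    if id(obj) in seen:
        return 0  # shared object: already counted once
    seen.add(id(obj))

    size = sys.getsizeof(obj)
    if isinstance(obj, dict):
        for key, value in obj.items():
            size += total_size(key, seen) + total_size(value, seen)
    elif isinstance(obj, (list, tuple, set, frozenset)):
        for item in obj:
            size += total_size(item, seen)
    return size

# Made-up example data to compare shallow and recursive sizes.
sample = {"name": "example", "values": [1.5, 2.5, 3.5]}
print(sys.getsizeof(sample), total_size(sample))
```

Tracking object ids in `seen` keeps shared references from being counted twice, which matters for dictionaries whose values point at the same objects.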

    Regards,
    Martin
     
    Martin v. Löwis, Oct 15, 2008
    #2

  3. In message <gd5mto$q0c$>, Catherine Moroney wrote:

    > I'm writing a Python program that reads in a very large
    > "pickled" file (consisting of one large dictionary and one
    > small one), and parses the results out to several binary and HDF
    > files.


    Job for a database?
     
    Lawrence D'Oliveiro, Oct 19, 2008
    #3
  4. Catherine Moroney wrote:

    > I'm writing a Python program that reads in a very large
    > "pickled" file (consisting of one large dictionary and one
    > small one), and parses the results out to several binary and HDF
    > files.
    >
    > The program works fine, but the memory load is huge. The size of
    > the pickle file on disk is about 900 Meg so I would theoretically
    > expect my program to consume about twice that (the dictionary
    > contained in the pickle file plus its repackaging into other formats),
    > but instead my program needs almost 5 Gig of memory to run.
    > Am I being unrealistic in my memory expectations?
    >
    > I'm running Python 2.5 on a Linux box (Fedora release 7).
    >
    > Is there a way to see how much memory is being consumed
    > by a single data structure or variable? How can I go about
    > debugging this problem?
    >
    > Catherine


    There's always the 'shelve' module.
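
As a sketch of what that could look like: shelve keeps a dictionary-like object on disk, so entries are unpickled one at a time rather than all at once. The example assumes Python 3.4 or later (for the `with` form); the record layout, the file name "records", and the helpers build_shelf and lookup are all hypothetical. Note that shelf keys must be strings.

```python
import os
import shelve
import shutil
import tempfile

def build_shelf(path, items):
    """Write (key, value) pairs to a new shelf, one entry at a time."""
    with shelve.open(path, flag="n") as shelf:
        for key, value in items:
            shelf[str(key)] = value  # shelf keys must be strings

def lookup(path, key):
    """Load a single value without reading the rest of the shelf."""
    with shelve.open(path, flag="r") as shelf:
        return shelf[str(key)]

tmpdir = tempfile.mkdtemp()
shelf_path = os.path.join(tmpdir, "records")

# A generator feeds the shelf so the full data set never sits in memory.
build_shelf(shelf_path, ((i, {"id": i, "value": i * 0.5}) for i in range(1000)))
record = lookup(shelf_path, 42)
print(record)

shutil.rmtree(tmpdir)  # remove the temporary shelf files
```

This only helps if the data can be produced and consumed entry by entry; converting an existing single large pickle still requires loading it once.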
     
    Aaron Brady, Oct 19, 2008
    #4
