multiprocessing eats memory

Discussion in 'Python' started by Max Ivanov, Sep 25, 2008.

  1. Max Ivanov

    Max Ivanov Guest

    I'm playing with the pyprocessing module and found that it eats lots
    of memory. I've made a small test case to show it. I pass ~45 MB of
    data to worker processes and then get it back slightly modified. At
    any time there should be no more than two copies of the data in the
    main process (the original data and the result). I run it on an
    8-core server and top shows me that the main process eats ~220 MB and
    the worker processes eat 90-150 MB each. Isn't that too much?

    Small test-case is uploaded to pastebin: http://pastebin.ca/1210523
     
    Max Ivanov, Sep 25, 2008
    #1

  2. On Sep 25, 8:40 am, "Max Ivanov" <> wrote:

    > At any time in main process there are shouldn't be no more than two copies of data
    > (one original data and one result).


    From the looks of it you are storing a lot of references to various
    copies of your data via the asyncs set.
     
    Istvan Albert, Sep 26, 2008
    #2

  3. Max Ivanov

    redbaron Guest

    On Sep 26, 04:20, Istvan Albert <> wrote:
    > On Sep 25, 8:40 am, "Max Ivanov" <> wrote:
    >
    > > At any time in main process there are shouldn't be no more than two copies of data
    > > (one original data and one result).

    >
    > From the looks of it you are storing a lots of references to various
    > copies of your data via the async set.


    How could I avoid storing them? I need something that lets me check
    whether a result is ready and retrieve it if it is. I can't see a way
    to achieve the same thing without storing the asyncs set.
     
    redbaron, Sep 26, 2008
    #3
  4. Max Ivanov

    MRAB Guest

    On Sep 26, 9:52 am, redbaron <> wrote:
    > On Sep 26, 04:20, Istvan Albert <> wrote:
    >
    > > On Sep 25, 8:40 am, "Max Ivanov" <> wrote:

    >
    > > > At any time in main process there are shouldn't be no more than two copies of data
    > > > (one original data and one result).

    >
    > > From the looks of it you are storing a lots of references to various
    > > copies of your data via the async set.

    >
    > How could I avoid of storing them? I need something to check does it
    > ready or not and retrieve results if ready. I couldn't see the way to
    > achieve same result without storing asyncs set.


    You could give each worker process an ID and then have them put the ID
    into a queue to signal to the main process when finished.

    BTW, your test-case modifies the asyncs set while iterating over it,
    which is a bad idea.
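
    The point about iteration can be sketched as follows. This is only a
    rough illustration, not the code from the pastebin: `work` and
    `collect` are made-up names, and ThreadPool is used as a stand-in so
    the sketch runs anywhere. It exposes the same apply_async(), ready()
    and get() calls as a process pool. The loop walks a snapshot of the
    set, so removing a finished handle is safe and also drops the
    reference to its result.

```python
import time
from multiprocessing.pool import ThreadPool


def work(chunk):
    # Hypothetical stand-in for the real computation:
    # return a slightly modified copy of the input.
    return [x + 1 for x in chunk]


def collect(chunks, workers=2):
    results = []
    with ThreadPool(workers) as pool:
        pending = {pool.apply_async(work, (chunk,)) for chunk in chunks}
        while pending:
            # Iterate over a snapshot (list) so the set itself can be
            # modified inside the loop.
            for handle in list(pending):
                if handle.ready():
                    results.append(handle.get())
                    # Forget the handle once its result is consumed.
                    pending.remove(handle)
            if pending:
                time.sleep(0.01)  # avoid a busy loop while waiting
    return results
```

    Results arrive in completion order, so sort or tag them if the
    original order matters.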
     
    MRAB, Sep 26, 2008
    #4
  5. Max Ivanov

    redbaron Guest

    On Sep 26, 17:03, MRAB <> wrote:
    > On Sep 26, 9:52 am, redbaron <> wrote:
    >
    > > On Sep 26, 04:20, Istvan Albert <> wrote:

    >
    > > > On Sep 25, 8:40 am, "Max Ivanov" <> wrote:

    >
    > > > > At any time in main process there are shouldn't be no more than two copies of data
    > > > > (one original data and one result).

    >
    > > > From the looks of it you are storing a lots of references to various
    > > > copies of your data via the async set.

    >
    > > How could I avoid of storing them? I need something to check does it
    > > ready or not and retrieve results if ready. I couldn't see the way to
    > > achieve same result without storing asyncs set.

    >
    > You could give each worker process an ID and then have them put the ID
    > into a queue to signal to the main process when finished.

    And how could I retrieve the result from a worker process without the
    async object?

    >
    > BTW, your test-case modifies the asyncs set while iterating over it,
    > which is a bad idea.

    My fault, there was list(asyncs) originally.
     
    redbaron, Sep 26, 2008
    #5
  6. On Sep 26, 4:52 am, redbaron <> wrote:

    > How could I avoid of storing them? I need something to check does it
    > ready or not and retrieve results if ready. I couldn't see the way to
    > achieve same result without storing asyncs set.


    It all depends on what you are trying to do. The issue that you
    originally brought up is that of memory consumption.

    When processing data in parallel, memory use grows with the number of
    datasets you are processing at any given time. If you need to reduce
    memory use, then you need to start fewer processes and use some
    mechanism to distribute the work to them as they become free (see the
    recommendation that uses Queues).
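
    A rough sketch of that idea, under some assumptions: a small fixed
    number of workers pull chunks from a task queue, and the main process
    hands out a new chunk only when a result comes back, so at most
    n_workers chunks are in flight. The names `worker`, `process_all` and
    the `ctx` parameter are invented for this illustration, and the
    doubling is a placeholder for the real computation.

```python
import itertools
import multiprocessing as mp


def worker(tasks, results):
    # Pull (task_id, chunk) pairs until the None sentinel arrives.
    for task_id, chunk in iter(tasks.get, None):
        results.put((task_id, [x * 2 for x in chunk]))  # placeholder work


def process_all(chunks, n_workers=2, ctx=None):
    # ctx is an optional multiprocessing context (assumption of this
    # sketch); by default the platform's start method is used.
    if ctx is None:
        ctx = mp.get_context()
    tasks, results = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=worker, args=(tasks, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()

    todo = iter(enumerate(chunks))
    out, in_flight = {}, 0
    # Prime the queue with one chunk per worker.
    for task in itertools.islice(todo, n_workers):
        tasks.put(task)
        in_flight += 1
    while in_flight:
        task_id, result = results.get()  # blocks until a worker finishes
        out[task_id] = result
        in_flight -= 1
        # A worker is free now: hand out the next chunk, if any is left.
        for task in itertools.islice(todo, 1):
            tasks.put(task)
            in_flight += 1

    for _ in procs:
        tasks.put(None)  # one sentinel per worker
    for p in procs:
        p.join()
    return [out[i] for i in range(len(out))]
```

    On platforms that spawn fresh interpreters, call process_all() from
    under an `if __name__ == "__main__":` guard. The sketch does not
    handle a worker that dies; in that case results.get() would block
    forever.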
     
    Istvan Albert, Sep 27, 2008
    #6
  7. Max Ivanov

    redbaron Guest

    > When processing data in parallel you will use up as much memory as
    > many datasets you are processing at any given time.

    The worker processes eat 2-4 times more memory than I pass to them.


    > If you need to
    > reduce memory use then you need to start fewer processes and use some
    > mechanism to distribute the work on them as they become free. (see
    > recommendation that uses Queues)

    I don't understand how I could use a Queue here. If a worker process
    finishes computing, it puts its ID into the Queue, and in the main
    process I retrieve that ID. But how could I then retrieve the result
    from the worker process?
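
    One possible answer, sketched under assumptions rather than taken
    from the thread: the worker can put the result on the queue together
    with its ID, so the main process needs no async object at all. The
    names `worker`, `run` and the `ctx` parameter are invented for this
    illustration, and the increment is a placeholder for the real work.

```python
import multiprocessing as mp


def worker(worker_id, chunk, done):
    result = [x + 1 for x in chunk]  # placeholder computation
    # Send the ID and the result together; the queue pickles the tuple
    # and ships it back to the main process.
    done.put((worker_id, result))


def run(chunks, ctx=None):
    # ctx is an optional multiprocessing context (assumption of this
    # sketch); by default the platform's start method is used.
    if ctx is None:
        ctx = mp.get_context()
    done = ctx.Queue()
    procs = {i: ctx.Process(target=worker, args=(i, chunk, done))
             for i, chunk in enumerate(chunks)}
    for p in procs.values():
        p.start()

    results = {}
    for _ in range(len(procs)):
        worker_id, result = done.get()  # blocks until some worker is done
        results[worker_id] = result
        # Read the result first, then join: the worker cannot exit until
        # its queued data has been flushed.
        procs.pop(worker_id).join()
    return results
```

    This starts one process per chunk, which keeps the sketch short but
    does nothing to limit memory; with many chunks a fixed set of workers
    would be the safer choice.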
     
    redbaron, Sep 27, 2008
    #7
