Multiprocessing taking too much time

Discussion in 'Python' started by Shailendra, Jul 29, 2010.

  1. Shailendra

    Shailendra Guest

    Hi All,
    I have the following situation.
    ==================PSEUDO CODE START==================
    class holds_big_array:
        big_array  # holds a big array

        def get_some_element(self, cond):
            # return some data from the big array
    ==================PSEUDO CODE END====================
    I wanted to use the multiprocessing module to parallelise the calls to
    "get_some_element". I used the following kind of code:

    ==================PSEUDO CODE START==================
    pool = Pool(processes=2)
    holder = holds_big_array()  # class instantiation

    def callback_f(result):
        # do something with result

    loop many times:
        pool.apply_async(holder.get_some_element, args, callback=callback_f)
    pool.close()
    pool.join()
    ==================PSEUDO CODE END====================
    Note: I had to do something extra to enable the instance method to be pickled...

    I tested this with a smaller-than-realistic big_array. My parallel
    version runs much slower than the normal serial version (7-8 min
    versus 10-20 sec). I was wondering what the reason could be. Does it
    have something to do with it being an instance method, so that some
    locking makes the other process wait for the locks? Any idea how to
    trace where the program is spending its time?

    Let me know if the information given is inadequate.

    Thanks in advance.
    Shailendra Vikas
     
    Shailendra, Jul 29, 2010
    #1

  2. John Nagle

    John Nagle Guest

    It's hard to tell from your pseudocode, but it looks like each
    access to the "big array" involves calling another process.

    Calling a function in another process is done by creating an
    object to contain the request, running it through "pickle" to convert
    it to a stream of bytes, sending the stream of bytes through a socket or
    pipe to the other process, running the byte stream through "unpickle" to
    create an object like the original one, but in a different process, and
    calling a function on the newly created object in the receiving process.
    This entire sequence has to be done again in reverse
    to get a reply back.
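
A rough sketch of that round trip, with both ends run inside a single process purely for illustration. The names `BIG_ARRAY`, `get_element` and `round_trip`, and the two one-way pipes, are assumptions made up for this example; a real Pool does this work internally with its own queues and worker processes.

```python
import pickle
from multiprocessing import Pipe

BIG_ARRAY = list(range(1000))  # hypothetical stand-in for the big array


def get_element(index):
    """The function the caller wants executed on the worker side."""
    return BIG_ARRAY[index]


def round_trip(func, args):
    """Walk through one request/reply cycle, step by step."""
    # Pipe(duplex=False) returns (receiving end, sending end).
    request_reader, request_writer = Pipe(duplex=False)
    reply_reader, reply_writer = Pipe(duplex=False)

    # Caller side: wrap the request in an object and pickle it to bytes.
    request_writer.send_bytes(pickle.dumps((func, args)))

    # Worker side: read the bytes, unpickle them, and make the call.
    received_func, received_args = pickle.loads(request_reader.recv_bytes())
    result = received_func(*received_args)

    # The same steps in reverse carry the reply back to the caller.
    reply_writer.send_bytes(pickle.dumps(result))
    reply = pickle.loads(reply_reader.recv_bytes())

    for conn in (request_reader, request_writer, reply_reader, reply_writer):
        conn.close()
    return reply


answer = round_trip(get_element, (7,))
```

Every one of those steps is paid on every single call, which is why the size of each task matters so much.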

    This is hundreds of times slower than a call to a local function.
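
One way to check the overhead on your own machine is to time the two paths side by side. This is only a sketch under assumptions: `HoldsBigArray` is a hypothetical stand-in for the class in the question, and the pickling is done in-process rather than across a real pipe. It relies on the fact that, in Python 3, pickling a bound method also pickles the instance it is bound to, so the whole array is serialized with every request.

```python
import pickle
import timeit


class HoldsBigArray:
    """Hypothetical stand-in for the class from the question."""

    def __init__(self, size):
        self.big_array = list(range(size))

    def get_some_element(self, index):
        return self.big_array[index]


holder = HoldsBigArray(100_000)


def local_call():
    return holder.get_some_element(42)


def simulated_remote_call():
    # The bound method drags `holder`, and so the big array, into the pickle.
    request = pickle.dumps((holder.get_some_element, (42,)))
    func, args = pickle.loads(request)
    reply = pickle.dumps(func(*args))
    return pickle.loads(reply)


request_size = len(pickle.dumps((holder.get_some_element, (42,))))
local_seconds = timeit.timeit(local_call, number=20)
remote_seconds = timeit.timeit(simulated_remote_call, number=20)

print("request size in bytes:", request_size)
print("20 local calls:    ", local_seconds, "seconds")
print("20 simulated calls:", remote_seconds, "seconds")
```

The exact ratio will depend on the array size and the machine, so treat the printed numbers as a measurement to run rather than a fixed figure.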

    The "multiprocessing module" is not a replacement for thread-level
    parallelism. It looks like it is, but it isn't. It's only useful for
    big tasks which require large amounts of computation and little
    interprocess communication. Appropriately-sized tasks to send out
    to another process are things like "parse large web page" or
    "compress video file", not "access element of array".

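A minimal sketch of what more coarsely sized tasks could look like for the array case, assuming each worker can build or load its own copy of the array once. The names `init_worker`, `lookup_batch` and `make_batches`, the batch size, and the squared-integer data are all hypothetical. The idea is to send a few large batches of lookups instead of one task per element, so the pickling cost is paid per batch.

```python
from multiprocessing import Pool

_big_array = None  # per-process copy, filled in by init_worker


def init_worker(size):
    """Build the big array once in each worker instead of sending it per task."""
    global _big_array
    _big_array = [i * i for i in range(size)]  # hypothetical data


def lookup_batch(indices):
    """One coarse task: answer many lookups per round trip."""
    return [_big_array[i] for i in indices]


def make_batches(items, batch_size):
    """Split a list into consecutive slices of at most batch_size items."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


def main():
    indices = list(range(0, 100_000, 7))
    batches = make_batches(indices, 2_000)
    # The initializer runs once per worker process, not once per task.
    with Pool(processes=2, initializer=init_worker, initargs=(100_000,)) as pool:
        results = pool.map(lookup_batch, batches)
    flat = [value for batch in results for value in batch]
    print(len(flat), "lookups answered in", len(batches), "tasks")


if __name__ == "__main__":
    main()
```

Whether this beats the serial version still depends on how much real computation each lookup involves; for plain indexing the serial loop may well stay faster.
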
    John Nagle
     
    John Nagle, Jul 29, 2010
    #2
