multiprocessing and dictionaries

Discussion in 'Python' started by Bjorn Meyer, Jul 12, 2009.

  1. Bjorn Meyer


    I am trying to convert a piece of code that currently uses the threading
    module so that it uses the multiprocessing module instead.

    The way I have it set up, a chunk of code reads a text file and assigns
    multiple values from the file to each dictionary key. I am using locks to
    guard the writes to the dictionary.
    The values are written as follows:
        mydict.setdefault(key, []).append(value)

    The problem that I have run into is that using multiprocessing, the key gets
    set, but the values don't get appended.
    I've even tried the Manager().dict() option, but it doesn't seem to work.

    Is this not supported at this time or am I missing something?

    Thanks in advance.

    Bjorn
     
    Bjorn Meyer, Jul 12, 2009
    #1

  2. >>>>> Bjorn Meyer <> (BM) wrote:

    >BM> I am trying to convert a piece of code that I am using the thread module with
    >BM> to the multiprocessing module.


    >BM> The way that I have it set up is a chunk of code reads a text file and assigns
    >BM> a dictionary key multiple values from the text file. I am using locks to write
    >BM> the values to the dictionary.
    >BM> The way that the values are written is as follows:
    >BM> mydict.setdefault(key, []).append(value)


    >BM> The problem that I have run into is that using multiprocessing, the key gets
    >BM> set, but the values don't get appended.
    >BM> I've even tried the Manager().dict() option, but it doesn't seem to work.


    >BM> Is this not supported at this time or am I missing something?


    I think you should give more information. Try to make a *minimal* program
    that shows the problem and include it in your posting or supply a
    download link.
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
     
    Piet van Oostrum, Jul 13, 2009
    #2

  3. Bjorn Meyer


    On Monday 13 July 2009 01:56:08 Piet van Oostrum wrote:

    > I think you should give more information. Try to make a *minimal* program
    > that shows the problem and include it in your posting or supply a
    > download link.


    Here is what I have been using as a test.
    This pretty much mimics what I am trying to do.
    I put both threading and multiprocessing in the example which shows the output
    that I am looking for.

    #!/usr/bin/env python

    import threading
    from multiprocessing import Manager, Process

    name = ('test1','test2','test3')
    data1 = ('dat1','dat2','dat3')
    data2 = ('datA','datB','datC')

    def thread_test(name,data1,data2, d):
        for nam in name:
            for num in range(0,3):
                d.setdefault(nam, []).append(data1[num])
                d.setdefault(nam, []).append(data2[num])
        print 'Thread test dict:',d

    def multiprocess_test(name,data1,data2, mydict):
        for nam in name:
            for num in range(0,3):
                mydict.setdefault(nam, []).append(data1[num])
                mydict.setdefault(nam, []).append(data2[num])
        print 'Multiprocess test dic:',mydict

    if __name__ == '__main__':
        mgr = Manager()
        md = mgr.dict()
        d = {}

        m = Process(target=multiprocess_test, args=(name,data1,data2,md))
        m.start()
        t = threading.Thread(target=thread_test, args=(name,data1,data2,d))
        t.start()

        m.join()
        t.join()

        print 'Thread test:',d
        print 'Multiprocess test:',md


    Thanks
    Bjorn
     
    Bjorn Meyer, Jul 13, 2009
    #3
  4. >>>>> Bjorn Meyer <> (BM) wrote:

    >BM> Here is what I have been using as a test.
    >BM> This pretty much mimics what I am trying to do.
    >BM> I put both threading and multiprocessing in the example which shows
    >BM> the output that I am looking for.


    >BM> #!/usr/bin/env python


    >BM> import threading
    >BM> from multiprocessing import Manager, Process


    >BM> name = ('test1','test2','test3')
    >BM> data1 = ('dat1','dat2','dat3')
    >BM> data2 = ('datA','datB','datC')


    [snip]

    >BM> def multiprocess_test(name,data1,data2, mydict):
    >BM>     for nam in name:
    >BM>         for num in range(0,3):
    >BM>             mydict.setdefault(nam, []).append(data1[num])
    >BM>             mydict.setdefault(nam, []).append(data2[num])
    >BM>     print 'Multiprocess test dic:',mydict


    I guess what's happening is this:

    mydict.setdefault(nam, []) returns a list, initially an empty list ([]),
    and that list is what gets appended to. However, the list you get back is
    a local copy inside the multiprocess_test process, so the append is not
    reflected in the original list held by the manager, and all your updates
    get lost. You will have to do operations directly on the dictionary
    itself, not on any intermediary objects. With threading the situation is
    different, of course, because everything lives in a single process.
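
    The copy behaviour described above can be seen without starting any
    worker process. This is a minimal sketch, assuming Python 3 (the thread
    itself uses Python 2 syntax); the function name show_proxy_copy is made
    up for illustration. The appended value should show up in the local list
    but not in the value stored by the manager.

```python
from multiprocessing import Manager


def show_proxy_copy():
    """Append to the list returned by setdefault and compare it with the stored value."""
    with Manager() as mgr:
        shared = mgr.dict()
        # setdefault runs inside the manager process; its return value is
        # sent back pickled, so `local` is a copy of the stored list.
        local = shared.setdefault("test1", [])
        local.append("dat1")        # modifies only the local copy
        stored = shared["test1"]    # fetches a fresh copy of what the manager holds
        return local, stored


if __name__ == "__main__":
    local, stored = show_proxy_copy()
    print("local copy:", local)
    print("stored in manager:", stored)
```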

    This works:

    def multiprocess_test(name,data1,data2, mydict):
        print name, data1, data2
        for nam in name:
            for num in range(0,3):
                mydict.setdefault(nam, [])
                mydict[nam] += [data1[num]]
                mydict[nam] += [data2[num]]
        print 'Multiprocess test dic:',mydict
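
    For readers on Python 3, the same idea might be sketched as below. This
    is not the code from the thread: the names fill and run_demo are
    hypothetical, and the read, modify and assign steps are written out
    separately to make clear that only the final assignment to the key
    reaches the manager.

```python
from multiprocessing import Manager, Process


def fill(names, data1, data2, shared):
    """Runs in the child process: update each entry by assigning to the key."""
    for nam in names:
        for a, b in zip(data1, data2):
            values = shared.get(nam, [])  # local copy of the current list
            values += [a, b]              # change the copy
            shared[nam] = values          # assignment sends the new list to the manager


def run_demo():
    names = ("test1", "test2", "test3")
    data1 = ("dat1", "dat2", "dat3")
    data2 = ("datA", "datB", "datC")
    with Manager() as mgr:
        shared = mgr.dict()
        p = Process(target=fill, args=(names, data1, data2, shared))
        p.start()
        p.join()
        # Take a plain dict snapshot before the manager shuts down.
        return dict(shared)


if __name__ == "__main__":
    print("Multiprocess test:", run_demo())
```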

    If you have more than one process operating on the dictionary
    simultaneously, beware of race conditions: mydict[nam] += [...] is a
    separate read and write, so another process can update the key in between.
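
    One way to guard against that race is to hold a lock around the whole
    read-modify-write. The following is a sketch under assumptions, not part
    of the original test: it uses Python 3, a lock created with the manager's
    Lock() method, and the hypothetical names add_values and run_workers.

```python
from multiprocessing import Manager, Process


def add_values(shared, lock, key, values):
    """Append each value to shared[key], holding the lock for each update."""
    for v in values:
        with lock:
            # Read, change and write back as one step with respect to
            # every other process that uses the same lock.
            current = shared.get(key, [])
            current.append(v)
            shared[key] = current


def run_workers(n_workers=4, per_worker=25):
    with Manager() as mgr:
        shared = mgr.dict()
        lock = mgr.Lock()  # lock proxy that can be passed to the workers
        workers = []
        for i in range(n_workers):
            chunk = range(i * per_worker, (i + 1) * per_worker)
            workers.append(
                Process(target=add_values, args=(shared, lock, "test1", chunk))
            )
        for w in workers:
            w.start()
        for w in workers:
            w.join()
        return shared["test1"]


if __name__ == "__main__":
    values = run_workers()
    print(len(values), "values stored")
```

    The order of the stored values depends on how the workers interleave, but
    with the lock in place none of the appends should be lost.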
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
     
    Piet van Oostrum, Jul 13, 2009
    #4
  5. Bjorn Meyer


    On Monday 13 July 2009 13:12:18 Piet van Oostrum wrote:

    > I guess what's happening is this:
    >
    > d.setdefault(nam, []) returns a list, initially an empty list ([]). This
    > list gets appended to. However, this list is a local list in the
    > multi-process_test Process, therefore the result is not reflected in the
    > original list inside the manager. Therefore all your updates get lost.
    > You will have to do operations directly on the dictionary itself, not on
    > any intermediary objects. Of course with the threading the situation is
    > different as all operations are local.
    >
    > This works:
    >
    > def multiprocess_test(name,data1,data2, mydict):
    >     print name, data1, data2
    >     for nam in name:
    >         for num in range(0,3):
    >             mydict.setdefault(nam, [])
    >             mydict[nam] += [data1[num]]
    >             mydict[nam] += [data2[num]]
    >     print 'Multiprocess test dic:',mydict
    >
    > If you have more than one process operating on the dictionary
    > simultaneously you have to beware of race conditions!!


    Excellent. That works perfectly.

    Thank you for your response Piet.

    Bjorn
     
    Bjorn Meyer, Jul 14, 2009
    #5
