multiprocessing module (PEP 371)

Discussion in 'Python' started by sturlamolden, Jun 4, 2008.

  1. sturlamolden

    sturlamolden Guest

    I sometimes read python-dev, but never contribute. So I'll post my
    rant here instead.

    I completely support adding this module to the standard lib. Get it in
    as soon as possible, regardless of PEP deadlines or whatever.

    I don't see pyprocessing as a drop-in replacement for the threading
    module. Multi-threading and multi-processing code tend to be
    different, unless something like mutable objects in shared memory is
    used as well (cf. Python Shared Objects). If this limitation can
    educate Python programmers to use queues instead of locks and mutable
    objects, even multi-threaded Python programs may actually benefit.
    Some API differences between threading and multiprocessing do not
    matter. Programmers should not consider processes as a drop-in
    replacement for threads.
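
To make the queues-instead-of-locks point concrete, here is a minimal sketch. It is not taken from pyprocessing or the PEP, and the names `worker` and `square_all` are invented for illustration. The worker threads share no mutable objects and take no locks of their own; they communicate only through two queues.

```python
import queue
import threading


def worker(tasks, results):
    """Pull numbers from tasks until a None sentinel arrives; push (n, n*n) to results."""
    while True:
        item = tasks.get()
        if item is None:  # sentinel: no more work for this worker
            break
        results.put((item, item * item))


def square_all(numbers, n_workers=4):
    """Square every number using worker threads that only talk through queues."""
    numbers = list(numbers)
    tasks = queue.Queue()
    results = queue.Queue()
    threads = [
        threading.Thread(target=worker, args=(tasks, results))
        for _ in range(n_workers)
    ]
    for t in threads:
        t.start()
    for n in numbers:
        tasks.put(n)
    for _ in threads:  # one sentinel per worker
        tasks.put(None)
    # Exactly one result is produced per input, so collect that many.
    collected = dict(results.get() for _ in numbers)
    for t in threads:
        t.join()
    return collected
```

Because the workers only ever call `get` and `put`, code in this style should be easier to move from threads to processes than code built around locks and shared mutable objects.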

    One limitation not discussed on python-dev is the lack of fork on
    Win32. This makes the pyprocessing module particularly inefficient at
    creating processes on this platform, as it depends on serializing
    (pickling and unpickling) a lot of Python objects. Even a non-COW
    fork would be preferable. I strongly suggest that something be done
    to add support for os.fork to Python on Windows. Either create a full
    COW fork using ZwCreateProcess (ntdll.dll does support COW forking,
    but the Win32 API does not expose it), or do what Cygwin does and
    fork a process without COW. Although a non-COW fork a la Cygwin is
    not as efficient as a fork on Linux/FreeBSD/Unix, it is still better
    than what pyprocessing is doing.
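
A rough sketch of the serialization cost described above. This is not pyprocessing's actual code, and `payload_size` and `round_trip` are hypothetical helper names. Without fork, whatever state the child needs has to be pickled in the parent and unpickled in the child, so the work grows with the size of that state.

```python
import pickle


def payload_size(state):
    """Return the number of bytes needed to pickle state."""
    return len(pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL))


def round_trip(state):
    """Pickle and unpickle state, as a stand-in for handing it to a fresh process."""
    return pickle.loads(pickle.dumps(state, protocol=pickle.HIGHEST_PROTOCOL))


# Hypothetical parent-process state of two very different sizes.
small_state = {"n": 1}
large_state = {"table": list(range(100_000))}
```

Comparing `payload_size(small_state)` with `payload_size(large_state)` gives a feel for how much more has to be copied as the state grows. A fork, by contrast, would give the child the parent's memory without this explicit serialization step.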
     
    sturlamolden, Jun 4, 2008
    #1

  2. Paul Boddie

    Paul Boddie Guest

    On 4 Jun, 20:06, sturlamolden wrote:
    >
    > Even a non-COW fork
    > would be preferable. I strongly suggest that something be done to add
    > support for os.fork to Python on Windows. Either create a full COW
    > fork using ZwCreateProcess (ntdll.dll does support COW forking, but
    > the Win32 API does not expose it), or do what Cygwin does and fork
    > a process without COW. Although a non-COW fork a la Cygwin is not
    > as efficient as a fork on Linux/FreeBSD/Unix, it is still better than
    > what pyprocessing is doing.


    You seem to know more about this matter than the average person, I
    would wager, so it might be an idea if you more than "strongly
    suggest" something. ;-) I've looked at this situation briefly, I've
    seen the different Cygwin-based techniques, and I've even gone as far
    as investigating whether it's possible to write the necessary code using
    the mingw32 stuff, although I don't think it actually worked when I
    tested the executable on Windows. COW (copy-on-write, for those still
    thinking that we're talking about dairy products) would be pretty
    desirable if it's feasible, though.

    Having said all this, I don't care about Windows myself, and my own
    contribution to the collection of available libraries in this domain
    has never been targeted at standard library adoption (nor thread API
    compatibility) and thus has no need to run on Windows without Cygwin.

    Paul
     
    Paul Boddie, Jun 4, 2008
    #2

  3. sturlamolden

    sturlamolden Guest

    On Jun 4, 11:29 pm, Paul Boddie wrote:

    > tested the executable on Windows. COW (copy-on-write, for those still
    > thinking that we're talking about dairy products) would be pretty
    > desirable if it's feasible, though.


    There is a well-known C++ implementation of COW fork on Windows, which
    I have slightly modified and ported to C. But as the new WDK (Windows
    Driver Kit) headers are full of syntax errors, the compiler chokes on
    it. :( I am seriously considering re-implementing the whole COW fork
    in pure Python using ctypes.

    If you pass NULL as the section handle to ZwCreateProcess (or
    NtCreateProcess) you do get a rudimentary COW fork. But the new
    process image has no context and no active threads. The NT kernel is
    designed to support several subsystems. Both the OS/2 and SUA
    subsystems provide a functional COW fork, but the Win32 subsystem
    does not expose the functionality. I honestly don't understand why,
    but maybe it is backwards compatibility that prevents it (its backlog
    goes back to DOS, in which forking was impossible due to
    single-tasking).

    But anyway ... what I am trying to say is that pyprocessing is
    somewhat inefficient (and limited) on Windows due to the lack of a
    fork (COW or not).
     
    sturlamolden, Jun 4, 2008
    #3
  4. Christian Heimes

    sturlamolden wrote:
    > There is a well-known C++ implementation of COW fork on Windows, which
    > I have slightly modified and ported to C. But as the new WDK (Windows
    > Driver Kit) headers are full of syntax errors, the compiler chokes on
    > it. :( I am seriously considering re-implementing the whole COW fork
    > in pure Python using ctypes.


    Can you provide a C implementation that compiles under VS 2008? Python
    2.6 and 3.0 are using my new VS 2008 build system and we have dropped
    support for 9x, ME and NT4. If you can provide us with an implementation
    we *might* consider using it.

    Christian
     
    Christian Heimes, Jun 4, 2008
    #4
  5. pataphor

    pataphor Guest

    In article
    <877a5774-d3cc-49d3-bb64-5cab8505a419@m3g2000hsc.googlegroups.com>,
    says...

    > I don't see pyprocessing as a drop-in replacement for the threading
    > module. Multi-threading and multi-processing code tend to be
    > different, unless something like mutable objects in shared memory is
    > used as well (cf. Python Shared Objects). If this limitation can
    > educate Python programmers to use queues instead of locks and mutable
    > objects, even multi-threaded Python programs may actually benefit.
    > Some API differences between threading and multiprocessing do not
    > matter. Programmers should not consider processes as a drop-in
    > replacement for threads.


    This is probably not very central to the main intention of your post,
    but I see a terminology problem coming up here. It is possible for
    Python objects to share a reference to some other object. This has
    nothing to do with threads or processes, although it can be used as a
    *mechanism* for threads and processes to share data. Another mechanism
    would be some copying and synchronization scheme, which is what posh
    seems to do. Or maybe not, I haven't used posh yet, I just read some
    docs (and I already hate the "if process.fork():" idiom, what are they
    trying to do, reintroduce C-style assignment and switching?).

    By the way, I haven't done much thread and process programming, but the
    things I *have* done often combine threads and processes, like starting
    a console-oriented program in a background process, redirecting the IO
    and communicating with it using an event loop in a thread. It gets more
    complicated when a GUI thread is also involved, for example when
    retrofitting a GUI interface to an existing terminal-based chess or go
    playing program.
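
A minimal sketch of that combination, assuming a child program that simply prints lines to stdout. The names `_pump` and `run_and_collect` are invented for this example, and the `while` loop is only a stand-in for a real event loop. A background process has its output redirected to a pipe, a thread reads the pipe, and the lines are handed over through a queue.

```python
import queue
import subprocess
import sys
import threading


def _pump(stream, out_queue):
    """Reader thread: forward each line of the child's output to a queue."""
    for line in stream:
        out_queue.put(line.rstrip("\n"))
    out_queue.put(None)  # sentinel: the child closed its stdout


def run_and_collect(cmd):
    """Run cmd in a background process and gather its output lines via a thread."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.DEVNULL,
        stdout=subprocess.PIPE,
        text=True,
    )
    lines_q = queue.Queue()
    reader = threading.Thread(target=_pump, args=(proc.stdout, lines_q), daemon=True)
    reader.start()

    lines = []
    while True:  # stand-in for the event loop that consumes messages
        line = lines_q.get()
        if line is None:
            break
        lines.append(line)

    reader.join()
    proc.stdout.close()
    proc.wait()
    return lines
```

For instance, `run_and_collect([sys.executable, "-c", "print('hello')"])` runs a short Python child and returns the lines it printed. With a GUI involved, the consumer would typically poll the queue from the GUI's own loop rather than block on it.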

    P.
     
    pataphor, Jun 5, 2008
    #5
  6. sturlamolden

    sturlamolden Guest

    On Jun 5, 11:02 am, pataphor wrote:

    > This is probably not very central to the main intention of your post,
    > but I see a terminology problem coming up here. It is possible for
    > python objects to share a reference to some other object. This has
    > nothing to do with threads or processes, although it can be used as a
    > *mechanism* for threads and processes to share data.


    It is complicated in the case of processes, because the object must be
    kept in shared memory. The complicating factor is that the base
    address of the memory mapping is not guaranteed to be the same in the
    virtual address space of different processes.
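
One common way around the base-address problem is to store offsets relative to the start of the mapping instead of absolute pointers. The sketch below is a single-process illustration under that assumption, using an anonymous `mmap` as a stand-in for a real shared segment. The node layout and the names `NODE`, `write_list` and `read_list` are invented for the example.

```python
import mmap
import struct

# Each node is (value, offset of next node). Offset 0 means "end of list".
NODE = struct.Struct("<iI")


def write_list(buf, values):
    """Store values as a linked list inside buf; return the offset of the head node."""
    values = list(values)
    offset = NODE.size  # leave offset 0 unused so it can act as the null link
    head = offset if values else 0
    for i, value in enumerate(values):
        next_offset = offset + NODE.size if i + 1 < len(values) else 0
        NODE.pack_into(buf, offset, value, next_offset)
        offset += NODE.size
    return head


def read_list(buf, head):
    """Follow the offset links from head and return the stored values."""
    out = []
    offset = head
    while offset:
        value, offset = NODE.unpack_from(buf, offset)
        out.append(value)
    return out


buf = mmap.mmap(-1, 4096)  # anonymous mapping standing in for shared memory
head = write_list(buf, [10, 20, 30])
```

Since every link is relative to the start of the buffer, `read_list(buf, head)` would give the same result wherever the mapping happens to be placed in a process's address space. Ordinary Python objects hold absolute pointers, which is why they cannot simply be dropped into such a segment.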
     
    sturlamolden, Jun 5, 2008
    #6
  7. John Nagle

    John Nagle Guest

    sturlamolden wrote:
    > On Jun 5, 11:02 am, pataphor wrote:
    >
    >> This is probably not very central to the main intention of your post,
    >> but I see a terminology problem coming up here. It is possible for
    >> python objects to share a reference to some other object. This has
    >> nothing to do with threads or processes, although it can be used as a
    >> *mechanism* for threads and processes to share data.

    >
    > It is complicated in the case of processes, because the object must be
    > kept in shared memory. The complicating factor is that the base
    > address of the memory mapping is not guaranteed to be the same in the
    > virtual address space of different processes.


    Introducing shared memory in Python would be a terrible idea,
    for many reasons, including the need for interprocess garbage
    collection and locking. Don't go there. Use message passing instead.

    John Nagle
     
    John Nagle, Jun 8, 2008
    #7