Re: basic thread question

Discussion in 'Python' started by Jan Kaliszewski, Aug 18, 2009.

  1. 18-08-2009 at 22:10:15 Derek Martin <> wrote:

    > I have some simple threaded code... If I run this
    > with an arg of 1 (start one thread), it pegs one cpu, as I would
    > expect. If I run it with an arg of 2 (start 2 threads), it uses both
    > CPUs, but utilization of both is less than 50%. Can anyone explain
    > why?
    >
    > I do not pretend it's impeccable code, and I'm not looking for a
    > critique of the code per se, excepting the case where what I've written
    > is actually *wrong*. I hacked this together in a couple of minutes,
    > with the intent of pegging my CPUs. Performance with two threads is
    > actually *worse* than with one, which is highly unintuitive. I can
    > accomplish my goal very easily with bash, but I still want to
    > understand what's going on here...
    >
    > The OS is Linux 2.6.24, on an Ubuntu base. Here's the code:


    Python threads can't benefit from multiple processors (because of the GIL,
    see: http://docs.python.org/glossary.html#term-global-interpreter-lock).

    The 'multiprocessing' module is what you need:

    http://docs.python.org/library/multiprocessing.html
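
    A minimal sketch of that suggestion, not taken from the original post
    (the poster's code was not preserved). The names burn and
    run_in_processes are hypothetical, and the busy loop is only a stand-in
    for whatever CPU-bound work the threads were doing. Each worker is a
    separate process with its own interpreter and its own GIL, so two
    workers should be able to keep two CPUs busy:

```python
import multiprocessing


def burn(n):
    """CPU-bound busy loop (hypothetical stand-in for the real workload)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def run_in_processes(num_workers, n):
    """Run burn(n) once in each of num_workers separate processes."""
    with multiprocessing.Pool(processes=num_workers) as pool:
        return pool.map(burn, [n] * num_workers)


if __name__ == "__main__":
    # The __main__ guard matters on platforms where worker processes
    # re-import this module instead of being forked from it.
    results = run_in_processes(2, 1_000_000)
    print(results)
```

    The worker count would normally come from the command line, as in the
    original question; it is hard-coded to 2 here to keep the sketch short.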

    Cheers,
    *j

    --
    Jan Kaliszewski (zuo) <>
     
    Jan Kaliszewski, Aug 18, 2009
    #1

  2. John Nagle (Guest)

    Jan Kaliszewski wrote:
    > 18-08-2009 at 22:10:15 Derek Martin <> wrote:
    >
    >> I have some simple threaded code... If I run this
    >> with an arg of 1 (start one thread), it pegs one cpu, as I would
    >> expect. If I run it with an arg of 2 (start 2 threads), it uses both
    >> CPUs, but utilization of both is less than 50%. Can anyone explain
    >> why?
    >>
    >> I do not pretend it's impeccable code, and I'm not looking for a
    >> critique of the code per se, excepting the case where what I've written
    >> is actually *wrong*. I hacked this together in a couple of minutes,
    >> with the intent of pegging my CPUs. Performance with two threads is
    >> actually *worse* than with one, which is highly unintuitive. I can
    >> accomplish my goal very easily with bash, but I still want to
    >> understand what's going on here...
    >>
    >> The OS is Linux 2.6.24, on an Ubuntu base. Here's the code:

    >
    > Python threads can't benefit from multiple processors (because of the GIL,
    > see: http://docs.python.org/glossary.html#term-global-interpreter-lock).


    This is a CPython implementation restriction. It's not inherent in
    the language.

    Multiple threads make overall performance worse because Python's
    approach to thread locking produces a large number of context switches.
    The interpreter unlocks the "Global Interpreter Lock" every N interpreter
    cycles and on any system call that can block, which, if there is a
    thread waiting, causes a context switch.
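
    One way to observe this on your own machine is to time the same
    CPU-bound work run back to back and then in two threads. This is only
    a sketch: spin, time_sequential and time_threaded are hypothetical
    names, and the numbers depend heavily on the interpreter version and
    build, so no expected output is shown. On a CPython with a GIL the
    threaded run should not be meaningfully faster than the sequential one,
    and may be slower.

```python
import threading
import time


def spin(n):
    """Pure-Python busy loop; it never blocks, so it only gives up the GIL
    when the interpreter forces a switch."""
    while n > 0:
        n -= 1


def time_sequential(num_tasks, n):
    """Run spin(n) num_tasks times in the calling thread; return seconds."""
    start = time.perf_counter()
    for _ in range(num_tasks):
        spin(n)
    return time.perf_counter() - start


def time_threaded(num_tasks, n):
    """Run spin(n) in num_tasks threads at once; return seconds."""
    threads = [threading.Thread(target=spin, args=(n,)) for _ in range(num_tasks)]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    n = 2_000_000
    print("sequential: %.3f s" % time_sequential(2, n))
    print("threaded:   %.3f s" % time_threaded(2, n))
```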

    Multiple Python processes can run concurrently, but each process
    has a copy of the entire Python system, so the memory and cache footprints are
    far larger than for multiple threads.

    John Nagle
     
    John Nagle, Aug 24, 2009
    #2

  3. On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle <>
    declaimed the following in gmane.comp.python.general:

    > Multiple Python processes can run concurrently, but each process
    > has a copy of the entire Python system, so the memory and cache footprints are
    > far larger than for multiple threads.
    >

    One would think a smart enough OS would be able to share the
    executable (interpreter) code, and only create a new stack/heap
    allocation for data.
    --
    Wulfraed Dennis Lee Bieber KD6MOG
    HTTP://wlfraed.home.netcom.com/
     
    Dennis Lee Bieber, Aug 24, 2009
    #3
  4. >>>>> Dennis Lee Bieber <> (DLB) wrote:

    >DLB> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle <>
    >DLB> declaimed the following in gmane.comp.python.general:


    >>> Multiple Python processes can run concurrently, but each process
    >>> has a copy of the entire Python system, so the memory and cache footprints are
    >>> far larger than for multiple threads.
    >>>

    >DLB> One would think a smart enough OS would be able to share the
    >DLB> executable (interpreter) code, and only create a new stack/heap
    >DLB> allocation for data.


    Of course they do, but a significant portion of a Python system consists
    of imported modules, and these are data as far as the OS is concerned.
    Only the modules written in C, which are loaded as DLLs (shared libs),
    and of course the interpreter executable will be shared.
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
     
    Piet van Oostrum, Aug 24, 2009
    #4
  5. Dave Angel (Guest)

    Dennis Lee Bieber wrote:
    > On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle <>
    > declaimed the following in gmane.comp.python.general:
    >
    >
    >> Multiple Python processes can run concurrently, but each process
    >> has a copy of the entire Python system, so the memory and cache footprints are
    >> far larger than for multiple threads.
    >>
    >>

    > One would think a smart enough OS would be able to share the
    > executable (interpreter) code, and only create a new stack/heap
    > allocation for data.
    >

    That's what fork is all about. (See os.fork(), available on most
    Unix/Linux systems.) The two processes start out sharing their state, and
    only the pages subsequently written need separate swap space.
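
    A minimal, Unix-only sketch of that idea (fork_and_report is a
    hypothetical name, not something from this thread). The child sees the
    parent's data without any explicit copying or pickling. Whether the
    underlying pages stay shared is up to the OS, and in CPython even
    reading an object can update its reference count, which writes to
    memory:

```python
import os


def fork_and_report(data):
    """Fork once; the child inspects data inherited from the parent.

    Returns the child's exit status as seen by the parent: 0 if the child
    found data non-empty, 1 otherwise. Requires os.fork (Unix-like systems).
    """
    pid = os.fork()
    if pid == 0:
        # Child: 'data' is already here, inherited from the parent's memory.
        ok = len(data) > 0
        # Leave immediately, skipping the parent's cleanup handlers.
        os._exit(0 if ok else 1)
    # Parent: wait for the child and decode its exit status.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status)


if __name__ == "__main__":
    big_list = list(range(100000))
    print("child exit status:", fork_and_report(big_list))
```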

    In Windows (and probably Unix/Linux), the swap space taken by the
    executable and DLLs (shared libraries) is minimal. Each DLL may have a
    "preferred location", and if that part of the address space is available,
    it takes no swap space at all, except for static variables, which are
    usually allocated together. I don't know whether the standard build of
    CPython (python.exe and the .pyd libraries) uses such a linker option,
    but I'd bet it does. It also speeds up startup.

    On my system, a minimal Python program uses about 50k of swap space. But
    I'm sure that goes way up with lots of imports.


    DaveA
     
    Dave Angel, Aug 24, 2009
    #5
  6. >>>>> Dave Angel <> (DA) wrote:

    >DA> Dennis Lee Bieber wrote:
    >>> On Sun, 23 Aug 2009 22:14:17 -0700, John Nagle <>
    >>> declaimed the following in gmane.comp.python.general:
    >>>
    >>>
    >>>> Multiple Python processes can run concurrently, but each process
    >>>> has a copy of the entire Python system, so the memory and cache footprints are
    >>>> far larger than for multiple threads.
    >>>>
    >>>>
    >>> One would think a smart enough OS would be able to share the
    >>> executable (interpreter) code, and only create a new stack/heap
    >>> allocation for data.
    >>>

    >DA> That's what fork is all about. (See os.fork(), available on most
    >DA> Unix/Linux) The two processes start out sharing their state, and only the
    >DA> things subsequently written need separate swap space.


    But os.fork() is not available on Windows. And I guess refcounts et al.
    will soon destroy the sharing.
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
     
    Piet van Oostrum, Aug 24, 2009
    #6
  7. sturlamolden (Guest)

    On 24 Aug, 13:21, Piet van Oostrum <> wrote:

    > But os.fork() is not available on Windows. And I guess refcounts et al.
    > will soon destroy the sharing.


    Well, there is os.fork in Cygwin and SUA (SUA is the Unix subsystem in
    Windows Vista Professional). Cygwin's fork is a bit sluggish.

    Multiprocessing works on Windows and Linux alike.

    Apart from that, how are you going to use threads? The GIL will not be
    a problem if it can be released. Mostly, the GIL is a hypothetical
    problem. It is only a problem for compute-bound code written in pure
    Python. But very few use Python for that. However, if you do and can
    afford the 200x speed penalty from using Python (instead of C, C++,
    Fortran, Cython), you can just as well accept that only one CPU is
    used.
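
    To illustrate the "if it can be released" case, here is a small sketch
    (blocking_task and run_threads are hypothetical names). time.sleep
    stands in for a blocking call such as I/O; it releases the GIL while
    waiting, so several threads can wait at the same time and the total
    wall-clock time should be close to one task's duration rather than the
    sum of all of them:

```python
import threading
import time


def blocking_task(seconds):
    """Stand-in for blocking I/O; time.sleep releases the GIL while waiting."""
    time.sleep(seconds)


def run_threads(num_threads, seconds):
    """Start num_threads blocking tasks; return total wall-clock seconds."""
    threads = [
        threading.Thread(target=blocking_task, args=(seconds,))
        for _ in range(num_threads)
    ]
    start = time.perf_counter()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    elapsed = run_threads(4, 0.2)
    print("4 threads x 0.2 s of blocking took %.2f s" % elapsed)
```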


    Sturla Molden
     
    sturlamolden, Aug 24, 2009
    #7
  8. >>>>> sturlamolden <> (s) wrote:

    >s> On 24 Aug, 13:21, Piet van Oostrum <> wrote:
    >>> But os.fork() is not available on Windows. And I guess refcounts et al.
    >>> will soon destroy the sharing.


    >s> Well, there is os.fork in Cygwin and SUA (SUA is the Unix subsystem in
    >s> Windows Vista Professional). Cygwin's fork is a bit sluggish.


    That's because it doesn't use copy-on-write. Thereby losing most of its
    advantages. I don't know SUA, but I have vaguely heard about it.
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
     
    Piet van Oostrum, Aug 25, 2009
    #8
  9. sturlamolden (Guest)

    On 25 Aug, 01:26, Piet van Oostrum <> wrote:

    > That's because it doesn't use copy-on-write. Thereby losing most of its
    > advantages. I don't know SUA, but I have vaguely heard about it.


    SUA is a version of UNIX hidden inside Windows Vista and Windows 7
    (except in Home and Home Premium), but very few seem to know of it.
    SUA (Subsystem for Unix-based Applications) was formerly known as
    Interix, which is a certified version of UNIX based on OpenBSD. If you
    go to http://www.interopsystems.com (a website run by Interop Systems
    Inc., a company owned by Microsoft), you will find a lot of common
    Unix tools prebuilt for SUA, including Python 2.6.2.

    The NT kernel supports copy-on-write fork with a special system call
    (ZwCreateProcess in ntdll.dll), which is what SUA's implementation of
    fork() uses.
     
    sturlamolden, Aug 25, 2009
    #9
  10. >>>>> sturlamolden <> (s) wrote:

    >s> On 25 Aug, 01:26, Piet van Oostrum <> wrote:
    >>> That's because it doesn't use copy-on-write. Thereby losing most of its
    >>> advantages. I don't know SUA, but I have vaguely heard about it.


    >s> SUA is a version of UNIX hidden inside Windows Vista and Windows 7
    >s> (except in Home and Home Premium), but very few seem to know of it.
    >s> SUA (Subsystem for Unix-based Applications) was formerly known as
    >s> Interix, which is a certified version of UNIX based on OpenBSD. If you
    >s> go to http://www.interopsystems.com (a website run by Interop Systems
    >s> Inc., a company owned by Microsoft), you will find a lot of common
    >s> unix tools prebuilt for SUA, including Python 2.6.2.


    >s> The NT-kernel supports copy-on-write fork with a special system call
    >s> (ZwCreateProcess in ntdll.dll), which is what SUA's implementation of
    >s> fork() uses.


    I have heard about that also, but is there a Python implementation that
    uses this? (Just curious, I am not using Windows.)
    --
    Piet van Oostrum <>
    URL: http://pietvanoostrum.com [PGP 8DAE142BE17999C4]
    Private email:
     
    Piet van Oostrum, Aug 25, 2009
    #10
  11. sturlamolden (Guest)

    On 25 Aug, 13:33, Piet van Oostrum <> wrote:

    > I have heard about that also, but is there a Python implementation that
    > uses this? (Just curious, I am not using Windows.)


    On Windows we have three different versions of Python 2.6:

    * Python 2.6 for Win32/64 (from python.org) does not have os.fork.

    * Python 2.6 for Cygwin has os.fork, but it is non-COW and sluggish.

    * Python 2.6 for SUA has a fast os.fork with copy-on-write.
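
    Code that has to run on all three can check for os.fork at run time
    instead of assuming it. A minimal sketch (fork_available is a
    hypothetical helper name; it says nothing about whether the fork is
    copy-on-write or fast, only whether it exists):

```python
import os


def fork_available():
    """Return True if this interpreter build exposes os.fork."""
    return hasattr(os, "fork")


if __name__ == "__main__":
    if fork_available():
        print("os.fork is available on this build")
    else:
        print("no os.fork here; consider the multiprocessing module instead")
```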

    You get Python 2.6.2 for SUA prebuilt by Microsoft from http://www.interopsystems.com.

    Using Python 2.6 for SUA is not without surprises. For example, the
    process is not executed from the Win32 subsystem, hence the Windows
    API is inaccessible. That means we cannot use the native Windows GUI.
    Instead we must run an X11 server on the Windows subsystem (e.g.
    X-Win32), and use the Xlib that SUA has installed. You can compare SUA to a
    stripped-down Linux distro, on which you have to build and install
    most of the software you want to use. I do not recommend using Python
    for SUA instead of Python for Windows unless you absolutely need a
    fast os.fork or have a program that otherwise requires Unix. But for
    running Unix apps on Windows, SUA is clearly superior to Cygwin.
    Licensing is also better: programs compiled against Cygwin libraries
    are GPL (unless you buy a commercial license). Programs compiled
    against SUA libraries are not.



    Sturla Molden
     
    sturlamolden, Aug 25, 2009
    #11
