Multicore-programming?

Discussion in 'Python' started by cnb, Sep 6, 2008.

  1. cnb

    cnb Guest

    If I buy a multicore computer and I have a really intensive program,
    how would that be distributed across the cores?

    Will algorithms always have to be programmed and told specifically
    to run on several cores, so that if not told, a program will only
    utilize one core?

    So is the free lunch really over, or is this just an overhyped
    phenomenon?

    Is threading with Python hard? Can you start several processes with
    Python or just threads?
    cnb, Sep 6, 2008
    #1

  2. Terry Reedy

    Terry Reedy Guest

    cnb wrote:
    > If I buy a multicore computer and I have a really intensive
    > program, how would that be distributed across the cores?
    >
    > Will algorithms always have to be programmed and told specifically
    > to run on several cores, so that if not told, a program will only
    > utilize one core?


    I believe that has always been true.
    >
    > So is the free lunch really over, or is this just an overhyped
    > phenomenon?
    >
    > Is threading with Python hard?


    Opinions vary, mostly depending on experience. But Python threads do
    not distribute across processors.

    >Can you start several processes with
    > Python or just threads?


    import subprocess
    and read the manual for that module
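
    For illustration, a minimal sketch of what that can look like (the
    worker script name and its argument are made up for this example):

    import subprocess
    import sys

    # start four independent worker processes; the operating system can
    # schedule each one on its own core
    workers = []
    for chunk in range(4):
        cmd = [sys.executable, "worker.py", str(chunk)]
        workers.append(subprocess.Popen(cmd))

    # wait for all of them to finish
    for proc in workers:
        proc.wait()
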
    Terry Reedy, Sep 7, 2008
    #2

  3. Paul Boddie

    Paul Boddie Guest

    On 7 Sep, 00:06, cnb <> wrote:
    > If I buy a multicore computer and I have a really intensive
    > program, how would that be distributed across the cores?


    It typically depends on how the work done by the program is performed.

    > Will algorithms always have to be programmed and told specifically
    > to run on several cores, so that if not told, a program will only
    > utilize one core?


    Some algorithms lend themselves to parallelisation; others do not.
    Sometimes tools and runtimes can help by executing some instructions
    in parallel.

    > So is the free lunch really over, or is this just an overhyped
    > phenomenon?


    The free lunch ended a few minutes ago. ;-)

    > Is threading with Python hard? Can you start several processes with
    > Python or just threads?


    You can start both processes and threads with Python, although the
    effect of starting many threads - the actual concurrency - will depend
    on which implementation of Python you're using and where the bulk of
    the work is performed.

    If you're spending a lot of CPU time in processing data, and if that
    processing is taking place in Python code, then for the most effective
    threading you should consider an implementation like Jython or
    IronPython which supports free-threading. If most of the work happens
    in extension code (where you'd probably have little choice about using
    CPython, anyway), then it might be the case that the extension
    releases the global interpreter lock in CPython and you might then be
    able to benefit from having many threads doing work simultaneously,
    although I imagine that the extension would itself need to be thread-
    safe, too.

    If you're spending a lot of time moving data around, performing
    communication, and so on, then multiple threads may still be
    effective in CPython, since some of them might be blocked in a
    system call reading or writing data, thus freeing the CPU for the
    others. These kinds of situations also lend themselves to other
    approaches such as asynchronous processing of data, however. It
    doesn't sound like this describes your program, though, if by
    "intensive" you mean high levels of CPU activity.
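
    As a rough sketch of that I/O-bound case (the URLs are just
    placeholders), CPython threads can overlap their waiting because the
    interpreter lock is released during blocking network calls:

    import threading
    import urllib2   # urllib.request in Python 3

    urls = ["http://example.com/a",
            "http://example.com/b",
            "http://example.com/c"]

    def fetch(url):
        # the GIL is released while this thread blocks on the network
        data = urllib2.urlopen(url).read()
        print("%s: %d bytes" % (url, len(data)))

    threads = [threading.Thread(target=fetch, args=(url,)) for url in urls]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
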

    As you note, the alternative to threads is processes, and many people
    advocate multi-process, "shared nothing" solutions. Here's a list
    which covers multicore and SMP-related solutions as well as high-end
    clustering solutions:

    http://wiki.python.org/moin/ParallelProcessing

    Although the processing module is part of Python 2.6/3.0 as the
    multiprocessing module, you might want to at least look at the pp,
    pprocess and papyros solutions. My aim with pprocess was to target
    multicore UNIX-like systems with an unintrusive API; pp and papyros,
    on the other hand, seek to cover larger scale systems as well, and I
    think that the way papyros has been done has some merit, mostly
    because if you wanted to combine convenience with distributed
    processing, you'd want to choose distributed object technologies as
    the foundation (CORBA would have been good for this, too, at least for
    multi-language support, but its APIs can also seem quite
    intimidating).
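
    As a rough illustration of that multiprocessing module (Python 2.6
    and later), with a made-up stand-in for the real work:

    from multiprocessing import Pool

    def crunch(n):
        # stand-in for a CPU-heavy task; each call runs in a separate
        # process, so several cores can be used despite the GIL
        return sum(i * i for i in xrange(n))

    if __name__ == "__main__":
        pool = Pool()                 # defaults to one worker per core
        results = pool.map(crunch, [10 ** 6] * 8)
        pool.close()
        pool.join()
        print(results)
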

    Paul
    Paul Boddie, Sep 7, 2008
    #3
  4. John Machin

    John Machin Guest

    On Sep 7, 8:06 am, cnb <> wrote:
    > If I buy a multicore computer and I have a really intensive
    > program, how would that be distributed across the cores?


    AFAIK, a single process wouldn't be distributed automatically.

    > Will algorithms always have to be programmed and told specifically
    > to run on several cores, so that if not told, a program will only
    > utilize one core?


    AFAIK, yes. See (for example) http://www.parallelpython.com/

    > So is the free lunch really over


    There is no such thing as a free lunch. Something which has never
    existed can't be over.

    > or is this just an overhyped
    > phenomenon?


    These days, every IT phenomenon is over-hyped.

    If you have a CPU-intensive Python program, you may want to consider:
    (1) checking that there are not faster/better algorithms for doing
    what you want in Python, either built-in or in a 3rd-party library
    (2) using psyco
    (3) checking your code for sub-optimal constructs (see the sketch below)
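
    As one sketch of (3), a common sub-optimal construct and its faster
    equivalent (the data here is made up):

    parts = [str(i) for i in range(100000)]

    # slower: repeated concatenation can copy the growing string each time
    text = ""
    for p in parts:
        text += p

    # faster: join builds the result in a single pass
    text = "".join(parts)
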

    HTH,
    John
    John Machin, Sep 7, 2008
    #4
  5. Tim Roberts

    Tim Roberts Guest

    cnb <> wrote:
    >
    >So is the free lunch really over, or is this just an overhyped
    >phenomenon?


    Remember that your computer is not running one single program. An idle
    computer on either Windows or Linux typically has dozens of processes
    running. Even if all of those programs are single-threaded, you'll still
    be able to keep all of the cores busy.
    --
    Tim Roberts,
    Providenza & Boekelheide, Inc.
    Tim Roberts, Sep 7, 2008
    #5
  6. sturlamolden

    sturlamolden Guest

    On 7 Sep, 00:06, cnb <> wrote:

    > If I buy a multicore computer and I have a really intensive
    > program, how would that be distributed across the cores?


    Distribution of processes and threads across processors (cores or
    CPUs) is managed by the operating system.


    > Will algorithms always have to be programmed and told specifically
    > to run on several cores, so that if not told, a program will only
    > utilize one core?


    One particular program has to be programmed for concurrency to utilize
    multiple cores. But you typically have more than one program running.


    > So is the free lunch really over, or is this just an overhyped
    > phenomenon?


    Two slow cores are better than one fast core for most purposes. For
    one thing, it saves power. It's good for the batteries and the
    environment alike.


    > Is threading with Python hard?


    It's not harder than with other systems. You just subclass
    threading.Thread, which has almost the same interface as Java threads.
    Threading with Python is perhaps a bit easier than with other common
    platforms, due to the Queue.Queue object and the lack of volatile
    objects.
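
    A minimal sketch of that pattern (the worker count and the "work"
    itself are made up):

    import threading
    import Queue   # named "queue" in Python 3

    tasks = Queue.Queue()
    results = Queue.Queue()

    class Worker(threading.Thread):
        def run(self):
            while True:
                item = tasks.get()
                if item is None:          # sentinel: no more work
                    break
                results.put(item * item)  # stand-in for real work

    workers = [Worker() for _ in range(4)]
    for w in workers:
        w.start()
    for n in range(10):
        tasks.put(n)
    for _ in workers:
        tasks.put(None)
    for w in workers:
        w.join()
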


    > Can you start several processes with Python or just threads?


    You can do both. However, remember that Python threads only do what
    threads were designed to do back in the 1990s: asynchrony for I/O
    and UIs, not concurrency on multiple processors for CPU-bound
    computing. This is due to the "Global Interpreter Lock". The GIL is
    better than fine-grained locks for single-threading and for
    concurrency with multiple processes, but it prevents Python threads
    from being used for concurrency (which is just as well).

    You can do concurrency with Java threads or Win32 threads, but this
    is merely a side-effect. You will often see claims from novice
    programmers that threads are the only route to concurrency on multi-
    core CPUs. Quite apart from the existence of processes, direct use
    of threads from the Java, .NET, POSIX, or Win32 APIs is not even the
    preferred way of programming for concurrency. Tinkering with low-
    level threading APIs for concurrency is error-prone and inefficient.
    You will spend a lot of time cleansing your code of deadlocks, live-
    locks, volatile objects not being declared volatile, and race
    conditions. On top of that, chances are your code will not perform
    or scale very well due to memory contention, cache line misses,
    inefficient use of registers due to volatile objects, etc. The list
    is endless. That is why Java 6 and .NET 3.5 provide other
    abstractions for multi-core concurrency, such as ForkJoin and
    Parallel.For. This is also the rationale for using an OpenMP-enabled
    compiler for C or Fortran, auto-vectorizing C or Fortran compilers,
    and novel languages like Cilk and Erlang.

    Traditionally, concurrency on parallel computers has been achieved
    using tools like BSPlib, MPI, vectorizing Fortran compilers, and
    even "embarrassingly parallel" setups (running multiple instances of
    the same program on different data). OpenMP is a recent addition to
    the concurrency toolset for SMP-type parallel computers (to which
    multi-core x86 processors belong).

    If you really need concurrency with Python, look into MPI (PyMPI,
    PyPAR, mpi4py), Python/BSP, subprocess module, os.fork (excluding
    Windows), pyprocessing package, or Parallel Python. BSP is probably
    the least error-prone paradigm for multi-core concurrency, albeit not
    the most efficient.
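
    For example, a bare-bones sketch of the os.fork route (POSIX only;
    the per-child work is a placeholder, and real code would send
    results back through pipes or files):

    import os

    pids = []
    for chunk in range(4):
        pid = os.fork()
        if pid == 0:
            # child process: do this chunk's share of the work, then exit
            total = sum(i * i for i in xrange(chunk * 250000,
                                              (chunk + 1) * 250000))
            os._exit(0)
        pids.append(pid)

    # parent: wait for all the children to finish
    for pid in pids:
        os.waitpid(pid, 0)
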

    If you decide to move an identified bottleneck from Python to C or
    Fortran, you also have the option of using OpenMP or cilk to ease the
    work of programming for concurrency. This is my preferred way of
    dealing with bad bottlenecks in numerical computing. Remember that you
    need not learn the overly complex Python C API. Cython, ctypes, f2py,
    or scipy.weave will do just as well. This approach will require you to
    manually release the GIL, which can be done in several ways:

    - In C extensions between Py_BEGIN_ALLOW_THREADS and
    Py_END_ALLOW_THREADS macros.

    - When calling DLL functions using ctypes.cdll or ctypes.windll (not
    ctypes.pydll); see the sketch after this list.

    - In a "with nogil:" block in a Cython/Pyrex extension.

    - With f2py or SWIG, although I have not looked at the details. (I
    don't use them.)
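
    To illustrate the ctypes case, a small sketch (on a UNIX-like
    system, using the standard C math library purely as a stand-in for a
    real compute kernel): functions loaded through ctypes.CDLL release
    the GIL around each call, so several threads can be inside the C
    code at once.

    import ctypes
    import ctypes.util
    import threading

    libm = ctypes.CDLL(ctypes.util.find_library("m"))  # C math library
    libm.cos.restype = ctypes.c_double
    libm.cos.argtypes = [ctypes.c_double]

    def worker(x):
        # the GIL is dropped for the duration of each libm.cos call
        for _ in xrange(1000000):
            libm.cos(x)

    threads = [threading.Thread(target=worker, args=(0.5,))
               for _ in range(4)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
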


    Other things to consider:

    - Programs that run fast enough run fast enough, even if they only
    utilize one core. To quote C.A.R. Hoare and Donald Knuth, "premature
    optimization is the root of all evil in computer programming."

    - Psyco, a Python JIT compiler, will often speed up algorithmic
    code. Using psyco does not require you to change your code. Try it
    and see if your programs run fast enough afterwards. YouTube is
    rumoured to use psyco to speed up their Python backend.

    - Always use NumPy or SciPy if you do numerical work. They make
    numerical code easier to program, and the numerical code also runs a
    lot faster than a pure Python equivalent.

    - Sometimes Python is faster than your hand-written C. This is
    particularly the case for Python code that makes heavy use of built-
    in primitives and objects from the standard library. You will spend
    a lot of time tuning a linked list or dynamic array to match the
    performance of a Python list. Chances are you'll never come up with
    a sort as fast as Python's timsort. You'll probably never make your
    own hash table that can compete with Python's dictionaries and sets,
    etc. Even if you can, the benefit will be minute and certainly not
    worth the effort.

    - You will get tremendous speedups (often 200x over pure Python) if
    you can move a computational bottleneck to C, C++, Fortran, Cython,
    or a third-party library (FFTW, LAPACK, Intel MKL, etc.).

    - Portions of your Python code that do not constitute important
    bottlenecks can just be left in Python. You will not gain anything
    substantial from migrating these parts to C, as other parts of your
    code dominate. Use a profiler to identify computational bottlenecks;
    it will save you a lot of grief fiddling with premature
    optimizations.
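
    A minimal sketch of doing that with the standard library profiler
    (main() here is just a stand-in for your own entry point):

    import cProfile
    import pstats

    def main():
        # placeholder for the program being profiled
        return sum(i * i for i in xrange(100000))

    cProfile.run("main()", "profile.out")
    stats = pstats.Stats("profile.out")
    stats.sort_stats("cumulative")
    stats.print_stats(10)
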

    That's my fifty cents on Python coding for speed.
    sturlamolden, Sep 7, 2008
    #6
  7. sturlamolden

    sturlamolden Guest

    On 7 Sep, 06:24, sturlamolden <> wrote:

    > - Psyco, a Python JIT compiler, will often speed up algorithmic code.
    > Using psyco require to change to your code.


    Typo. It should say "Using psyco does not require you to change your
    code."
    sturlamolden, Sep 7, 2008
    #7
