Re: Best processor (i386) for Python performance?

Discussion in 'Python' started by Brett C., Aug 26, 2004.

  1. Brett C.

    Brett C. Guest

    In terms of multithreading, an I/O intensive app that is threaded can
    make use of dual procs. Otherwise threaded apps can't for technical
    reasons (GIL and such but don't need to get into those details).
    Brett C., Aug 26, 2004
    #1
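A minimal sketch of the I/O case mentioned above, assuming `time.sleep` as a stand-in for a blocking I/O call (it releases the GIL while waiting, as blocking reads and writes do). The names `fake_io`, `run_sequential` and `run_threaded` are made up for illustration. Note that this only shows the waits overlapping, which happens even on a single processor; it does not by itself measure what a second CPU adds.

```python
import threading
import time


def fake_io(seconds):
    """Stand-in for a blocking I/O call; time.sleep releases the GIL while waiting."""
    time.sleep(seconds)


def run_sequential(n, seconds):
    """Perform n fake I/O waits one after another; return elapsed wall time."""
    start = time.perf_counter()
    for _ in range(n):
        fake_io(seconds)
    return time.perf_counter() - start


def run_threaded(n, seconds):
    """Perform n fake I/O waits in n threads; return elapsed wall time."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(seconds,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return time.perf_counter() - start


if __name__ == "__main__":
    print("sequential:", run_sequential(4, 0.1))
    print("threaded:  ", run_threaded(4, 0.1))
```

The threaded run should take roughly one wait rather than four, because the threads are blocked at the same time instead of holding the GIL.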

  2. On 2004-08-26, Brett C. <> wrote:

    > In terms of multithreading, an I/O intensive app that is
    > threaded can make use of dual procs. Otherwise threaded apps
    > can't for technical reasons (GIL and such but don't need to
    > get into those details).


    That's rather disappointing. If I write a multi-threaded app
    in C it can utilize multiple processors, but the same app in
    Python can't?

    Not that I _have_ any SMP machines...

    --
    Grant Edwards   grante at visi.com   Yow! I Know A Joke!!
    Grant Edwards, Aug 26, 2004
    #2

  3. Dave Brueck

    Dave Brueck Guest

    Grant Edwards wrote:
    > On 2004-08-26, Brett C. <> wrote:
    >
    >
    >>In terms of multithreading, an I/O intensive app that is
    >>threaded can make use of dual procs. Otherwise threaded apps
    >>can't for technical reasons (GIL and such but don't need to
    >>get into those details).

    >
    >
    > That's rather disappointing. If I write a multi-threaded app
    > in C it can utilize multiple processors, but the same app in
    > Python can't?


    Depends on what the multithreaded app _does_. If multiple processors are
    present then Python will use them, but how well they get used depends on
    how much and for what reasons the GIL gets released.

    I/O is the most common reason, so adding another processor to an I/O
    bound program can give you a good performance boost (in our lab I've
    seen easily 75% improvement over a single proc box for a program that
    was very I/O bound, but I haven't measured it to see if it's closer to
    75% or to 100% improvement).

    Another easy boost comes when your app already calls out to a
    GIL-releasing C function for CPU-intensive work; adding a CPU can then
    give similar speed boosts. We have only one such case, and although
    there was a noticeable speedup in dual vs single processors, I never
    attempted to quantify it. And the normal restrictions on parallel
    computing apply - if whatever you're doing can't be done in parallel
    anyway, then adding a CPU isn't helpful. :)

    FWIW I haven't noticed a case where adding a CPU improved performance by
    *less* than ~25%, probably because the GIL gets released here and there
    for various operations anyway, and having an existing multithreaded app
    where multiple threads are CPU bound is somewhat uncommon.

    But then again very few of the projects I work on end up having CPU as
    the most scarce resource so the machines that do have multiple CPUs are
    that way because they are running oodles of other processes as well.

    -Dave
    Dave Brueck, Aug 26, 2004
    #3
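A rough sketch of the "GIL-releasing C function" case, assuming CPython, where the `hashlib` documentation states that the GIL is released while hashing data larger than 2047 bytes. The helper names `digest` and `digest_all` are hypothetical, and hashing merely stands in for whatever CPU-intensive extension call an application makes. Whether this actually runs faster on a second CPU depends on the machine and has not been measured here.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(block):
    """Hash one block of bytes; for large blocks the C code drops the GIL."""
    h = hashlib.sha256()
    h.update(block)
    return h.hexdigest()


def digest_all(blocks, workers=2):
    """Hash every block using a small thread pool, preserving input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(digest, blocks))


if __name__ == "__main__":
    # Four 1 MiB blocks, each filled with a different byte value.
    blocks = [bytes([i]) * (1 << 20) for i in range(4)]
    for hexdigest in digest_all(blocks):
        print(hexdigest)
```

Because the threads spend most of their time inside the C hashing code without holding the GIL, they are free to run on separate processors if the operating system schedules them that way.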
  4. On 2004-08-26, Dave Brueck <> wrote:

    >>>In terms of multithreading, an I/O intensive app that is
    >>>threaded can make use of dual procs. Otherwise threaded apps
    >>>can't for technical reasons (GIL and such but don't need to get
    >>>into those details).

    >>
    >> That's rather disappointing. If I write a multi-threaded app
    >> in C it can utilize multiple processors, but the same app in
    >> Python can't?

    >
    > Depends on what the multithreaded app _does_. If multiple
    > processors are present then Python will use them, but how well
    > they get used depends on how much and for what reasons the GIL
    > gets released.
    >
    > I/O is the most common reason, so adding another processor to
    > an I/O bound program can give you a good performance boost (in
    > our lab I've seen easily 75% improvement over a single proc
    > box for a program that was very I/O bound, but I haven't
    > measured it to see if it's closer to 75% or to 100%
    > improvement).


    Now that I think about it, in my multi-threaded apps all the
    threads almost always end up blocking on I/O. A couple years
    back I even added a GIL release to some of the termios calls
    so that I could get more parallelism when threads are waiting
    for serial ports to drain or flush.

    --
    Grant Edwards   grante at visi.com   Yow! I wonder if there's anything GOOD on tonight?
    Grant Edwards, Aug 26, 2004
    #4
  5. David Bolen

    David Bolen Guest

    Dave Brueck <> writes:

    > Grant Edwards wrote:

    (...)
    > I/O is the most common reason, so adding another processor to an I/O
    > bound program can give you a good performance boost (in our lab I've
    > seen easily 75% improvement over a single proc box for a program that
    > was very I/O bound, but I haven't measured it to see if it's closer to
    > 75% or to 100% improvement).


    I don't doubt the performance gains, but I'd argue that if you are
    seeing that sort of improvement, then you clearly don't have an I/O
    bound program at all, but a compute bound one. By definition, an I/O
    bound program's performance is gated by the I/O operations, and not
    CPU usage, so adding more CPU shouldn't really change anything. After
    all, if your program is "very I/O bound" it means it is waiting on I/O
    virtually all of the time (and thus not executing any Python code
    using the CPU), so where would adding CPU time gain anything?

    I do think it can be tricky to determine just what case an application
    falls into (and many oscillate between I/O and CPU bound modes), and
    indeed a purely CPU bound Python application (if in Python code and
    not a well-behaving extension module) isn't going to be helped at all.

    But to see benefit from additional CPUs for a Python application, I
    believe you're really looking for a multithreaded application that is
    technically compute bound - certainly on an instant-to-instant basis if
    not overall - but which performs a lot of (or at least regular) I/O
    operations (or as you note, other extension calls which release the
    GIL). The good news is that I believe many applications do fall into
    this category, even if from the outside they might be considered I/O
    bound, if only because it doesn't take too much executing Python code
    to process the I/O responses to create a CPU bottleneck at a given
    instant.

    (...)
    > But then again very few of the projects I work on end up having CPU as
    > the most scarce resource so the machines that do have multiple CPUs
    > are that way because they are running oodles of other processes as
    > well.


    This is an excellent point since even if the only advantage to the
    extra CPUs was to free up more of a single CPU for a Python
    application, you'd still see a net gain for that application when
    running in its real world environment.

    -- David
    David Bolen, Aug 26, 2004
    #5
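One hedged way to get a first impression of whether a piece of work is I/O bound or compute bound is to compare the CPU time it consumes with the wall-clock time it takes. The sketch below assumes the standard `time.process_time` and `time.perf_counter` clocks; `cpu_fraction`, `wait_work` and `compute_work` are illustrative names, and the ratio is only a coarse average that hides the instant-to-instant swings described above.

```python
import time


def cpu_fraction(func, *args):
    """Return CPU time divided by wall time for one call of func.

    Values near 0 suggest the call mostly waited; values near 1 suggest it
    mostly computed. process_time covers all threads of the process.
    """
    wall_start = time.perf_counter()
    cpu_start = time.process_time()
    func(*args)
    cpu = time.process_time() - cpu_start
    wall = time.perf_counter() - wall_start
    return cpu / wall if wall > 0 else 0.0


def wait_work(seconds=0.2):
    """Stand-in for an I/O wait."""
    time.sleep(seconds)


def compute_work(n=2_000_000):
    """Stand-in for pure-Python number crunching."""
    total = 0
    for i in range(n):
        total += i * i
    return total


if __name__ == "__main__":
    print("waiting:  ", cpu_fraction(wait_work))
    print("computing:", cpu_fraction(compute_work))
```

A program whose overall fraction is low can still be compute bound for short stretches, which is exactly when a second CPU might help.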
  6. Dave Brueck

    Dave Brueck Guest

    David Bolen wrote:
    > I don't doubt the performance gains, but I'd argue that if you are
    > seeing that sort of improvement, then you clearly don't have an I/O
    > bound program at all, but a compute bound one.


    Ugh, yes. Thanks for the correction! The application in question was an
    object layer in front of a database - it spent most of its time pickling
    and unpickling objects, so the bulk of the performance gains probably
    came from the database speeding up (it was on the same box).

    >>But then again very few of the projects I work on end up having CPU as
    >>the most scarce resource so the machines that do have multiple CPUs
    >>are that way because they are running oodles of other processes as
    >>well.

    >
    > This is an excellent point since even if the only advantage to the
    > extra CPUs was to free up more of a single CPU for a Python
    > application, you'd still see a net gain for that application when
    > running in its real world environment.


    Good call.

    Thanks,
    Dave
    Dave Brueck, Aug 26, 2004
    #6
  7. Ville Vainio

    Ville Vainio Guest

    Message queues [Re: Best processor (i386) for Python performance?]

    >>>>> "David" == David Bolen <> writes:

    David> I do think it can be tricky to determine just what case an
    David> application falls into (and many oscillate between I/O and
    David> CPU bound modes), and indeed a purely CPU bound Python
    David> application (if in Python code and not a well-behaving
    David> extension module) isn't going to be helped at all.

    The sensible thing to do then is to use multiple processes, not just
    multiple threads. Many Python apps use Queue.Queue anyway, and such an
    approach is often easy to convert over to use processes instead of
    threads.

    In fact, it might be fun to have a trivial message queue
    implementation in the standard library:

    # server code (mq is the imagined module, it does not exist yet)

    from mq import *

    q, results = MQueue(), MQueue()

    # each file holds just a handle, like
    # mq:123.12.12.54:67

    q.publish(open("~/jobs", "w"))
    results.publish(open("~/results", "w"))
    spawn_server_if_needed()

    while 1:
        job = q.get()
        res = my_handle_job(job)
        results.put(res)


    # client code

    ....

    req, results = MQueue(open("~/jobs")), MQueue(open("~/results"))

    req.put(("easyjob", 34, 2.44))
    req.put(("easyjob", 213, 2.44))

    ....


    Obviously these trivial mqueues could still be wrapped with additional
    rendezvous functionality:

    job = mq.Job(("hello",2))

    rv = mq.Rendezvous(q,resultqueue)

    rv.put(job)

    res = job.result() # blocks until result is ready

    Though this might be more in the territory of external
    libs/frameworks... but hey, we've already got xml-rpc and web server
    functionality ;-).

    Inter-language systems should obviously use something like CORBA for this.

    --
    Ville Vainio http://tinyurl.com/2prnb
    Ville Vainio, Aug 26, 2004
    #7
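For comparison, the queue-based worker idea can be sketched with the `multiprocessing` module, which joined the standard library in Python 2.6, after this thread was written. This is not the `mq` API proposed above: `handle_job`, `worker` and `run_jobs` are hypothetical names, and the job handler just multiplies two numbers. The sketch assumes a POSIX system, since it asks for the "fork" start method explicitly; with the default start method the calling code would need to sit under an `if __name__ == "__main__":` guard.

```python
import multiprocessing as mp


def handle_job(job):
    """Hypothetical job handler: multiply the two numeric fields of the job."""
    name, count, factor = job
    return (name, count * factor)


def worker(jobs, results):
    """Pull jobs until a None sentinel arrives, pushing each result back."""
    for job in iter(jobs.get, None):
        results.put(handle_job(job))


def run_jobs(job_list, n_workers=2):
    """Run the jobs in separate processes and return the sorted results."""
    ctx = mp.get_context("fork")  # assumption: POSIX only
    jobs, results = ctx.Queue(), ctx.Queue()
    procs = [ctx.Process(target=worker, args=(jobs, results))
             for _ in range(n_workers)]
    for p in procs:
        p.start()
    for job in job_list:
        jobs.put(job)
    for _ in procs:
        jobs.put(None)  # one sentinel per worker so each one exits
    # Collect results before joining so the workers never block on a full pipe.
    out = [results.get() for _ in job_list]
    for p in procs:
        p.join()
    return sorted(out)


if __name__ == "__main__":
    print(run_jobs([("easyjob", 34, 2.44), ("easyjob", 213, 2.44)]))
```

Each worker is a full Python process with its own interpreter and its own GIL, so CPU-bound job handlers written in pure Python can run on separate processors.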
  8. David Bolen

    David Bolen Guest

    Re: Message queues [Re: Best processor (i386) for Python performance?]

    Ville Vainio <> writes:

    > >>>>> "David" == David Bolen <> writes:

    >
    > David> I do think it can be tricky to determine just what case an
    > David> application falls into (and many oscillate between I/O and
    > David> CPU bound modes), and indeed a purely CPU bound Python
    > David> application (if in Python code and not a well-behaving
    > David> extension module) isn't going to be helped at all.
    >
    > The sensible thing to do then is to use multiple processes, not just
    > multiple threads. Many Python apps use Queue.Queue anyway, and such an
    > approach is often easy to convert over to use processes instead of
    > threads.


    Well, "sensible" may depend on your needs and environment. I'm far
    less a fan of multi-process situations under Windows than I am under
    Unix systems, for example. Under Windows, process creation is far less
    efficient, and parent/child relationships don't always work properly
    (particularly when it comes to killing processes off). Threading, on
    the other hand, just plain works extremely well,
    at least on the WinNT/2K/XP variants. That's almost backwards to the
    way I feel about things under Unix, where the various thread
    implementations and support for them on different systems can make
    separate processes more attractive.

    But you're right that multi-process solutions are certainly something
    to keep in the toolbox as available options.

    -- David
    David Bolen, Aug 26, 2004
    #8
