using threads with for-loops

Discussion in 'Python' started by Klaus Neuner, Sep 28, 2004.

  1. Klaus Neuner

    Klaus Neuner Guest

    Hello,

    I wrote a program that does essentially the following:

    for rule in rules:
        for line in line_list:
            line = my_apply(rule, line)

    line_list contains the lines of some input text.

    To "apply a rule" always means to

    1. do some regex matches on line
    2. substitute something in line
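
A minimal sketch of what such a program might look like. The rule format (a compiled pattern paired with a replacement string) and the body of `my_apply` are assumptions for illustration; the post does not show them. Assigning to the loop variable `line` does not change `line_list`, so the sketch stores each result back by index.

```python
import re

# Hypothetical rules: (compiled pattern, replacement) pairs.
rules = [
    (re.compile(r"\bcolour\b"), "color"),
    (re.compile(r"\s+"), " "),
]


def my_apply(rule, line):
    """Apply one rule: match with the regex, then substitute."""
    pattern, replacement = rule
    return pattern.sub(replacement, line)


line_list = ["the   colour red", "no  match here"]

for rule in rules:
    # enumerate() supplies the index so the result can be stored in the list
    for i, line in enumerate(line_list):
        line_list[i] = my_apply(rule, line)

print(line_list)
```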


    My question is: Given this "architecture", does it make sense
    to use threads? And if so, how?

    Klaus
    Klaus Neuner, Sep 28, 2004
    #1

  2. Klaus Neuner wrote:
    > Hello,
    >
    > I wrote a program that does essentially the following:
    >
    > for rule in rules:
    >     for line in line_list:
    >         line = my_apply(rule, line)
    >
    > line_list contains the lines of some input text.
    >
    > To "apply a rule" always means to
    >
    > 1. do some regex matches on line
    > 2. substitute something in line
    >
    >
    > My question is: Given this "architecture", does it make sense
    > to use threads? And if so, how?
    >
    > Klaus


    It depends on how many items are in rules and how long my_apply() takes.
    Rembrandt Q Einstein, Sep 28, 2004
    #2

  3. Peter Hansen

    Peter Hansen Guest

    Klaus Neuner wrote:
    > Hello,
    >
    > I wrote a program that does essentially the following:
    >
    > for rule in rules:
    >     for line in line_list:
    >         line = my_apply(rule, line)
    >
    > line_list contains the lines of some input text.
    >
    > To "apply a rule" always means to
    >
    > 1. do some regex matches on line
    > 2. substitute something in line
    >
    > My question is: Given this "architecture", does it make sense
    > to use threads? And if so, how?


    The code is (based on what you give above) "CPU bound",
    which means you will not see any advantage in using
    threads. Threads don't magically make anything go
    faster, and in fact have a certain overhead for the context
    switch, so no, it makes no sense to use threads here.
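
One way to check this claim on a given machine is to time the same work with and without threads. The sketch below is an illustration only: the vowel-masking rule and the `run_sequential` and `run_threaded` helpers are hypothetical, and the timings will vary. On a standard CPython build with the GIL, the threaded run is usually no faster than the sequential one.

```python
import re
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical CPU-bound rule; the pattern is only an illustration.
PATTERN = re.compile(r"[aeiou]")


def my_apply(line):
    return PATTERN.sub("*", line)


def run_sequential(lines):
    return [my_apply(line) for line in lines]


def run_threaded(lines, workers=4):
    # map() preserves input order, so results line up with `lines`
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(my_apply, lines))


lines = ["some input text to rewrite"] * 5000

start = time.perf_counter()
sequential = run_sequential(lines)
t_seq = time.perf_counter() - start

start = time.perf_counter()
threaded = run_threaded(lines)
t_thr = time.perf_counter() - start

assert sequential == threaded  # same output either way
print(f"sequential: {t_seq:.3f}s  threaded: {t_thr:.3f}s")
```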

    -Peter
    Peter Hansen, Sep 28, 2004
    #3
  4. Peter Hansen wrote:

    > Klaus Neuner wrote:
    >> Hello,
    >>
    >> I wrote a program that does essentially the following:
    >>
    >> for rule in rules:
    >>     for line in line_list:
    >>         line = my_apply(rule, line)
    >>
    >> line_list contains the lines of some input text.
    >>
    >> To "apply a rule" always means to
    >>
    >> 1. do some regex matches on line
    >> 2. substitute something in line
    >>
    >> My question is: Given this "architecture", does it make sense
    >> to use threads? And if so, how?

    >
    > The code is (based on what you give above) "CPU bound",
    > which means you will not see any advantage in using
    > threads.


    Have you ever seen machines with more than one CPU? The code above is
    perfectly suited for parallelization.
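
The reason the loop parallelizes well in principle can be sketched as follows, assuming each rule looks at one line at a time (the original post implies this but does not state it). Under that assumption the two loops can be interchanged, so every line becomes an independent unit of work. The rules and helper names here are hypothetical.

```python
import re
from functools import reduce

# Hypothetical rules; order matters because later rules see earlier output.
rules = [
    (re.compile(r"cat"), "dog"),
    (re.compile(r"dog"), "bird"),
]


def my_apply(rule, line):
    pattern, replacement = rule
    return pattern.sub(replacement, line)


def rules_outer(lines):
    """Loop order from the original post: rules outside, lines inside."""
    lines = list(lines)
    for rule in rules:
        lines = [my_apply(rule, line) for line in lines]
    return lines


def apply_all_rules(line):
    """Run every rule, in order, on a single line."""
    return reduce(lambda text, rule: my_apply(rule, text), rules, line)


def lines_outer(lines):
    """Interchanged loops: each line is an independent unit of work."""
    return [apply_all_rules(line) for line in lines]


sample = ["a cat", "a dog", "a fish"]
print(rules_outer(sample))
print(lines_outer(sample))
```

Both calls print the same list, which is what allows the per-line work to be handed to separate workers.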

    Mathias
    Mathias Waack, Sep 28, 2004
    #4
  5. Peter Hansen

    Peter Hansen Guest

    Mathias Waack wrote:
    > Peter Hansen wrote:
    >>Klaus Neuner wrote:
    >>>I wrote a program that does essentially the following:
    >>>
    >>>for rule in rules:
    >>>    for line in line_list:
    >>>        line = my_apply(rule, line)
    >>>
    >>>line_list contains the lines of some input text.
    >>>
    >>>To "apply a rule" always means to
    >>>
    >>> 1. do some regex matches on line
    >>> 2. substitute something in line
    >>>
    >>>My question is: Given this "architecture", does it make sense
    >>>to use threads? And if so, how?

    >>
    >>The code is (based on what you give above) "CPU bound",
    >>which means you will not see any advantage in using
    >>threads.

    >
    > Have you ever seen machines with more than one CPU?


    Why yes, I have! And have *you* seen an implementation
    of Python which will effectively use those multiple CPUs
    in code like that above which runs in a single process?

    And do you think it likely that the OP is dealing with
    a multiple CPU situation, but managed to forget to
    mention it? I didn't think it likely, which is why when
    I considered the multi-CPU situation I discarded the
    idea. Perhaps, however, I was too quick to judge...

    > The code above is perfectly suited for parallelization.


    Yes, it is. Once you take into account implementation
    issues (e.g. the GIL), would you still think so?

    -Peter
    Peter Hansen, Sep 28, 2004
    #5
  6. Peter Hansen wrote:

    > Mathias Waack wrote:
    >> Have you ever seen machines with more than one CPU?

    >
    > Why yes, I have! And have *you* seen an implementation
    > of Python which will effectively use those multiple CPUs
    > in code like that above which runs in a single process?


    Ok, I've lost: I haven't seen such an implementation and don't know much
    about the thread-layer of Python.

    >> The code above is perfectly suited for parallelization.

    >
    > Yes, it is. Once you take into account implementation
    > issues (e.g. the GIL), would you still think so?


    Depends. Which means: don't know. If I would start thinking about
    creating threads to gain a speedup, I would even think about
    switching to another programming language.

    Mathias
    Mathias Waack, Sep 28, 2004
    #6
  7. Peter Hansen

    Peter Hansen Guest

    Mathias Waack wrote:
    > I haven't seen such an implementation and don't know much
    > about the thread-layer of Python.


    Unfortunately, there is something called the Global
    Interpreter Lock (GIL), which means that even though
    native threads are (generally) used for Python threads,
    only one of those threads can be active in the interpreter
    at any time, even if there are multiple CPUs present.
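
For readers on a recent interpreter, a small sketch of how to inspect this from inside Python. It assumes a CPython-style `sys` module; `sys._is_gil_enabled()` only exists on CPython 3.13 and later, so the code looks it up defensively rather than calling it directly.

```python
import sys

# How often (in seconds) the interpreter asks the running thread
# to give other threads a turn.
print("switch interval:", sys.getswitchinterval())

# sys._is_gil_enabled() is only present on CPython 3.13+, so fall back
# to a placeholder string on older interpreters.
check = getattr(sys, "_is_gil_enabled", None)
gil_status = check() if check is not None else "unknown (older interpreter)"
print("GIL enabled:", gil_status)
```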

    > Depends. Which means: don't know. If I would start thinking about
    > creating threads to gain a speedup, I would even think about
    > switching to another programming language.


    I believe some work has been done in this area to make
    Python take advantage of multiple CPU systems, but
    I believe your approach (switch languages) is still one of
    the best options. Another is to arrange your application
    to run as multiple processes, but this isn't quite as
    simple as just using multiple threads.
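
A minimal sketch of the multiple-process option using the standard library's `concurrent.futures`. The rules, the `apply_all_rules` and `transform_parallel` helpers, and the worker and chunk sizes are assumptions for illustration. It also assumes every rule works on one line at a time, so lines can be handed out independently.

```python
import re
from concurrent.futures import ProcessPoolExecutor

# Hypothetical rules, defined at module level so worker processes see them.
RULES = [
    (re.compile(r"\bfoo\b"), "bar"),
    (re.compile(r"\s+"), " "),
]


def apply_all_rules(line):
    """Apply every rule, in order, to one line."""
    for pattern, replacement in RULES:
        line = pattern.sub(replacement, line)
    return line


def transform_parallel(lines, workers=2, chunksize=1000):
    # Each worker is a separate interpreter process with its own GIL.
    # chunksize batches lines to cut inter-process overhead.
    with ProcessPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(apply_all_rules, lines, chunksize=chunksize))


if __name__ == "__main__":
    line_list = ["foo   and  foo", "nothing to do"] * 5
    for line in transform_parallel(line_list)[:2]:
        print(line)
```

Starting processes and passing lines between them has a cost, so this only pays off when the per-line work is substantial; measuring on real input is the only reliable guide.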

    -Peter
    Peter Hansen, Sep 28, 2004
    #7
  8. Peter Hansen wrote:

    > > The code above is perfectly suited for parallelization.

    >
    > Yes, it is. Once you take into account implementation
    > issues (e.g. the GIL), would you still think so?


    Or ... you could cheat and use, e.g., PYRO to disguise a set of *Python
    applications* as threads; the applications could then be distributed
    whichever way you want ;-)
    Frithiof Andreas Jensen, Sep 29, 2004
    #8
  9. Peter Hansen wrote:
    > Mathias Waack wrote:
    >> Depends. Which means: don't know. If I would start thinking about
    >> creating threads to gain a speedup, I would even think about
    >> switching to another programming language.

    >
    > I believe some work has been done in this area to make
    > Python take advantage of multiple CPU systems, but
    > I believe your approach (switch languages) is still one of
    > the best options. Another is to arrange your application
    > to run as multiple processes, but this isn't quite as
    > simple as just using multiple threads.


    The Java people have done a lot to speed up Java threads, without any
    real success (just my opinion): Java programs are just slow. There
    are classes of problems which can be easily solved using Python, and
    there are problems not very well suited to Pythonic solutions.
    That's a fact, and nobody should waste her time forcing Python in
    the wrong direction.
    And I think it's fair to let other languages live. We should be fair
    winners ;)

    Mathias
    Mathias Waack, Sep 29, 2004
    #9
  10. Mathias Waack wrote:
    ...
    > >> Have you ever seen machines with more than one CPU?

    > >
    > > Why yes, I have! And have *you* seen an implementation
    > > of Python which will effectively use those multiple CPUs
    > > in code like that above which runs in a single process?

    >
    > Ok, I've lost: I haven't seen such an implementation and don't know much
    > about the thread-layer of Python.


    There are, as far as I know, three complete implementations of Python
    (plus several add-on bits and pieces and unfinished ones): CPython,
    Jython, and IronPython. CPython uses its own dedicated virtual machine,
    and its threads are subject to a global per-interpreter lock.

    However, in lieu of dedicated virtual machines, Jython relies on the
    JVM, and IronPython relies on Microsoft's CLR, and I believe both of
    those VMs have no global interpreter lock. I have no multi-CPU machine
    at hand that can run Microsoft's CLR, but I do have a Powermac with two
    CPUs, MacOSX 10.3.5, and a JVM (1.4.2 is the latest one, I believe).
    So, if you can suggest a test to show whether Jython there can in fact
    effectively use both CPUs, I'll be glad to run it and let everybody
    know (I'm a bit rusty on recent Java VMs, so I don't know if I need any
    special incantations to tell them to run on many CPUs, or what). I'm
    not sure IronPython runs fully on Mono, and neither am I sure the
    current release of Mono on MacOSX is able to use multiple CPUs for
    threading, but if somebody can find out and suggest a definitive test on
    the matter, again I'll be glad to run it and report to the list.

    Net of such niggling issues, one might say that _most_ (hey, 2 out of 3,
    right?-) current complete implementations of Python can do "free
    threading" with no global per-interpreter lock, and thus in theory
    should be able to use multiple CPUs productively in multiple CPU-bound
    threads of a single process -- assuming, say, Java or C# can do so, I
    see no reason, in principle, why Python shouldn't be able to, when it
    runs on the same underlying VM as Java or C# respectively.


    > >> The code above is perfectly suited for parallelization.

    > >
    > > Yes, it is. Once you take into account implementation
    > > issues (e.g. the GIL), would you still think so?

    >
    > Depends. Which means: don't know. If I would start thinking about
    > creating threads to gain a speedup, I would even think about
    > switching to another programming language.


    ....or another implementation of Python, if you're currently using
    CPython and some limitation in it is a big problem for you...
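
Which implementation a script is running on can be checked from the standard library. A small sketch follows; the messages printed are illustrative, not a statement about any particular release.

```python
import platform

# Returns a name such as "CPython", "Jython", "IronPython" or "PyPy".
implementation = platform.python_implementation()
print("running on:", implementation)

if implementation == "CPython":
    print("threads here are normally subject to the GIL")
else:
    print("threading behaviour depends on the underlying VM")
```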


    Alex
    Alex Martelli, Sep 29, 2004
    #10
  11. Klaus Neuner

    Klaus Neuner Guest

    Thanks to all who participated in this thread.

    Klaus
    Klaus Neuner, Oct 5, 2004
    #11