Solving the problem of mutual recursion



Peter Brooks

I'm not sure if this'll interest anybody, but I expect that I'm going
to get some mutual recursion in my simulation, so I needed to see how
Python handles it. Unfortunately, it falls over once it detects a
certain depth of recursion. This is reasonable as, otherwise, the
stack would eventually overflow.
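For the curious, here's a minimal sketch of that failure (Python 3; the function names and the lowered limit are just for illustration):

```python
import sys

def jim(n):
    return fred(n + 1)   # jim and fred call each other, with no base case

def fred(n):
    return jim(n + 1)

sys.setrecursionlimit(100)   # lower the limit so the demo fails quickly
try:
    jim(0)
except RecursionError:       # Python 2 raises RuntimeError instead
    print('maximum recursion depth exceeded')
```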

The usual recommendation, when this happens, is to rewrite the code
as loops. Of course, that's not as satisfying as using recursion.

The problem really is that the methods or functions called recursively
never exit, because they're waiting for the return of the function
they've called. Now, if your recursion relies upon knowing what's
returned, then this solution won't help you. Often, though, you're not
interested in what's returned and would just like the routine to exit
once it's called itself, or another routine, recursively.

If that's the case, this solution, using threads, allows you to have
functions call themselves, or each other, indefinitely. It works OK on
a MacBook Pro and the thread count varies between 2 and 3, so it
handles the thread queuing well. I'm not sure how well this will work
on other architectures - it'd be interesting to know if anybody tries it.

#!/usr/bin/python
# Two functions, jim and fred, call each other recursively and
# indefinitely, while the main program continues to execute as well.
import threading

def jim(arg, count):
    print "Running jim:", arg, count, " Thread count", threading.active_count()
    thread = threading.Thread(target=fred, args=(" From Jim ", count + 1))
    thread.start()
    print count, " Jim complete. Thread count", threading.active_count()

def fred(arg, counter):
    print "Running fred:", arg, counter
    thread = threading.Thread(target=jim, args=(" From Fred ", counter + 1))
    thread.start()
    print counter, "Fred complete", threading.currentThread()

thread = threading.Thread(target=jim, args=(" From Main", 0))
thread.start()
print "Jim run from main"
count = 0
while True:
    count += 1
    print 'In main program', count
 

Jussi Piitulainen

Peter said:
I'm not sure if this'll interest anybody, but I expect that I'm
going to get some mutual recursion in my simulation, so I needed to ....
returned, then this solution won't help you. Often, though, you're
not interested in what's returned and would just like the routine to
exit once it's called itself, or another process, recursively.

If that's the case, this solution, using threads, allows you to have
functions call themselves, or each other, indefinitely. It works OK
on a macbook pro and the thread count varies from 2-3, so it handles

.... hard to resist ... but somehow I manage ... whew ...

A lighter-weight way is to have each task end by assigning the next
task and returning, instead of calling the next task directly. When a
task returns, a driver loop will call the assigned task, which again
does a bounded amount of work, assigns the next task, and returns.
Tasks can even pass parameters in the same way.

Like so, Dr. Fred keeps adding to a pile as long as there is a pile,
and Mr. Jim keeps taking from it as long as it's there to take from:

from random import choice

def fred():
    global fun, arg
    print('Fred adds 1')
    fun, arg = choice((fred, jim)), arg + 1

def jim():
    global fun, arg
    print('Jim takes 1')
    fun, arg = choice((fred, jim)), arg - 1

if __name__ == '__main__':
    fun, arg = choice((fred, jim)), 3
    while arg:
        print('Pile is', arg, end='\t')
        fun()
    else:
        print('Pile is gone')
 

Roy Smith

Jussi Piitulainen said:
A lighter-weight way is to have each task end by assigning the next
task and returning, instead of calling the next task directly. When a
task returns, a driver loop will call the assigned task, which again
does a bounded amount of work, assigns the next task, and returns.
Tasks can even pass parameters in the same way.

Yup. I've used this pattern for building state machines. Each state is
a function which returns the next state (or, sometimes, a (next_state,
output) tuple). The top level loop ends up looking very much like yours:

state = start
while state != end:
    state, output = state(get_next_input())
    print output
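A self-contained sketch of this pattern (Python 3 here; the states, the input stream, and the None end-sentinel are all invented for illustration):

```python
def start(ch):
    # Stay in 'start', uppercasing letters, until a space arrives.
    if ch == ' ':
        return finish, '<space>'
    return start, ch.upper()

def finish(ch):
    # Terminal state: returning None as the next state stops the loop.
    return None, 'done'

def run(inputs):
    # The driver loop: call the current state, collect its output,
    # and move to whatever state it handed back.
    outputs = []
    state = start
    stream = iter(inputs)
    while state is not None:
        state, output = state(next(stream))
        outputs.append(output)
    return outputs

print(run('ab cd'))   # ['A', 'B', '<space>', 'done']
```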
 

Peter Brooks

A lighter-weight way is to have each task end by assigning the next
task and returning, instead of calling the next task directly. When a
task returns, a driver loop will call the assigned task, which again
does a bounded amount of work, assigns the next task, and returns.
Tasks can even pass parameters in the same way.
Yes, that's true - there are a number of ways of making it linear.

What I'm particularly pleased about with my method is the parallelism
that it achieves - with so little effort! The simulation is going to
be computationally intense and this is going to make sure that the
CPUs are all giving it their best shot. When I run this on my MacBook,
the Python interpreter takes over 140% of CPU - with a bit of
fine-tuning, it should be possible to have more concurrent threads and
to use the four cores optimally.

Naturally I'll need to be careful with the concurrency, but this is so
simple and clean that it should be easy to avoid the main problems
with accessing the same variables.
 

Carlos Nepomuceno

----------------------------------------
Date: Sun, 26 May 2013 10:21:05 -0700
Subject: Re: Solving the problem of mutual recursion
From: (e-mail address removed)
To: (e-mail address removed)


Yes, that's true - there are a number of ways of making it linear.

What I'm particularly pleased about with my method is the parallelism
that it achieves - with so little effort! The simulation is going to
be computationally intense and this is going to make sure that the
CPUs are all giving it their best shot. When I run this on my MacBook,
the Python interpreter takes over 140% of CPU - with a bit of
fine-tuning, it should be possible to have more concurrent threads and
to use the four cores optimally.

Naturally I'll need to be careful with the concurrency, but this is so
simple and clean that it should be easy to avoid the main problems
with accessing the same variables.

Python threads run in the same process and won't run concurrently:

"CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously."[1]

How can you get 140% of CPU? Is that a typo??

[1] http://docs.python.org/2/library/threading.html
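For what it's worth, the multiprocessing route the docs recommend looks roughly like this; a minimal sketch, assuming a CPU-bound worker function (invented here) that's safe to pickle (Python 3):

```python
from multiprocessing import Pool

def burn(n):
    # A CPU-bound task in pure Python; under threads this would be
    # serialised by the GIL, but each process has its own GIL.
    return sum(i * i for i in range(n))

if __name__ == '__main__':
    with Pool(processes=4) as pool:              # one worker per core
        results = pool.map(burn, [10, 100, 1000, 10000])
    print(results[0])   # 285
```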
 

Peter Brooks


Python threads run in the same process and won't run concurrently:

"CPython implementation detail: In CPython, due to the Global Interpreter Lock, only one thread can execute Python code at once (even though certain performance-oriented libraries might overcome this limitation). If you want your application to make better use of the computational resources of multi-core machines, you are advised to use multiprocessing. However, threading is still an appropriate model if you want to run multiple I/O-bound tasks simultaneously."[1]

How can you get 140% of CPU? Is that a typo??
No, on a multi-core machine it's normal. The machine shows Python
running multiple threads - and the number of threads changes as the
program runs. Perhaps the OS X implementation of Python does allow
concurrency when others don't. It certainly looks like it!
 

Carlos Nepomuceno

----------------------------------------
Date: Sun, 26 May 2013 11:13:12 -0700
Subject: Re: Solving the problem of mutual recursion
From: (e-mail address removed)
To: (e-mail address removed) [...]
How can you get 140% of CPU? Is that a typo??
No, on a multi-core machine it's normal. The machine shows Python
running multiple threads - and the number of threads changes as the
program runs. Perhaps the OS X implementation of Python does allow
concurrency when others don't. It certainly looks like it!


I'm pretty sure it doesn't run on multiple cores on Linux and Windows.

I've tested it and have been trying to find a better way to achieve concurrency in Python. One of the ways is the multiprocessing module[1].

Does Mac OS show "140%" CPU load when more than a single core is being used? lol Apple sucks!!! lol


[1] http://docs.python.org/2/library/multiprocessing.html
 

Peter Brooks


I'm pretty sure it doesn't run on multiple cores on Linux and Windows.

I've tested it and have been trying to find a better way to achieve concurrency in Python. One of the ways is the multiprocessing module[1].

Does Mac OS show "140%" CPU load when more than a single core is being used? lol Apple sucks!!! lol
It's not uncommon - HP-UX does exactly the same thing.
 

Ian Kelly

No, on a multi-core machine it's normal. The machine shows Python
running multiple threads - and the number of threads changes as the
program runs. Perhaps the OS X implementation of Python does allow
concurrency when others don't. It certainly looks like it!

I'm pretty sure that CPython uses the GIL regardless of platform. And
yes you can have multiple OS-level threads, but because of the GIL
only one will actually be running at a time. Other possibilities
include:

1) You're using a different implementation of Python that does not
have a GIL, e.g. Jython or IronPython (well, probably not the latter).
I believe PyPy also has a GIL-less version, although I don't think
this is in the current release yet.

2) You're using a fork of CPython that removes the GIL. There are a
number of these, but none to my knowledge that are able to maintain
the performance of CPython for a single thread.

3) You're mistakenly looking at multiple Python processes that are
running simultaneously and adding their usages together.

4) The utility you're using is reporting the process CPU usage incorrectly.

5) Maybe for some reason the Mac OS X build of CPython is using
spin-locks in the GIL. I can't imagine why this would be the case.
 

Chris Angelico

I'm pretty sure that CPython uses the GIL regardless of platform. And
yes you can have multiple OS-level threads, but because of the GIL
only one will actually be running at a time. Other possibilities
include:

6) It's spinning in a function that has released the GIL. Python
threads can certainly execute concurrently; they just can't be
manipulating Python objects. Most of the I/O functions will release
the GIL before doing a potentially-blocking operation, and some
CPU-heavy functions can do the same (I'm given to understand that
numpy does this) - just depends on having a long job that involves no
refcounted objects.

ChrisA
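The I/O-bound case is easy to demonstrate, since time.sleep releases the GIL just like a blocking read would; a small sketch (Python 3):

```python
import threading
import time

def io_task(results, i):
    time.sleep(0.2)    # stands in for a blocking I/O call; releases the GIL
    results[i] = i * i

results = [None] * 4
threads = [threading.Thread(target=io_task, args=(results, i)) for i in range(4)]

start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.time() - start

print(results)   # [0, 1, 4, 9]
# elapsed is around 0.2s: the four sleeps overlapped instead of taking 0.8s
```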
 

Peter Brooks

6) It's spinning in a function that has released the GIL. Python
threads can certainly execute concurrently; they just can't be
manipulating Python objects. Most of the I/O functions will release
the GIL before doing a potentially-blocking operation, and some
CPU-heavy functions can do the same (I'm given to understand that
numpy does this) - just depends on having a long job that involves no
refcounted objects.
This makes complete sense - any atomic action should be atomic, so two
threads can't be doing it at the same time. They can be doing anything
else though.

If two threads create a new object at the same time, for example,
there's potentially the problem that they'll get the same space, so
the object will be owned by both. To prevent this, the new object call
should be run in only one thread.

If you have two objects running their methods on their own local
variables, then there's no potential for conflict, so there's no need
for them to be blocked.

This is an interesting subject. There's nothing wrong with the tool
I'm using to report threads - 'Activity Monitor' is the standard
process monitor. It counts cores as 'CPUs', which seems perfectly
reasonable to me. As I said, other Unixes, such as HP-UX, do the same
thing.

I'm not quite sure of the best way to examine the exact behaviour.
Since the blocking works at the level of atomic operations, it's
difficult to detect: a call to an atomic operation may block all
other threads, but a snapshot of the number of active threads can't
be taken at that moment, so it will only ever report times when
multiple threads are active.
 

Ian Kelly

This makes complete sense - any atomic action should be atomic, so two
threads can't be doing it at the same time. They can be doing anything
else though.

If two threads create a new object at the same time, for example,
there's potentially the problem that they'll get the same space, so
the object will be owned by both. To prevent this, the new object call
should be run in only one thread.

If you have two objects running their methods on their own local
variables, then there's no potential for conflict, so there's no need
for them to be blocked.

That's not the way it works. The CPython interpreter always runs with
the GIL held; the alternative would be to have individual mutex locks
on every Python object, which is expensive for performance due to the
reference counting mechanism. Python functions can't release the GIL.
C functions that are called from the interpreter *can* release the
GIL to allow concurrency, but are only permitted to do so as long as
they're not working with Python objects, e.g. waiting on I/O or
performing a long calculation on C data.

There are some more detailed slides on how the GIL works at:

http://www.dabeaz.com/python/UnderstandingGIL.pdf

Note that the description in Part 1 describes how the GIL worked prior
to Python 3.2. The new GIL is described in Part 4, but the basic
underlying concept is the same.
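The effect described in those slides is easy to reproduce; a quick sketch (Python 3; the loop size is arbitrary) showing that two CPU-bound threads take about as long as doing the same work sequentially:

```python
import threading
import time

def count(n):
    # Pure-Python busy work: holds the GIL almost the whole time.
    while n:
        n -= 1

N = 2000000

t0 = time.time()
count(N)
count(N)
sequential = time.time() - t0

t0 = time.time()
a = threading.Thread(target=count, args=(N,))
b = threading.Thread(target=count, args=(N,))
a.start(); b.start()
a.join(); b.join()
threaded = time.time() - t0

# Under the GIL the threaded run is no faster, and often a bit slower.
print('sequential %.2fs, threaded %.2fs' % (sequential, threaded))
```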
This is an interesting subject. There's nothing wrong with the tool
I'm using to report threads - 'Activity Monitor' is the standard
process monitor. It counts cores as 'CPUs', which seems perfectly
reasonable to me. As I said, other Unixes, such as HP-UX, do the same
thing.

I have no problem with that. It's also the default in Linux, where I
believe it is called "IRIX mode" (as opposed to "Solaris mode", where
the difference is just a factor equal to the number of cores). What I
was questioning was whether the actual number being reported was
correct. If it's the standard tool for the OS, then it probably is.
 

Ian Kelly

6) It's spinning in a function that has released the GIL. Python
threads can certainly execute concurrently; they just can't be
manipulating Python objects. Most of the I/O functions will release
the GIL before doing a potentially-blocking operation, and some
CPU-heavy functions can do the same (I'm given to understand that
numpy does this) - just depends on having a long job that involves no
refcounted objects.

7) Since the program being tested does basically nothing except start
and exit threads, the extra 40% probably represents the overhead of
all that starting and stopping, which would be done outside the GIL.
 

Ian Kelly

7) Since the program being tested does basically nothing except start
and exit threads, the extra 40% probably represents the overhead of
all that starting and stopping, which would be done outside the GIL.

To test this, I tried running the script in Python 2.7 in Linux with
the print statements removed and verified that it was using about 135%
of the CPU. However, top also told me that only about 95% of that was
user processes; the other 40% was kernel usage. The 40% doesn't seem
to be threading overhead though, because I tried adding large xrange
loops to slow down the thread creation process and it had no effect on
the stats.

Then I tried running the same program in Python 3.2, and I got the
more expected 100% CPU usage with minimal kernel time. So I'm
thinking now that the extra 40% may actually be overhead induced by
the GIL. If that's the case then wow, the old GIL really did suck.
 

Chris Angelico

This makes complete sense - any atomic action should be atomic, so two
threads can't be doing it at the same time. They can be doing anything
else though.

If two threads create a new object at the same time, for example,
there's potentially the problem that they'll get the same space, so
the object will be owned by both. To prevent this, the new object call
should be run in only one thread.

If you have two objects running their methods on their own local
variables, then there's no potential for conflict, so there's no need
for them to be blocked.

That's not the way it works. [snip details]

You're actually both saying the same thing, except that Peter went for
finer granularity than Ian and CPython did. CPython figures that, with
a refcounted heap, there's not a lot of point having separate mutex
locks for different operations. Other language interpreters have made
other choices. But Peter's analysis is still correct; it's just that
guarding it is simplified down to a binary state: either you have the
GIL, or you don't.

ChrisA
 
