Proper tail recursion

Christopher T King

Is it feasible, and/or desirable, to have Python optimize tail-recursive
calls, similar to Scheme and gcc -O2? i.e. effectively compile this:

def foo(n):
    return foo(n-1)

into this:

def foo(n):
    goto foo(n-1)

This would naturally only be enabled in optimized bytecode.
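In other words, a rough hand-written sketch of what such an optimization amounts to (the names here are purely illustrative):

# Hand-written equivalent of eliminating the tail call: instead of recursing,
# rebind the argument and jump back to the top of the function.
def countdown(n):
    if n == 0:
        return 0
    return countdown(n - 1)      # tail call: nothing left to do after it returns

def countdown_loop(n):
    while True:                  # plays the role of the "goto" above
        if n == 0:
            return 0
        n = n - 1                # reuse the current frame instead of growing the stack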

On an unrelated, Scheme-ish note, is there any way for a generator to
access itself? i.e. I want to be able to write this:

def gen():
    yield get_current_continuation(), someothergenerator()

Thanks,
Chris
 

Duncan Booth

> Is it feasible, and/or desirable, to have Python optimize tail-recursive
> calls, similar to Scheme and gcc -O2? i.e. effectively compile this:
>
> def foo(n):
>     return foo(n-1)
>
> into this:
>
> def foo(n):
>     goto foo(n-1)
>
> This would naturally only be enabled in optimized bytecode.
>
> On an unrelated, Scheme-ish note, is there any way for a generator to
> access itself? i.e. I want to be able to write this:
>
> def gen():
>     yield get_current_continuation(), someothergenerator()
>
> Thanks,
> Chris

Just because a function looks recursive to you, doesn't mean it actually is
recursive when it gets called.

How do you propose to handle this?

def foo(n):
    return foo(n-1)

def baz(n):
    return 0

bar = foo
foo = baz
bar(3)

One solution might be to provide a keyword which refers to the currently
executing function: this would be useful not only for recursion but also for
accessing attributes on the function. It would also help with methods and
generators, which, as you noted, find it hard to reference themselves.

However, I don't believe there is at present any easy way to get hold of
the current executing function or generator object.
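As a rough sketch of why this is awkward today (assuming CPython, where sys._getframe is available): a frame gives you the running code object, but not the function object that was called, so a rebound name cannot be recovered from it.

import sys

def f():
    frame = sys._getframe()       # CPython-specific: the currently executing frame
    print(frame.f_code.co_name)   # 'f' -- but this is only the code object's name;
                                  # the frame holds no reference to the function
                                  # object itself, so its attributes are unreachable
                                  # once the name 'f' has been rebound.

g = f
f = None
g()                               # still prints 'f', taken from the code object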
 

Thomas Bellman

> Just because a function looks recursive to you, doesn't mean it actually is
> recursive when it gets called.
>
> How do you propose to handle this?
>
> def foo(n):
>     return foo(n-1)
>
> def baz(n):
>     return 0
>
> bar = foo
> foo = baz
> bar(3)

Instead of doing tail recursion elimination, one could implement
general tail call optimization. It shouldn't be too difficult to
make the compiler recognize tail calls, i.e.

return spam(params)

outside any try statement. It could then generate the byte code
TAIL_CALL instead of CALL_FUNCTION. TAIL_CALL would throw away
the current stack frame before calling spam().

This doesn't give you the performance improvement of not making a
function call, but at least it makes a tail recursive function
run in constant space.
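The pattern the compiler would have to spot is easy to see with the dis module (a sketch; opcode names differ between CPython versions -- older releases emit CALL_FUNCTION, newer ones CALL):

import dis

def caller(params):
    return spam(params)    # a tail call: the call's result is returned unchanged

dis.dis(caller)
# The disassembly ends with the call opcode (CALL_FUNCTION on older CPythons,
# CALL on newer ones) immediately followed by RETURN_VALUE -- exactly the pair
# a hypothetical TAIL_CALL opcode would replace.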

Of course, some care needs to be taken to ensure that the current
stack frame isn't thrown away if the called function is a local
function accessing the namespace of the calling function. I.e,
in the case

def parrot(x):
    def spam(n):
        return n + y
    y = x*x
    return spam(5)

the call to 'return spam(5)' must make sure that 'y' is
accessible to spam(). That might need a runtime check in many
cases. (In the common case of there being no local functions at
all in the caller, it can be determined statically, of course.)
Hmm, there is an attribute 'func_closure' on function objects, which at a
quick glance seems to be None if the function doesn't access the
surrounding namespace.
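A quick look at that attribute (a sketch using the Python 2 spelling; Python 3 renames it to __closure__):

def parrot(x):
    def spam(n):
        return n + y        # uses the enclosing 'y', so spam carries a closure
    def beans(n):
        return n + 1        # touches nothing in parrot's namespace
    y = x * x
    return spam, beans

spam, beans = parrot(3)
print(spam.func_closure)    # a 1-tuple holding the cell for 'y'
print(beans.func_closure)   # None -- no free variables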

> However, I don't believe there is at present any easy way to get hold of
> the current executing function or generator object.

Raising and catching an exception, and digging through the
traceback object might give you enough information to do
that. But it's not very pretty, and I'm not sure that it
is portable between C-Python, Jython, PyPy, Psyco, and any
other Python compiler/interpreter.
 

Christopher T King

> Instead of doing tail recursion elimination, one could implement
> general tail call optimization. It shouldn't be too difficult to
> make the compiler recognize tail calls, i.e.
>
> return spam(params)
>
> outside any try statement. It could then generate the byte code
> TAIL_CALL instead of CALL_FUNCTION. TAIL_CALL would throw away
> the current stack frame before calling spam().
>
> This doesn't give you the performance improvement of not making a
> function call, but at least it makes a tail recursive function
> run in constant space.

What he said. :) I don't mean to optimize recursive functions into loops,
but simply to replace normal calls at the end of functions with tail
calls.
> Of course, some care needs to be taken to ensure that the current
> stack frame isn't thrown away if the called function is a local
> function accessing the namespace of the calling function.
> [snip]

I'm just guessing here (my knowledge of the internals isn't too great),
but in the case of closures, isn't it okay to throw away the stack space,
since any variables needed are kept in the function's func_closure
attribute? To modify your example:

def parrot(x):
    def spam(n):
        return n + y
    y = x*x
    return spam

print parrot(3)(5)

This returns 14, as expected, and is (roughly) functionally no different
from a tail-callified version of your parrot() function.
> Raising and catching an exception, and digging through the
> traceback object might give you enough information to do
> that. But it's not very pretty, and I'm not sure that it
> is portable between C-Python, Jython, PyPy, Psyco, and any
> other Python compiler/interpreter.

I don't need it that badly ;) -- it just would have made a neat
generalization in my code, rather than special-casing a yield of 'None' to
refer to the generator that yielded it.
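That special case looks roughly like this on the consuming side (a sketch with made-up names; the actual code isn't shown here):

def drive(genfunc):
    # Treat a yielded None as "the generator that yielded it".
    gen = genfunc()
    for value in gen:
        if value is None:
            value = gen          # substitute the generator itself
        print(value)             # stand-in for whatever the real consumer does

def gen():
    yield None                   # "hand the consumer a reference to me"
    yield 42

drive(gen)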
 

Duncan Booth

> Instead of doing tail recursion elimination, one could implement
> general tail call optimization. It shouldn't be too difficult to
> make the compiler recognize tail calls, i.e.
>
> return spam(params)
>
> outside any try statement. It could then generate the byte code
> TAIL_CALL instead of CALL_FUNCTION. TAIL_CALL would throw away
> the current stack frame before calling spam().

That ought to be doable. The main disadvantages I can see are that stack
backtraces will be incomplete (which could be confusing), and unbounded
recursion won't be caught (which may or may not matter).
> This doesn't give you the performance improvement of not making a
> function call, but at least it makes a tail recursive function
> run in constant space.
>
> Of course, some care needs to be taken to ensure that the current
> stack frame isn't thrown away if the called function is a local
> function accessing the namespace of the calling function. I.e,
> in the case
>
> def parrot(x):
>     def spam(n):
>         return n + y
>     y = x*x
>     return spam(5)
>
> the call to 'return spam(5)' must make sure that 'y' is
> accessible to spam().

No it doesn't. References to 'y' in parrot are handled indirectly by
storing the value in a cell object. The stack frame only holds a reference
to the cell object. So long as the cell still has another reference the
value remains valid.
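The cell is easy to inspect from the outside (a sketch in the Python 2 spelling; Python 3 spells the attribute __closure__, and the value inside a cell is exposed as cell_contents):

def parrot(x):
    def spam(n):
        return n + y             # 'y' is a free variable, reached through a cell
    y = x * x
    return spam                  # parrot's frame dies here, the cell does not

spam = parrot(3)
cell = spam.func_closure[0]      # the cell both scopes shared
print(cell.cell_contents)        # 9 -- still valid although parrot has returned
print(spam(5))                   # 14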
> Raising and catching an exception, and digging through the
> traceback object might give you enough information to do
> that. But it's not very pretty, and I'm not sure that it
> is portable between C-Python, Jython, PyPy, Psyco, and any
> other Python compiler/interpreter.

I'm pretty sure that there is nothing useful in the traceback object. You
can find out which instruction was executing, but this doesn't help in
identifying the function since several functions could share the same code
(an active generator being an obvious example where this happens).
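The shared-code point is easy to demonstrate (a sketch): two generators made from the same generator function are running the very same code object.

def counter():
    i = 0
    while True:
        yield i
        i += 1

a = counter()
b = counter()
print(a.gi_frame.f_code is b.gi_frame.f_code)   # True: one code object...
print(a is b)                                   # False: ...two distinct generators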
 

Christopher T King

Poking around in ceval.c, I discovered the PREDICT macros, and plan to
make a preliminary implementation of tail calls using a similar test (use
the tail-call code for any CALL_FUNCTION followed by a RETURN_VALUE). But I
was just wondering: is there any reason why LOAD_CONST and friends don't
check for a following RETURN_VALUE, to avoid the push/loop/pop cycle and
simply return the value instead?
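The sequence in question, for the record (a sketch; this is the classic bytecode, and some recent CPython releases fold such returns into a single instruction):

import dis

def answer():
    return 42

dis.dis(answer)
# Classic disassembly:
#   LOAD_CONST   1 (42)
#   RETURN_VALUE
# The constant is pushed onto the value stack only to be popped straight back
# off by RETURN_VALUE on the next trip around the eval loop.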
 

Christopher T King

> Poking around in ceval.c, I discovered the PREDICT macros, and plan to
> make a preliminary implementation of tail calls using a similar test (use
> [snip]

I take that back. The intertwinement of C functions and Python calls on
the stack makes this nearly impossible -- there's no easy way to pop both
the C and Python stacks before calling the next function (it could be
partially done popping only the Python stack, though). Methinks tail calls
will have to wait for Stackless.
 

Chris King

Christopher said:
> I take that back. The intertwinement of C functions and Python calls on
> the stack makes this nearly impossible -- there's no easy way to pop both
> the C and Python stacks before calling the next function (it could be
> partially done popping only the Python stack, though). Methinks tail calls
> will have to wait for Stackless.

Here I am, replying to myself again :)

I'm trying a new implementation, making good use of a loop around the
entirety of PyEval_EvalFrame(). Unfortunately, my changes cause the
interpreter to segfault after a couple of tail calls. A debugger shows
that 0xffffffff is getting stuck into a pool->freeblock in
PyObject_Free(), and then is subsequently dereferenced in
PyObject_Malloc() (causing the segfault).

Could this be caused by my code Py_DECREFing an object too many times,
but leaving a pointer to it somewhere? (My changes don't explicitly set
anything to 0xffffffff or -1.) Or am I just in way over my head? :p
 

Tim Peters

[Chris King]
> I'm trying a new implementation, making good use of a loop around the
> entirety of PyEval_EvalFrame(). Unfortunately, my changes cause the
> interpreter to segfault after a couple of tail calls. A debugger shows
> that 0xffffffff is getting stuck into a pool->freeblock in
> PyObject_Free(), and then is subsequently dereferenced in
> PyObject_Malloc() (causing the segfault).
>
> Could this be caused by my code Py_DECREFing an object too many times,
> but leaving a pointer to it somewhere? (My changes don't explicitly set
> anything to 0xffffffff or -1.) Or am I just in way over my head? :p

When fiddling with Python internals, build Python in debug mode. That
enables many checks that will save you weeks of debugging. For
example, if you decref an object too many times, the expansion of the
Py_DECREF macro in a debug build will catch that and complain the
instant the refcount goes negative. In a debug build pymalloc also
pads both ends of allocated regions with special byte patterns,
initializes newly allocated memory with another special byte pattern,
and overwrites newly freed memory with a third special byte pattern.
That's very effective at catching many kinds of problems too. Read
Misc/SpecialBuilds.txt.
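For the record (a sketch): a debug interpreter is built by configuring CPython with --with-pydebug, and you can tell at runtime which build you are on because sys.gettotalrefcount exists only in debug builds.

import sys

if hasattr(sys, 'gettotalrefcount'):
    # Only present in a --with-pydebug build, along with the extra refcount
    # and pymalloc sanity checks described above.
    print('debug build; total refcount = %d' % sys.gettotalrefcount())
else:
    print('release build')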

And yes, of course you're in way over your head. But that's how you
learn to swim, so enjoy it <wink>.
 

Chris King

Tim said:
> When fiddling with Python internals, build Python in debug mode. That
> enables many checks that will save you weeks of debugging.

Thanks! I found the bug -- I was just forgetting to reset the stack
after a call_function.

I have a preliminary implementation ready, the patch is against 2.4a1:
http://users.wpi.edu/~squirrel/temp/tailcall.diff.gz

This patch only works for simple functions (i.e. those that can be
handled by fast_function), but extension to other function types should
be trivial. I haven't fully tested this with regards to exception
handling and whatnot, so I want to stick with a simple case for now.

The patch works roughly as follows:
1. When a CALL_FUNCTION opcode is encountered, check if the following
opcode is RETURN_VALUE. If so, execute the tail call code.
2. The tail call code calls a modified version of call_function that
sets up and returns a frame object for the function to be called (if
possible), rather than recursively calling PyEval_EvalFrame. This frame
is stored in a temporary variable, and a RETURN_VALUE is simulated to
exit the loop.
3. After cleaning up the current frame, PyEval_EvalFrame loops back up
to the top, now using the temporarily stored frame as the current frame.

Of course, instead of a loop, gcc's tail-call optimization feature could
be used, but this would be non-portable to other compilers.
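The control flow in steps 1-3 is essentially a trampoline. A rough pure-Python analogue of the same loop-instead-of-recursion idea (illustrative only -- the real patch juggles frame objects inside PyEval_EvalFrame, not thunks like these):

def trampoline(func, *args):
    # Keep "bouncing" on returned thunks instead of letting tail calls
    # grow the call stack.
    result = func(*args)
    while callable(result):      # a callable stands in for "the next frame to run"
        result = result()
    return result

def fact_tail(n, acc):
    if n:
        return lambda: fact_tail(n - 1, acc * n)   # the tail call, deferred
    return acc                                     # a non-callable result ends the loop

print(trampoline(fact_tail, 5000, 1))   # constant Python stack depth, huge result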

An example of the patch in action:

# Note default arguments aren't supported in the current patch
def fact2(n,v):
    if n:
        return fact2(n-1,v*n)
    else:
        return v

def fact(n):
    return fact2(n,1)

Without the patch: RuntimeError: maximum recursion depth exceeded

With the patch: <really really huge number>

Any feedback would be greatly appreciated!
 
