Why are functions atomic?

Duncan Booth

What measurements show you that...?
....

brain:~ alex$ python -mtimeit -s'import powi; p=powi.powerfactory1(3)' 'p(27)'
1000000 loops, best of 3: 0.485 usec per loop

brain:~ alex$ python -mtimeit -s'import powi; p=powi.powerfactory2(3)' 'p(27)'
1000000 loops, best of 3: 0.482 usec per loop

Your own benchmark seems to support Michael's assertion although the
difference in performance is so slight that it is unlikely ever to
outweigh the loss in readability.

Modifying powi.py to reduce the weight of the function call overhead and
the exponent operation indicates that using default arguments is faster,
but you have to push it to quite an extreme case before it becomes
significant:

def powerfactory1(exponent, plus):
    def inner(x):
        for i in range(1000):
            res = x+exponent+plus
        return res
    return inner

def powerfactory2(exponent, plus):
    def inner(x, exponent=exponent, plus=plus):
        for i in range(1000):
            res = x+exponent+plus
        return res
    return inner

C:\Temp>\python25\python -mtimeit -s "import powi; p=powi.powerfactory1(3,999)" "p(27)"
10000 loops, best of 3: 159 usec per loop

C:\Temp>\python25\python -mtimeit -s "import powi; p=powi.powerfactory2(3,999)" "p(27)"
10000 loops, best of 3: 129 usec per loop
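
For anyone who wants to repeat the comparison without a separate powi.py, here is a minimal sketch using the standard timeit module. It is written in Python 3 syntax (the runs above used Python 2.5), and the call and repeat counts are arbitrary assumptions. Absolute numbers will differ by machine and interpreter version, so treat the output as indicative only.

```python
import timeit

def powerfactory1(exponent, plus):
    # Closure version: exponent and plus are free variables of inner.
    def inner(x):
        for i in range(1000):
            res = x + exponent + plus
        return res
    return inner

def powerfactory2(exponent, plus):
    # Default-argument version: exponent and plus are ordinary locals.
    def inner(x, exponent=exponent, plus=plus):
        for i in range(1000):
            res = x + exponent + plus
        return res
    return inner

p1 = powerfactory1(3, 999)
p2 = powerfactory2(3, 999)

# Best of 3 runs of 1000 calls each, converted to seconds per call.
t1 = min(timeit.repeat(lambda: p1(27), number=1000, repeat=3)) / 1000
t2 = min(timeit.repeat(lambda: p2(27), number=1000, repeat=3)) / 1000
print("closure:  %.1f usec per call" % (t1 * 1e6))
print("defaults: %.1f usec per call" % (t2 * 1e6))
```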
 
Michael

Is there a reason for using the closure here? Using function defaults
seems to give better performance:[...]

It does? Not as far as I can measure it to any significant degree on my
computer.

I agree the performance gains are minimal. Using function defaults
rather than closures, however, seemed much cleaner and more explicit to
me. For example, I have been bitten by the following before:

>>> def f(x):
...     def g():
...         x = x + 1
...         return x
...     return g
...
>>> g = f(3)
>>> g()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in g
UnboundLocalError: local variable 'x' referenced before assignment

If you use default arguments, this works as expected:

>>> def f(x):
...     def g(x=x):
...         x = x + 1
...         return x
...     return g
...
>>> g = f(3)
>>> g()
4

The fact that there also seems to be a performance gain (granted, it
is extremely slight here) led me to ask if there was any advantage to
using closures. It seems not.

An overriding theme in this thread is that you are greatly concerned
with the speed of your solution rather than the structure and
readability of your code.

Yes, it probably does seem that way, because I am burying this code
deeply and do not want to revisit it when profiling later, but my
overriding concern is reliability and ease of use. Using function
attributes seemed the best way to achieve both goals until I found out
that the pythonic way of copying functions failed. Here was how I
wanted my code to work:

@define_options(first_option='abs_tol')
def step(f,x,J,abs_tol=1e-12,rel_tol=1e-8,**kwargs):
    """Take a step to minimize f(x) using the jacobian J.
    Return (new_x,converged) where converged is true if the tolerance
    has been met.
    """
    <compute dx and check convergence>
    return (x + dx, converged)

@define_options(first_option='min_h')
def jacobian(f,x,min_h=1e-6,max_h=0.1):
    """Compute jacobian using a step min_h < h < max_h."""
    <compute J>
    return J

class Minimizer(object):
    """Object to minimize a function."""
    def __init__(self,step,jacobian,**kwargs):
        # Stored as _options so the read-only property below can expose it.
        self._options = step.options + jacobian.options
        self.step = step
        self.jacobian = jacobian

    def minimize(self,f,x0,**kwargs):
        """Minimize the function f(x) starting at x0."""
        step = self.step
        jacobian = self.jacobian

        step.set_options(**kwargs)
        jacobian.set_options(**kwargs)

        x = x0
        converged = False
        while not converged:
            J = jacobian(f,x)
            (x,converged) = step(f,x,J)

        return x

    @property
    def options(self):
        """List of supported options."""
        return self._options

The idea is that one can define different functions for computing the
jacobian, step etc. that take various parameters, and then make a
custom minimizer class that can provide the user with information
about the supported options etc.

The question is how to define the decorator define_options?

1) I thought the cleanest solution was to add a method f.set_options()
which would set f.func_defaults, and a list f.options for
documentation purposes. The docstring remains unmodified without any
special "wrapping", step and jacobian are still "functions" and
performance is optimal.

2) One could return an instance f of a class with f.__call__,
f.options and f.set_options defined. This would probably be the most
appropriate OO solution, but it makes the decorator much more messy,
or requires the user to define classes rather than simply define the
functions as above. In addition, this is at least 2.5 times slower
on my machine than option 1) because of the class
instance overhead. (This is my only real performance concern because
this is quite a large factor. Otherwise I would just use this
method.)

3) I could pass generators to Minimizer and construct the functions
dynamically. This would have the same performance, but would require
the user to define generators, or require the decorator to return a
generator when the user appears to be defining a function. This just
seems much less elegant.

....
@define_options_generator(first_option='min_h')
def jacobian_gen(f,x,min_h=1e-6,max_h=0.1):
    """Compute jacobian using a step min_h < h < max_h."""
    <compute J>
    return J

class Minimizer(object):
    """Object to minimize a function."""
    def __init__(self,step_gen,jacobian_gen,**kwargs):
        self.options = step_gen.options + jacobian_gen.options
        self.step_gen = step_gen
        self.jacobian_gen = jacobian_gen

    def minimize(self,f,x0,**kwargs):
        """Minimize the function f(x) starting at x0."""
        step = self.step_gen(**kwargs)
        jacobian = self.jacobian_gen(**kwargs)

        x = x0
        converged = False
        while not converged:
            J = jacobian(f,x)
            (x,converged) = step(f,x,J)

        return x
...

4) Maybe there is a better, cleaner way to do this, but I thought that
my option 1) was the most clear, readable and fast. I would
appreciate any suggestions. The only problem is that it does use
mutable functions, and so the user might be tempted to try:

new_step = copy(step)

which would fail (because modifying new_step would also modify step).
I guess that this is a pretty big problem (I could provide a custom
copy function so that

new_step = step.copy()

would work) and I wondered if there was a better solution (or if maybe
copy.py should be fixed. Checking for a defined __copy__ method
*before* checking for pre-defined mutable types does not seem to break
anything.)
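
To make option 1) and the copy problem concrete, here is one possible shape for the decorator, as a minimal sketch in Python 3 syntax, where func_defaults is spelled __defaults__. The decorator body, the copy_function helper and the stand-in jacobian are all assumptions for illustration; the thread does not show the real implementation. Keywords that are not options of a given function are silently ignored here so that one kwargs dict can be offered to both step and jacobian, which is a design guess rather than a stated requirement.

```python
import copy
import types

def define_options(first_option):
    """Hypothetical decorator: arguments from first_option onward are options."""
    def decorate(f):
        argnames = f.__code__.co_varnames[:f.__code__.co_argcount]
        n_required = len(argnames) - len(f.__defaults__ or ())
        start = argnames.index(first_option)
        if start < n_required:
            raise ValueError("every option needs a default value")
        f.options = list(argnames[start:])

        def set_options(**kwargs):
            # Rebuild the defaults tuple; keywords that are not options
            # of this function are ignored (an assumption of this sketch).
            defaults = list(f.__defaults__)
            for name, value in kwargs.items():
                if name in f.options:
                    defaults[argnames.index(name) - n_required] = value
            f.__defaults__ = tuple(defaults)

        f.set_options = set_options
        return f
    return decorate

def copy_function(f):
    """Hypothetical helper: build an independent copy of a decorated function."""
    g = types.FunctionType(f.__code__, f.__globals__, f.__name__,
                           f.__defaults__, f.__closure__)
    g.__doc__ = f.__doc__
    # Re-decorate so that g.set_options changes g's defaults, not f's.
    return define_options(f.options[0])(g)

@define_options(first_option='min_h')
def jacobian(f, x, min_h=1e-6, max_h=0.1):
    """Stand-in that just reports the step bounds it would use."""
    return (min_h, max_h)

same = copy.copy(jacobian)              # copy.copy treats functions as atomic
new_jacobian = copy_function(jacobian)  # explicit, independent copy
new_jacobian.set_options(max_h=0.5)

print(same is jacobian)                 # True
print(jacobian(None, 0.0))              # (1e-06, 0.1)
print(new_jacobian(None, 0.0))          # (1e-06, 0.5)
```

The docstring and the plain-function call path are untouched, which is the property option 1) is after. The price is that the options live in mutable function state, so any copy has to be made explicitly.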

Thanks again everyone for your suggestions, they are really helping me
learn about Python idioms.

Michael.
 
Dustan

Is there a reason for using the closure here? Using function defaults
seems to give better performance:[...]
It does? Not as far as I can measure it to any significant degree on my
computer.

I agree the performance gains are minimal. Using function defaults
rather than closures, however, seemed much cleaner and more explicit to
me. For example, I have been bitten by the following before:

>>> def f(x):
...     def g():
...         x = x + 1
...         return x
...     return g
...
>>> g = f(3)
>>> g()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in g
UnboundLocalError: local variable 'x' referenced before assignment

If you use default arguments, this works as expected:

>>> def f(x):
...     def g(x=x):
...         x = x + 1
...         return x
...     return g
...
>>> g = f(3)
>>> g()
4

4
4

 
Chris Mellon

Is there a reason for using the closure here? Using function defaults
seems to give better performance:[...]

It does? Not as far as I can measure it to any significant degree on my
computer.

I agree the performance gains are minimal. Using function defaults
rather than closures, however, seemed much cleaner and more explicit to
me. For example, I have been bitten by the following before:

>>> def f(x):
...     def g():
...         x = x + 1
...         return x
...     return g
...
>>> g = f(3)
>>> g()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "<stdin>", line 3, in g
UnboundLocalError: local variable 'x' referenced before assignment

If you use default arguments, this works as expected:

>>> def f(x):
...     def g(x=x):
...         x = x + 1
...         return x
...     return g
...
>>> g = f(3)
>>> g()
4

You aren't getting "bit" by any problem with closures - this is a
syntax problem.
Assigning to x within the scope of g() makes x a local variable. x =
x+1 doesn't work, then, because "x+1" can't be evaluated.

If you just used "return x+1" or if you named the local variable in g
something other than "x", the closure approach would also work as
expected.
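
The three cases described above can be put side by side. This is a minimal sketch in Python 3 syntax; the function names f_broken, f_return and f_renamed are made up for illustration, and the exact wording of the error message varies between Python versions.

```python
def f_broken(x):
    def g():
        x = x + 1      # the assignment makes x local to g, so the read fails
        return x
    return g

def f_return(x):
    def g():
        return x + 1   # x is only read, so the enclosing x is used
    return g

def f_renamed(x):
    def g():
        y = x + 1      # assigning to a different name leaves x free
        return y
    return g

try:
    f_broken(3)()
except UnboundLocalError as exc:
    print("f_broken raised UnboundLocalError:", exc)

print(f_return(3)())   # 4
print(f_renamed(3)())  # 4
```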
 
John Nagle

I agree the performance gains are minimal. Using function defaults
rather than closures, however, seemed much cleaner and more explicit to
me. For example, I have been bitten by the following before:

...     def g():
...         x = x + 1

Too cute. Don't nest functions in Python; the scoping model
isn't really designed for it.
....

@define_options(first_option='abs_tol')
def step(f,x,J,abs_tol=1e-12,rel_tol=1e-8,**kwargs):
    """Take a step to minimize f(x) using the jacobian J.
    Return (new_x,converged) where converged is true if the tolerance
    has been met.
    """

Python probably isn't the right language for N-dimensional optimization
if performance is a major concern. That's a very compute-intensive operation.
I've done it in C++, with heavy use of inlines, and had to work hard to
get the performance up. (I was one of the first to do physics engines for
games and animation, which is a rather compute-intensive problem.)

If you're doing number-crunching in Python, it's essential to use
NumPy or some other C library for matrix operations, or it's going to
take way too long.

John Nagle
 
Michael

You aren't getting "bit" by any problem with closures - this is a
syntax problem.

I understand that it is not closures that are specifically biting me.
However, I got bit, it was unpleasant and I don't want to be bit
again;-)

Thus, whenever I need to pass information to a function, I use default
arguments now. Is there any reason not to do this other than the fact
that it is a bit more typing?

Michael
 
Michael

Too cute. Don't nest functions in Python; the scoping model
isn't really designed for it.

How can you make generators then if you don't nest?

Python probably isn't the right language for N-dimensional optimization
if performance is a major concern. That's a very compute-intensive operation.
I've done it in C++, with heavy use of inlines, and had to work hard to
get the performance up. (I was one of the first to do physics engines for
games and animation, which is a rather compute-intensive problem.)

If you're doing number-crunching in Python, it's essential to use
NumPy or some other C library for matrix operations, or it's going to
take way too long.

I know. I am trying to flesh out a modular optimization proposal for
SciPy. Using C++ would defeat the purpose of making it easy to extend
the optimizers. I was just trying to make things as clean and efficient
as possible when I stumbled on this Python copy problem.

Michael.
 
Chris Mellon

I understand that it is not closures that are specifically biting me.
However, I got bit, it was unpleasant and I don't want to be bit
again;-)

Thus, whenever I need to pass information to a function, I use default
arguments now. Is there any reason not to do this other than the fact
that it is a bit more typing?

There are different semantics when the thing you're passing is
mutable. There's also different semantics when it's rebound within the
calling scope, but then the default argument technique is probably
what you want.
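
A small illustration of the rebinding difference, as a sketch in Python 3 syntax with made-up function names. A default argument is evaluated once, when the def statement runs, while a closure looks the name up each time the inner function is called. For a mutable object, both forms refer to the same object, so in-place changes are visible either way.

```python
def make_closure():
    n = 1
    def get():
        return n          # n is looked up each time get() runs
    n = 2                 # rebinding after the def is seen by the closure
    return get

def make_default():
    n = 1
    def get(n=n):
        return n          # the default was evaluated once, when def ran
    n = 2                 # rebinding does not touch the stored default
    return get

def make_shared():
    items = []
    def via_closure():
        return items
    def via_default(items=items):
        return items
    items.append("a")     # in-place mutation is visible through both
    return via_closure, via_default

print(make_closure()())   # 2
print(make_default()())   # 1
c, d = make_shared()
print(c() is d())         # True
```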
 
Chris Mellon

How can you make generators then if you don't nest?

There's all kinds of good reasons to nest functions, and the "scoping
model isn't really designed for it" somewhat overstates the case -
it's not relevant to many of the reasons you might nest functions, and
it's not (much) of a problem for the rest of them. What you can't do
is rebind values in the enclosing scope, unless the enclosing scope is
global. That's a real, but fairly minor, limitation and you'll be able
to explicitly address your enclosing scope in 3k (or perhaps sooner).
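
For reference, Python 3 did add a nonlocal statement for this purpose. The following is a minimal sketch and runs on Python 3 only; make_counter is a made-up name for illustration.

```python
def make_counter(start):
    count = start
    def increment():
        nonlocal count     # rebind the variable in the enclosing scope
        count = count + 1
        return count
    return increment

g = make_counter(3)
print(g(), g(), g())       # 4 5 6
```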
 
Michael


Okay, so it is a bad design, but it illustrates the point. What is
happening is that in the body of the function f, a new function is
defined using the value of x passed as an argument to f. Thus, after
the call g = f(3), the function returned by f is equivalent to

def g(x=3):
    x = x + 1
    return x

This function is returned, so the call g() uses the default argument
x=3, then computes x = x+1 = 3+1 = 4 and returns 4. Every call is
equivalent to g() == g(3) == 4. Inside g, x is a local variable: it
does not maintain state between function calls. (You might think that
the first example would allow you to mutate the x in the closure, but
this is dangerous and exactly what python is trying to prevent by
making x a local variable when you make assignments in g. This is why
the interpreter complains.)

If you actually want to maintain state, you have to use a mutable
object like a list. The following would do what you seem to expect.
>>> def f(x0):
...     def g(x=[x0]):
...         x[0] = x[0] + 1
...         return x[0]
...     return g
...
[...]
6
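
Since the calls in the transcript above did not survive, here is a self-contained version of the same idea in Python 3 syntax. The starting value 3 and the number of calls are assumptions chosen for illustration, not a reconstruction of the original session.

```python
def f(x0):
    # The default list is created once, when this def statement runs,
    # and is then shared by every call to the resulting g.
    def g(x=[x0]):
        x[0] = x[0] + 1
        return x[0]
    return g

g = f(3)
print(g())   # 4
print(g())   # 5
print(g())   # 6

h = f(3)     # a new call to f creates a new list
print(h())   # 4
```

The state lives in the default list, so a caller who passes an explicit argument, for example g([10]), bypasses the stored state entirely.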
 
Alex Martelli

Michael said:
Thus, whenever I need to pass information to a function, I use default
arguments now. Is there any reason not to do this other than the fact
that it is a bit more typing?

You're giving your functions a signature that's different from the one
you expect them to be called with, and so making it impossible for the
Python runtime to diagnose certain errors on the caller's part.

For example, consider:

def makecounter_good():
    counts = {}
    def count(item):
        result = counts[item] = 1 + counts.get(item, 0)
        return result
    return count

c = makecounter_good()
for i in range(3): print c(23)

def makecounter_hmmm():
    counts = {}
    def count(item, counts=counts):
        result = counts[item] = 1 + counts.get(item, 0)
        return result
    return count

cc = makecounter_hmmm()
for i in range(3): print cc(23)

print cc(23, {})

print c(23, {})


Counters made by makecounter_good take exactly one argument, and
properly raise exceptions if incorrectly called with two; counters made
by makecounter_hmmm take two arguments (of which one is optional), and
thus hide some runtime call errors.

From "import this":
"""
Errors should never pass silently.
Unless explicitly silenced.
"""

The minuscule "optimization" of giving a function an argument it's not
_meant_ to have somewhat breaks this part of the "Zen of Python", and
thus I consider it somewhat unclean.


Alex
 
Michael

You're giving your functions a signature that's different from the one
you expect them to be called with, and so making it impossible for the
Python runtime to diagnose certain errors on the caller's part. ....
The minuscule "optimization" of giving a function an argument it's not
_meant_ to have somewhat breaks this part of the "Zen of Python", and
thus I consider it somewhat unclean.

That is a pretty good reason in some contexts. Usually, the arguments
I pass are values that the user might like to change, so the kwarg
method often serves an explicit purpose allowing parameters to be
modified, but I can easily imagine cases where the extra arguments
should really not be there. I still like explicitly stating the
dependencies of a function, but I suppose I could do that with
decorators.

Thanks,
Michael.
 
