Bizarre behavior with mutable default arguments

Dennis Lee Bieber

Just because it's well known doesn't mean we shouldn't think about it.
For example, in the same list you linked, "3. Integer division" is
being fixed in py3k.
IMHO -- Py3K is /breaking/ integer division... as the division of
two integers will differ from what happens in all the other languages I
have used... All the others, to get a floating result from dividing two
integers requires one to explicitly convert at least one term to a float
first -- as I would do with the current Python. The forthcoming change
is going to require one to remember that if they want an integer result
from two integers, they must use a different operator instead.

But that is a lost argument for me...

In the situation of default arguments, I've only encountered one
other language that supports them, not multitudes of languages, so there
is no preponderance of results defining "expected behavior" to me.
--
Wulfraed Dennis Lee Bieber KD6MOG
HTTP://wlfraed.home.netcom.com/
(Bestiaria Support Staff: (e-mail address removed))
HTTP://www.bestiaria.com/
 
Steven D'Aprano

BTW, it's silly not to 'allow' globals when they're called for,
otherwise we wouldn't need the 'global' keyword.

Nobody argues against allowing global variables *when they're called
for*, just pointing out that ninety-nine times out of a hundred, people
use them when they're not called for and they are positively harmful.

And according to functional programmers, they're NEVER called for.
 
Steven D'Aprano

IMHO -- Py3K is /breaking/ integer division... as the division of
two integers will differ from what happens in all the other languages I
have used... All the others, to get a floating result from dividing two
integers requires one to explicitly convert at least one term to a float
first

You need to use more languages :)

Prolog uses / for division and // for integer division, just like Python.

Apple's Hypertalk (and derivatives) don't distinguish between integer and
floating point division. The / operator returns an integer result if the
floating point result happens to be an integer.

(e.g. 10.0/5.0 => 2 while 11.0/5.0 => 2.2)

I believe that JavaScript behaves the same way.

According to this table here:
http://msdn2.microsoft.com/en-us/library/2hxce09y.aspx

Visual Basic uses / for floating point division and \ for integer
division, and neither JScript nor Visual FoxPro offers integer
division at all.

No doubt there are others...


-- as I would do with the current Python. The forthcoming change
is going to require one to remember that if they want an integer result
from two integers, they must use a different operator instead.

How is that different from needing to remember to use a different
algorithm if you want a floating point result?
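
For anyone who wants to see the two operators side by side, here is a minimal sketch. It assumes a Python 3 interpreter (under Python 2 you would need `from __future__ import division` to get the same results); the variable names are just for illustration.

```python
# Python 3 semantics: / is true division, // is floor division.
true_quotient = 7 / 2      # 3.5
floor_quotient = 7 // 2    # 3
exact_quotient = 10 / 5    # 2.0 (a float, even though the result is whole)
negative_floor = -7 // 2   # -4 (floors toward negative infinity, no truncation)

print(true_quotient, floor_quotient, exact_quotient, negative_floor)
# prints: 3.5 3 2.0 -4
```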
 
bukzor

Nobody argues against allowing global variables *when they're called
for*, just pointing out that ninety-nine times out of a hundred, people
use them when they're not called for and they are positively harmful.

And according to functional programmers, they're NEVER called for.


I think you struck at the heart of the matter earlier when you noted
that this is the simplest way to declare a static variable in python.
Using the 'global' keyword is the other way, and is much more
explicit, and much more widely used. I also see this as the main use
of the 'notlocal' keyword to be introduced in py3k (it also fixes the
example given by Istvan above).

If the main value of this behavior is to declare a static variable, it
seems like an argument to create a more explicit syntax for static
variables. In the example above, the function only needed a static
integer, but created a one-length list instead because this quirk
doesn't work for immutable values.
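
To make the one-length-list point concrete, here is a rough sketch of the idiom next to the version that does not work. The function and parameter names (`next_id`, `broken_next_id`, `_counter`) are made up for illustration and do not come from the earlier example.

```python
def next_id(_counter=[0]):
    """Return 1, 2, 3, ... on successive calls.

    The one-element list is created once, when the def statement runs,
    and the same list object is shared by every call.
    """
    _counter[0] += 1
    return _counter[0]


def broken_next_id(_counter=0):
    # Rebinding an int parameter only changes the local name, so the
    # stored default stays 0 and every call returns 1.
    _counter += 1
    return _counter


print(next_id(), next_id(), next_id())       # prints: 1 2 3
print(broken_next_id(), broken_next_id())    # prints: 1 1
```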
 
Steven D'Aprano

I think you struck at the heart of the matter earlier when you noted
that this is the simplest way to declare a static variable in python.
Using the 'global' keyword is the other way, and is much more explicit,
and much more widely used. I also see this as the main use of the
'notlocal' keyword to be introduced in py3k (it also fixes the example
given by Istvan above).

There doesn't appear to be any reference to a "notlocal" keyword in
Python 3 that I can find. Have I missed something? It sounds like an
April Fool's gag to me. Do you have a reference to a PEP or other
official announcement?
 
Gabriel Genellina

On Mon, 31 Dec 2007 05:01:51 -0200, Steven D'Aprano wrote:
There doesn't appear to be any reference to a "notlocal" keyword in
Python 3 that I can find. Have I missed something? It sounds like an
April Fool's gag to me. Do you have a reference to a PEP or other
official announcement?

No, it's a real keyword in python 3, but it's spelled "nonlocal".
See http://www.python.org/dev/peps/pep-3104/
 
Odalrick

Note that the FAQ mainly explains *what* happens, not *why* was this
decision taken. Although it shows an example where "this feature can
be useful", it's neither the only way to do it nor is memoization as
common as wanting fresh default arguments on every call.

I'm surprised no one has said anything about the why of default
mutables. I think it is because it isn't easy to do it another way.

def some_function(an_integer=1, pointless_list=[],
                  random_function_value=random_function()):
    pass

To you and me it is obvious that this is an integer, a list and a
function call, but to python it is just 3 objects. Python'd have to
check each argument carefully to determine if it is mutable or not. Or
always copy each object, adding additional overhead to function calls,
and making passing arguments to functions expensive.

Even if these problems were solved, it would only make the problem
less common, not extinct.

# hypothetical
def another_function(still_alive=([],)):
    still_alive[0].append('spam')
    print(still_alive)

another_function()
(['spam'],)
another_function()
(['spam', 'spam'],)

(Could of course be solved by always making deep copies of all
arguments.)

While I would welcome making mutable defaults work differently, I
don't see any way to make such a change without making unacceptable
tradeoffs.
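
To get a feel for what "always making deep copies" would cost, here is a sketch of a decorator that does it by hand. `fresh_defaults` is a hypothetical helper, not anything in the standard library. It pays for a `copy.deepcopy` on every call, which is the overhead discussed above, and it is not thread-safe because it rewrites `__defaults__` in place.

```python
import copy
import functools


def fresh_defaults(func):
    """Hypothetical decorator: restore deep-copied defaults before each call."""
    original = func.__defaults__ or ()

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        # Every call starts from a pristine copy of the defaults that were
        # evaluated at definition time.
        func.__defaults__ = copy.deepcopy(original)
        return func(*args, **kwargs)

    return wrapper


@fresh_defaults
def another_function(still_alive=([],)):
    still_alive[0].append('spam')
    return still_alive


first = another_function()
second = another_function()
print(first, second)   # prints: (['spam'],) (['spam'],)
```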


--

Incidentally, I wrote a program a while back, with a bug caused by
mutable defaults. Never bothered to change it, it was the behaviour I
wanted, just not the one I thought I had implemented. -- Python, so
good even the bugs make the program better.
 
Arnaud Delobelle

I'm surprised no one has said anything about the why of default
mutables. I think it is because it isn't easy to do it another way.

[...]

There is an easy enough way: evaluate default values when the function
is called rather than when it is defined. This behaviour comes with
its own caveats as well I imagine, and it's not 'as easy' to implement
as the current one.

What's good about the current behaviour is that it is easy to reason
about (once you know what happens), even though you almost have to get
bitten once. But using this to get a static variable is extremely ugly
IMHO.
 
Odalrick

I'm surprised no one has said anything about the why of default
mutables. I think it is because it isn't easy to do it another way.

[...]

There is an easy enough way: evaluate default values when the function
is called rather than when it is defined. This behaviour comes with
its own caveats as well I imagine, and it's not 'as easy' to implement
as the current one.

That would add overhead to *all* function calls, even the ones without
mutable defaults. That doesn't sound like an attractive tradeoff.
 
Chris Mellon

On 30 Dec, 17:26, George Sakkis wrote:
On Dec 29, 9:14 pm, bukzor wrote:
It looks like Guido disagrees with me, so the discussion is closed.
Note that the FAQ mainly explains *what* happens, not *why* was this
decision taken. Although it shows an example where "this feature can
be useful", it's neither the only way to do it nor is memoization as
common as wanting fresh default arguments on every call.
I'm surprised no one has said anything about the why of default
mutables. I think it is because it isn't easy to do it another way.

[...]

There is an easy enough way: evaluate default values when the function
is called rather than when it is defined. This behaviour comes with
its own caveats as well I imagine, and it's not 'as easy' to implement
as the current one.

Adding overhead to *all* function calls, even the ones without mutable
defaults. That doesn't sound like an attractive tradeoff.

And also removing the only way you can currently do early binding in
Python. I agree that it's a gotcha, but unless someone comes up with
an answer to the following questions, I'll stick with the status quo
(Note that this is not blind Python group-think as a previous poster
implied, but a pragmatic decision that this is the most practical
solution):

a) If we don't evaluate default arguments at function compilation,
when do we do it?
b) If you do it at call time, how do you implement early binding?
c) If you want to introduce new syntax for the current behavior, what
is it and can you justify it?
d) What are the performance implications of your proposal versus the
current behavior?

Note that the desired behavior can be implemented under the current
behavior, at the expense of verbosity - using factories and sentinel
values as the default arguments, and then expanding them in the
function. It's not possible to implement the current behavior of
early-bound arguments if default arguments are evaluated with every
call. This alone is a good reason to keep the current behavior until
someone actually has a good alternative that covers the current use
cases and isn't just upset by the behavior.
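
As a rough illustration of the "factories and sentinel values" workaround, here is a minimal sketch. The names `_MISSING`, `append_item`, `make_record` and `tag_factory` are invented for the example; `None` is the usual sentinel, and a private object is only needed when `None` is itself a meaningful argument.

```python
_MISSING = object()  # hypothetical private sentinel, distinct from None


def append_item(item, bucket=None):
    # None stands in for "no list supplied"; a fresh list is built per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


def make_record(name, tags=_MISSING, tag_factory=list):
    # The factory itself is bound early; its result is created at call time.
    if tags is _MISSING:
        tags = tag_factory()
    return {"name": name, "tags": tags}


print(append_item('a'), append_item('b'))   # prints: ['a'] ['b']
print(make_record('x'))                      # prints: {'name': 'x', 'tags': []}
print(make_record('x', tags=None))           # prints: {'name': 'x', 'tags': None}
```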
 
NickC

I'm surprised no one has said anything about the why of default
mutables. I think it is because it isn't easy to do it another way.

[...]

There is an easy enough way: evaluate default values when the function
is called rather than when it is defined. This behaviour comes with
its own caveats as well I imagine, and it's not 'as easy' to implement
as the current one.

As Odalrick notes, there is no way to give different calls to a
function their own copies of mutable default arguments without
re-evaluating the defaults every time the function is called. The
horrendous performance implications mean that this simply isn't going
to happen. So the status quo, where the defaults are calculated once
when the function is defined and the result cached in the function
object, is unlikely to change.
What's good about the current behaviour is that it is easy to reason
about (once you know what happens), even though you almost have to get
bitten once. But using this to get a static variable is extremely ugly
IMHO.

The only thing it doesn't give you is a static variable that isn't
visible to the caller. Py3k's keyword-only arguments (PEP 3102) will
make those cases a little tidier, since it won't be possible to
accidentally replace the static variables by providing too many
positional arguments.

I believe the suggestion of permitting static variables after the **
entry in a function's parameter list was raised during the PEP 3102
discussions, but never gained much traction over a '_cache={}'
keyword-only argument approach (and the latter has the distinct
advantage of being *much* easier to test, since you can override the
cache from the test code to ensure it is being handled correctly).
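
Here is roughly what the `_cache={}` keyword-only approach could look like, assuming Python 3 syntax from PEP 3102. The function `square` is a made-up stand-in for something expensive.

```python
def square(n, *, _cache={}):
    """Memoised square; _cache is keyword-only, so positional args can't hit it."""
    if n not in _cache:
        _cache[n] = n * n
    return _cache[n]


print(square(4))                     # prints: 16

# Test code can supply its own cache to check how it is used.
injected = {4: "from test cache"}
print(square(4, _cache=injected))    # prints: from test cache

try:
    square(4, {})                    # a stray positional argument is rejected
except TypeError as exc:
    print("rejected:", type(exc).__name__)   # prints: rejected: TypeError
```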

Cheers,
Nick.
 
Arnaud Delobelle

I'm surprised no one has said anything about the why of default
mutables. I think it is because it isn't easy to do it another way.

There is an easy enough way: evaluate default values when the function
is called rather than when it is defined.  This behaviour comes with
its own caveats as well I imagine, and it's not 'as easy' to implement
as the current one.

As Odalrick notes, there is no way to give different calls to a
function their own copies of mutable default arguments without
re-evaluating the defaults every time the function is called. The
horrendous performance implications mean that this simply isn't going
to happen. So the status quo, where the defaults are calculated once
when the function is defined and the result cached in the function
object, is unlikely to change.

I'm in no way advocating a change, in fact I wouldn't like things to
change. I was just saying that it was not difficult (technically) to
alter the behaviour, but that this change wouldn't be desirable
because it would make code more difficult to reason about. OTOH a very
common idiom in python is

def foo(x, y, z=None):
    if z is None:
        z = ['a', 'mutable', 'object']
    # stuff that foo does

This is the current way to say "I want the default value of z to be
reevaluated each time it is used". I use this much more often than

def bar(x, y, z=ExpensiveImmutableCreation()):
    pass  # stuff that bar does

So I'm not so convinced with the performance argument at face value
(though it's probably pertinent:)
The only thing it doesn't give you is a static variable that isn't
visible to the caller. Py3k's keyword-only arguments (PEP 3102) will
make those cases a little tidier, since it won't be possible to
accidentally replace the static variables by providing too many
positional arguments.

I was always a bit puzzled by this PEP. If this is one of the
underlying reasons for it, then I am even more puzzled.
I believe the suggestion of permitting static variables after the **
entry in a function's parameter list was raised during the PEP 3102
discussions, but never gained much traction over a '_cache={}'
keyword-only argument approach (and the latter has the distinct
advantage of being *much* easier to test, since you can override the
cache from the test code to ensure it is being handled correctly).

Well I'm glad that didn't go through, argument lists in function
definitions are complicated enough already!
 
bukzor

Here's the answer to the question: http://www.python.org/doc/faq/general/#why-are-default-values-shared-...
It looks like Guido disagrees with me, so the discussion is closed.
Note that the FAQ mainly explains *what* happens, not *why* was this
decision taken. Although it shows an example where "this feature can
be useful", it's neither the only way to do it nor is memoization as
common as wanting fresh default arguments on every call.
I'm surprised no one has said anything about the why of default
mutables. I think it is because it isn't easy to do it another way.
[...]
There is an easy enough way: evaluate default values when the function
is called rather than when it is defined. This behaviour comes with
its own caveats as well I imagine, and it's not 'as easy' to implement
as the current one.
Adding overhead to *all* function calls, even the ones without mutable
defaults. That doesn't sound like an attractive tradeoff.

And also removing the only way you can currently do early binding in
Python. I agree that it's a gotcha, but unless someone comes up with
an answer to the following questions, I'll stick with the status quo
(Note that this is not blind Python group-think as a previous poster
implied, but a pragmatic decision that this is the most practical
solution):

a) If we don't evaluate default arguments at function compilation,
when do we do it?
b) If you do it at call time, how do you implement early binding?
c) If you want to introduce new syntax for the current behavior, what
is it and can you justify it?
d) What are the performance implications of your proposal versus the
current behavior?

Note that the desired behavior can be implemented under the current
behavior, at the expense of verbosity - using factories and sentinel
values as the default arguments, and then expanding them in the
function. It's not possible to implement the current behavior of
early-bound arguments if default arguments are evaluated with every
call. This alone is a good reason to keep the current behavior until
someone actually has a good alternative that covers the current use
cases and isn't just upset by the behavior.

I'm confused by what you mean by 'early binding'. Can you give a
quick-n-dirty example?

Thanks,
--Buck
 
bukzor


Is an 'early bound' variable synonymous with a 'static' variable (in
C)?
 
Gabriel Genellina

Is an 'early bound' variable synonymous with a 'static' variable (in
C)?

No. It refers to the moment at which a name gets its value. Usually
Python does "late binding"; that is, names are resolved at the time the
code is executed, not when it's compiled or defined.
Consider this example:

z = 1
def foo(a):
    print(a + z)
foo(3) # prints 4
z = 20
foo(3) # prints 23

The second time it prints 23, not 4, because the value of z is looked up
when the code is executed, so the relevant value for z is 20.
Note that if you later assign a non-numeric value to z, foo(3) will fail.

If you want to achieve the effect of "early binding", that is, you want to
"freeze" z to be always what it was at the time the function was defined,
you can do that using a default argument:

z = 1
def foo(a, z=z):
    print(a + z)
z = None
foo(3) # prints 4

This way, foo(3) will always print 4, regardless of the current value
of z. Moreover, you can `del z` and foo will continue to work.

This is what I think Chris Mellon was referring to. These specific default
argument semantics allow one to achieve the effect of "early binding" in
a language which is mostly "late binding". If someone changes this, he has
to come up with another way of faking early binding semantics at least as
simple as this, else we're solving a [nonexistent for me] problem but
creating another one.
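
A place where this difference bites in practice is closures created in a loop. The sketch below is only an illustration (the names `late` and `early` are made up), but it shows the default-argument trick freezing each value at definition time.

```python
# Late binding: every lambda looks up i when it is called, after the loop ended.
late = [lambda: i for i in range(3)]

# Early binding: the default i=i is evaluated when each lambda is defined.
early = [lambda i=i: i for i in range(3)]

print([f() for f in late])    # prints: [2, 2, 2]
print([f() for f in early])   # prints: [0, 1, 2]
```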
 
Arnaud Delobelle


Let me say again that I believe the current behaviour to be the
correct one. But I don't think this 'early binding' is critical for
this sort of example. There are lots of ways to solve the problem of
having persistent state across function calls, for example:

* using classes
* using modules
* or simply nested functions:

def getfoo(z):
    def foo(a):
        print(a + z)
    return foo

4

And with nonlocal, we could even modify z inside foo and this change
would persist across calls. This will be a much cleaner solution than
the current def bar(x, y, _hidden=[startvalue]).
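
For comparison, here is a small sketch of what the nonlocal version might look like, assuming Python 3 (PEP 3104). `make_counter` and `counter` are illustrative names, not part of the examples above.

```python
def make_counter(start=0):
    count = start

    def counter():
        # nonlocal lets the inner function rebind the enclosing variable,
        # so the state persists across calls without a mutable default.
        nonlocal count
        count += 1
        return count

    return counter


counter = make_counter()
print(counter(), counter(), counter())   # prints: 1 2 3
```

Each call to `make_counter` gets its own independent `count`, which the hidden-default trick cannot offer.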

Also, note that it's easy to implement default arguments in pure
python-without-default-arguments using a decorator:

def default(**defaults):
    defaults = defaults.items()
    def decorator(f):
        def decorated(*args, **kwargs):
            for name, val in defaults:
                kwargs.setdefault(name, val)
            return f(*args, **kwargs)
        return decorated
    return decorator

Here is your example:
... def foo(a, z):
...     print(a + z)
...
4

Another example, using mutables:
... def bar(x, history):
...     history.append(x)
...     return list(history)
...
[['s'], ['s', 'p'], ['s', 'p', 'a'], ['s', 'p', 'a', 'm']]
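
The interpreter sessions above have lost their decorator and call lines, so here is a self-contained sketch that reproduces the same outputs. The arguments `@default(z=1)` and `@default(history=[])` and the two calls at the end are my assumptions about what was typed, not a recovery of the original session; `foo` returns its result here instead of printing it so it is easier to check.

```python
def default(**defaults):
    # Snapshot the (name, value) pairs once; the values are shared by all calls.
    defaults = list(defaults.items())

    def decorator(f):
        def decorated(*args, **kwargs):
            for name, val in defaults:
                kwargs.setdefault(name, val)
            return f(*args, **kwargs)
        return decorated
    return decorator


@default(z=1)              # assumed decorator argument
def foo(a, z):
    return a + z


@default(history=[])       # assumed decorator argument
def bar(x, history):
    history.append(x)
    return list(history)


print(foo(3))                        # prints: 4
print([bar(c) for c in "spam"])
# prints: [['s'], ['s', 'p'], ['s', 'p', 'a'], ['s', 'p', 'a', 'm']]
```

Note that this sketch only works when the defaulted parameters are left out or passed by keyword; passing `z` positionally raises a TypeError because the decorator also supplies it as a keyword.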
 
Ali

In the absence of a better solution, I'm very comfortable with keeping
the behaviour as is. Unfortunately, there's no good solution in Python to
providing functions with local storage that persists across calls to the
function:

....

(4) Instances handle this pretty well, just s/functions/methods/.
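
A minimal sketch of the instance-based approach, assuming a callable class is acceptable in place of a plain function. `History` is an invented name that mirrors the `bar` example earlier in the thread.

```python
class History:
    """Callable object whose state persists across calls."""

    def __init__(self):
        self.items = []

    def __call__(self, x):
        self.items.append(x)
        return list(self.items)


bar = History()
print([bar(c) for c in "spam"])
# prints: [['s'], ['s', 'p'], ['s', 'p', 'a'], ['s', 'p', 'a', 'm']]
```

Unlike the hidden-default trick, the state is visible as `bar.items` and each `History()` instance keeps its own copy.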
 
