mutable default parameter problem [Prothon]

Mark Hahn

As we are addressing the "warts" in Python to be fixed in Prothon, we have
come upon the mutable default parameter problem. For those unfamiliar with
the problem, it can be seen in this Prothon code sample, where newbies expect
the two function calls below to both print [ 1 ]:

def f( list=[ ] ):
    print list.append!(1)

f() # prints [ 1 ]
f() # prints [ 1, 1 ]

It is more than just a newbie problem. Even experts find themselves having
to do things like this which is a waste of programming effort:

def f( list = None ):
    if list == None: list = [ ]
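
For readers following along in Python rather than Prothon, here is a minimal sketch of the same behavior and of the None workaround. The names append_shared and append_fresh are illustrative only and do not come from the post.

```python
def append_shared(items=[]):
    # The default list is created once, when the def statement runs,
    # so every call that omits `items` mutates the same object.
    items.append(1)
    return items


def append_fresh(items=None):
    # Workaround: use None as a sentinel and build a new list per call.
    if items is None:
        items = []
    items.append(1)
    return items


first = list(append_shared())    # copy, so the next call does not alter it
second = list(append_shared())
print(first, second)             # [1] [1, 1]
print(append_fresh(), append_fresh())  # [1] [1]
```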

We have three proposals in the Prothon mailing list right now to fix this.
I'd like to bounce these off of the Python list also since this will
possibly make a big difference in Python code ported over to Prothon and we
can always use more advice.

1) Only allow immutable objects as default values for formal parameters. In
Prothon immutable objects are well-defined since they have an immutable flag
that write-protects them. This (as the other solutions below) would only
solve the problem in a shallow way as one could still have something like a
tuple of mutable objects and see the problem at a deeper level. If the
newbie is going to be dealing with something this complex though then they
are dealing with the overall problem of references versus copies and that is
a bigger overall issue.

2) Evaluate the default expression once at each call time when the default
value is needed. The default expression would be evaluated in the context
of the function definition (like a closure).

3) Evaluate the expression at definition time as it is done now, but at call
time do a defaultValue.copy() operation. This would be a shallow copy so
again it would be a shallow solution.
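
As a rough illustration only, proposals 2 and 3 can be emulated in present-day Python with decorators. This is a sketch under stated assumptions, not how Prothon would implement either one: the decorator names late_defaults and copy_defaults are hypothetical, and proposal 1 is left out because Python objects carry no immutable flag to test.

```python
import copy
import functools
import inspect


def late_defaults(**thunks):
    """Proposal 2 (emulated): re-evaluate each default expression per call."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            bound = sig.bind_partial(*args, **kwargs)
            for name, thunk in thunks.items():
                if name not in bound.arguments:
                    # Run the default expression now, at call time.
                    bound.arguments[name] = thunk()
            return func(*bound.args, **bound.kwargs)
        return wrapper
    return decorate


def copy_defaults(func):
    """Proposal 3 (emulated): evaluate defaults once, shallow-copy per call."""
    sig = inspect.signature(func)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        for name, param in sig.parameters.items():
            # Only fill in parameters the caller left out.
            if name not in bound.arguments and param.default is not inspect.Parameter.empty:
                bound.arguments[name] = copy.copy(param.default)
        return func(*bound.args, **bound.kwargs)
    return wrapper


@late_defaults(items=lambda: [])
def f_late(items):
    items.append(1)
    return items


@copy_defaults
def f_copy(items=[]):
    items.append(1)
    return items


print(f_late(), f_late())  # [1] [1]
print(f_copy(), f_copy())  # [1] [1]
```

In both emulations a caller-supplied list is still mutated in place; only the default changes behavior. As the post says of the real proposals, the copy is shallow, so a default such as a tuple of lists would still share its inner lists.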

Choice 2 is my favorite in that it matches the dynamic nature of Prothon,
but it is an expensive solution. Choice 1 is the least expensive solution
but it is limiting to the user. Choice 1 does not help the second code
sample above. Choice 3 is a good compromise since an object.copy() is
pretty fast in Prothon.

Comments? How much Python code would these different proposals break?
 
Marcin 'Qrczak' Kowalczyk

Choice 2 is my favorite in that it matches the dynamic nature of Prothon,
but it is an expensive solution. Choice 1 is the least expensive solution
but it is limiting to the user. Choice 1 does not help the second code
sample above. Choice 3 is a good compromise since an object.copy() is
pretty fast in Prothon.

Comments? How much Python code would these different proposals break?

I like 2 the most. Well, actually I like only 2 :)

I'm not sure why it would be expensive, it's a pity if it's expensive,
but it should be appropriate for most cases and it's easy to understand.
 
Troy Melhase

As we are addressing the "warts" in Python to be fixed in Prothon, we have
come upon the mutable default parameter problem. For those unfamiliar with
the problem, it can be seen in this Prothon code sample where newbies expect
the two function calls below to both print [ 1 ] :

We have three proposals in the Prothon mailing list right now to fix this.
I'd like to bounce these off of the Python list also since this will
possibly make a big difference in Python code ported over to Prothon and we
can always use more advice.

Here's an idea: if it ain't broke, don't fix it.

Seriously, you see a "wart" and a "problem". I see a pleasant side-effect of
the documented semantics. True, new folks are surprised by the behavior, but
once it's understood, it becomes more powerful.

How do you intend to account for code like this:

def F(a, b, cache={}):
    try:
        return cache[(a,b)]
    except (IndexError, ):
        value = cache[(a,b)] = do_some_long_calc(a,b)
        return value

Or even this:

shared_cache = {}

def F(a, b, cache=shared_cache):
    ...

Of course you can argue that this is bad style, but the counter argument is
just as strong: this is quite pythonic and quite readable.
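
A runnable sketch of the caching idiom, for anyone who wants to try it. Two assumptions are mine: do_some_long_calc is replaced by a trivial stand-in, and the handler catches KeyError, because a missing dictionary key raises KeyError rather than the IndexError shown in the snippet.

```python
def do_some_long_calc(a, b):
    # Hypothetical stand-in for an expensive computation.
    return a ** b


def F(a, b, cache={}):
    # `cache` is created once at definition time and persists across calls.
    try:
        return cache[(a, b)]
    except KeyError:  # missing dict keys raise KeyError, not IndexError
        value = cache[(a, b)] = do_some_long_calc(a, b)
        return value


print(F(2, 10))        # 1024, computed
print(F(2, 10))        # 1024, served from the cache
print(F.__defaults__)  # ({(2, 10): 1024},)
```

The last line shows where the cache lives: on the function object's defaults tuple, which is why late evaluation or a per-call copy would silently empty it on every call.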

Python is a tool, and you decrease the utility of that tool when you limit
its idioms.

How much Python code would these different proposals break?

A lot. I ran this:

$ find /usr/lib/python2.3/ -name "*.py" -exec grep "def.*=\[\]" {} \; | wc

And see 67 instances just in the standard library. Multiply that by a factor
of 1000, 10000 or more to reflect code in the field, and you might start to
understand the significance of changing the language definition.
 
Mark Hahn

Troy said:
Here's an idea: if it ain't broke, don't fix it.

Seriously, you see a "wart" and a "problem". I see a pleasant
side-effect of the documented semantics. True, new folks are
surprised by the behavior, but once it's understood, it becomes more
powerful.

All four of the Python gotchas, warts, and regrets lists I have found
included this problem. It is not only a newbie's problem, as I showed in my
posting.

How do you intend to account for code like this:

def F(a, b, cache={}):
    try:
        return cache[(a,b)]
    except (IndexError, ):
        value = cache[(a,b)] = do_some_long_calc(a,b)
        return value

Or even this:

shared_cache = {}

def F(a, b, cache=shared_cache):
    ...

The first example is very unreadable and uncool in general. Your second
example will work just fine with our fix.
Of course you can argue that this is bad style,

Yes I (and many others) will.
but the counter
argument is just as strong: this is quite pythonic and quite
readable.

I disagree strongly. I would never be caught coding something like that and
I love Python dearly.
Python is a tool, and you decrease the utility of that tool when you
limit its idioms.

So far you have only shown me an idiom that many say should not be used.
Show me one that everyone agrees is useful.
How much Python code would these different proposals break?

A lot. I ran this:

$ find /usr/lib/python2.3/ -name "*.py" -exec grep "def.*=\[\]" {} \; | wc

And see 67 instances just in the standard library. Multiply that by
a factor of 1000, 10000 or more to reflect code in the field, and you
might start to understand the significance of changing the language
definition.

That count is not accurate. Fixing this will not break every use of [] as a
default formal param. Using [] in __init__ for example would break nothing.
I can think of many other cases where it is legal to use []. The only case
I can think of that would break would be the idiom we disagree on above. If
I am wrong, then show me other cases.

If I also might make a general argument for the fix then let me continue.
Doing a late evaluation of the default expression makes the language more
dynamic, which fits the overall goal of making Prothon more dynamic. Using
prototypes instead of classes, dynamic var scoping, this fix, and many other
Prothon changes from Python all work towards that goal.

Dynamic var scoping fixed another Python gotcha which doesn't break
anything. Here are the two versions of code showing the problem and the
fix:

--- Python ---

>>> x = 1
>>> def f():
...     x = x + 1
...     print x
...
>>> f()
UnboundLocalError: local variable 'x' referenced before assignment

--- Prothon ---

O>> x = 1
1
O>> def f():
....     x = x + 1
....     print x
....
O>> f()
2

Prothon's scoping rules are dynamic which means that x comes from outside
the function until the actual assignment happens. At that point x becomes a
local variable. This, along with the fact that vars are inherited from
ancestors along with methods, allow for some intuitive and simple var
initialization techniques.

Obviously it is the responsibility of the programmer to make sure that the
outer x has the proper initialization value for the local x. This can cause
a hiding-of-uninitialized-vars bug if the programmer uses the same names for
unrelated variables but it is worth the extra power and intuitiveness.
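
For comparison, a small Python 3 sketch of the gotcha and of the usual Python remedy. The function names are illustrative. Note that the global statement rebinds the outer x, which is not the same as the Prothon behavior described above, where x becomes a local at the point of assignment.

```python
x = 1


def f_broken():
    # Assigning to x anywhere in the body makes x local for the whole
    # function, so the read on the right-hand side fails.
    x = x + 1
    return x


def f_global():
    # Declaring x global makes both the read and the write use the
    # module-level name.
    global x
    x = x + 1
    return x


try:
    f_broken()
except UnboundLocalError as exc:
    error = exc
    print("UnboundLocalError:", exc)

print(f_global())  # 2
```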
 
Andrea Griffini

Seriously, you see a "wart" and a "problem". I see a pleasant side-effect of
the documented semantics. True, new folks are surprised by the behavior, but
once it's understood, it becomes more powerful.

How do you intend to account for code like this:

def F(a, b, cache={}):
    try:
        return cache[(a,b)]
    except (IndexError, ):
        value = cache[(a,b)] = do_some_long_calc(a,b)
        return value

I'm new to python. To my eyes this is a pretty poor attempt to
have static variables. I've implemented in the past a few
scripting languages, and it's not really difficult to
implement static variables... it's quite surprising for me
there's no such a concept in python and just that wart...
hmmm... excuse me... that bad smelling wart has to be used instead.
Or even this:

shared_cache = {}

def F(a, b, cache=shared_cache):
    ...

A global you mean ? Why not just saying "global shared_cache"
at the start of the function ?

If you need more power than a global, then a (callable) class is
probably going to serve you better than a function.

Is it really a feature that if someone passes (by mistake) an extra
parameter to the function, the script silently swallows it and
behaves strangely? Or do you also double-wrap the function, so
that a(x,y) calls real_a(x,y,cache=[])?
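
A minimal sketch of the callable-class alternative mentioned here. CachedCalc, slow_add and cached_f are hypothetical names chosen for illustration. The cache is ordinary instance state, and an accidental third positional argument raises TypeError instead of being taken as the cache.

```python
class CachedCalc:
    """Callable object that keeps its cache as ordinary instance state."""

    def __init__(self, calc):
        self.calc = calc
        self.cache = {}

    def __call__(self, a, b):
        key = (a, b)
        if key not in self.cache:
            self.cache[key] = self.calc(a, b)
        return self.cache[key]


calls = []


def slow_add(a, b):
    # Hypothetical stand-in for do_some_long_calc; records each real call.
    calls.append((a, b))
    return a + b


cached_f = CachedCalc(slow_add)
print(cached_f(1, 2), cached_f(1, 2))  # 3 3
print(len(calls))                      # 1
```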
Of course you can argue that this is bad style, but the counter argument is
just as strong: this is quite pythonic and quite readable.

Pythonic ? In the sense that this is for example more explicit ?
Are you kidding ?
Python is a tool, and you decrease the utility of that tool when you limit
its idioms.


A lot. I ran this:

This doesn't surprise me. Static variables are useful when you
don't really need the power of a class instance. Too bad that
they are missing in python, and for unknown (to me) reasons
they haven't been added in the evolution of the language.
$ find /usr/lib/python2.3/ -name "*.py" -exec grep "def.*=\[\]" {} \; | wc

And see 67 instances just in the standard library. Multiply that by a factor
of 1000, 10000 or more to reflect code in the field, and you might start to
understand the significance of changing the language definition.

That something is used is one thing. That something is elegant
is another.

Andrea
 
Mark Hahn

Mark said:
Fixing this will not break every use of
[] as a default formal param. Using [] in __init__ for example would
break nothing.

Correction: Using [] as a default param is the same problem in __init__ as
it is anywhere else. My brain went south for a moment.

My point still stands though. There are many cases where a formal param can
default to [] and can be evaluated early or late without breaking the code.

My question asking for comments on code breakage also still stands. Does
anyone have personal experience with usages of [] or {} as default params
that would break with late evaluation? Is there any common idiom other than
the non-recommended use of them as static vars that would break?
 
Troy Melhase

All four of the Python gotchas, warts, and regrets lists I have found
included this problem. It is not only a newbie's problem, as I showed in my
posting.

You're right, it's not a newbie problem. It's a problem for everyone who
hasn't bothered to read the documentation.
How do you intend to account for code like this:

def F(a, b, cache={}):
    try:
        return cache[(a,b)]
    except (IndexError, ):
        value = cache[(a,b)] = do_some_long_calc(a,b)
        return value

Or even this:

shared_cache = {}

def F(a, b, cache=shared_cache):
    ...

The first example is very unreadable and uncool in general. Your second
example will work just fine with our fix.

Uncool? Do you mean "uncool" as in "forking a language and distracting a
bunch of folks because I don't like its otherwise hard-earned design
decisions" or "uncool" as in "I don't know how to otherwise express my
thoughts and therefore will assign to them some magnificently subjective
expression"?
I disagree strongly. I would never be caught coding something like that
and I love Python dearly.

Then you are limiting yourself to a subset of something wonderful.

(And by the way, one definition of love means to accept what we perceive as
deficiencies. So maybe you don't love Python as dearly as you love the idea
of Python.)
So far you have only shown me an idiom that many say should not be used.
Show me one that everyone agrees is useful.

If your goal is universal acceptance, you should stop now.
And see 67 instances just in the standard library. Multiply that by
a factor of 1000, 10000 or more to reflect code in the field, and you
might start to understand the significance of changing the language
definition.

That count is not accurate. Fixing this will not break every use of [] as
a default formal param. Using [] in __init__ for example would break
nothing. I can think of many other cases where it is legal to use []. The
only case I can think of that would break would be the idiom we disagree on
above. If I am wrong, then show me other cases.

Oh, but it will. You'll have to read and comprehend every function definition
that uses mutable default arguments to start to prove otherwise.
If I also might make a general argument for the fix then let me continue.
Doing a late evaluation of the default expression makes the language more
dynamic, which fits the overall goal of making Prothon more dynamic. Using
prototypes instead of classes, dynamic var scoping, this fix, and many
other Prothon changes from Python all work towards that goal.

Dynamic var scoping fixed another Python gotcha which doesn't break
anything. Here are the two versions of code showing the problem and the
fix:

[snip]

Maybe you should take a step back and look at what you're doing. From my
perspective, you're adding a whole lot of additional rules to the language,
and a completely different way of doing things. That's fine, and more power
to you, but if you're bent on changing so much, you should stop looking to
c.l.p to validate your ideas.

(Of course, I don't speak for the Python community or c.l.p, but I am
horrified nonetheless with what you're doing. Please forgive me if I've been
disagreeable while disagreeing.)
 
Rob Williscroft

Andrea Griffini wrote in comp.lang.python:

def F(a, b, cache={}):
    try:
        return cache[(a,b)]
    except (IndexError, ):
        value = cache[(a,b)] = do_some_long_calc(a,b)
        return value

I'm new to python. To my eyes this is a pretty poor attempt to
have static variables. I've implemented in the past a few
scripting languages, and it's not really difficult to
implement static variables...

But python has static variables.

def another( x ):
    y = getattr( another, 'static', 10 )
    another.static = x
    return y

print another(1), another(2), another(4)
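
The same function-attribute example in Python 3 syntax, with the results collected so the behavior is visible. Each call returns what the previous call stored, starting from the fallback value 10.

```python
def another(x):
    # Return the value stored by the previous call (10 on the first call),
    # then remember the current argument as a function attribute.
    y = getattr(another, "static", 10)
    another.static = x
    return y


results = [another(1), another(2), another(4)]
print(results)         # [10, 1, 2]
print(another.static)  # 4
```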

it's quite surprising for me
there's no such a concept in python and just that wart...
hmmm... excuse me... that bad smelling wart has to be used instead.

It seems to me in python "everything is an object" leads to
"everything is a dictionary" (except when it isn't:).

Rob.
 
Troy Melhase

A global you mean ? Why not just saying "global shared_cache"
at the start of the function ?

The shared_cache dictionary is mutable, making the global statement redundant
at best. It could have just as easily been left out of the function argument
list.

The definition above would also allow clients of the function to pass in a
cache of their choosing. (Granting, of course, that this would most likely
be an obscure use.)
If you need more power than a global, then a (callable) class is
probably going to serve you better than a function.

Of course. I was trying to illustrate using a named module-level dict instead
of a mutable value that could not be accessed outside of the function body.
Is it really a feature that if someone passes (by mistake) an extra
parameter to the function, the script silently swallows it and
behaves strangely? Or do you also double-wrap the function, so
that a(x,y) calls real_a(x,y,cache=[])?

How many times have you passed an extra parameter by mistake? This is Python,
and one of the mottos is "we're all consenting adults".
Pythonic ? In the sense that this is for example more explicit ?
Are you kidding ?

No, I'm not. The function specified a mutable argument. Nothing implicit
about it.
That something is used is one thing. That something is elegant
is another.

Of course. But using a mutable default function parameter as a cache is
elegant as far as I'm concerned.
 
Mark Hahn

Troy said:
You're right, it's not a newbie problem. It's a problem for everyone
who hasn't bothered to read the documentation.

Having a language with features intuitive enough to require fewer trips to
the documentation is a laudable goal. Of course you know what I really meant
was what I said in the original posting. Even experts have to write extra
code to get around this problem. The following is a common piece of Python
code that is a problem work-around:

def f( a = None ):
    if a == None: a = []
Uncool? Do you mean "uncool" as in "forking a language and
distracting a bunch of folks ...

Evolution works by forking. If you have some problem with Prothon then
let's discuss it. Don't hide your problem with Prothon behind your
discussion of this thread's topic.
(And by the way, one definition of love means to accept what we
perceive as deficiencies. So maybe you don't love Python as dearly
as you love the idea of Python.)

You can love something and still want to improve it. Don't tell my wife I
said this :)
That count is not accurate. Fixing this will not break every use of
[] as a default formal param. <correction removed>. I can think of many
other cases where it is legal to use []. The only case I can think of that would
break would be the idiom we disagree on above. If I am wrong, then show
me other cases.

Oh, but it will. You'll have to read and comprehend every function
definition that uses mutable default arguments to start to prove
otherwise.

I'm not sure I follow you. I'm saying that you only have to show me one or
two cases for me to realize I'm wrong.
Maybe you should take a step back and look at what you're doing.
From my perspective, you're adding a whole lot of additional rules to
the language, and a completely different way of doing things. That's
fine, and more power to you, but if you're bent on changing so much,
you should stop looking to c.l.p to validate your ideas.

I'm not looking for validation, just a reasonable discussion of the issue of
which of the three methods to use to fix the problem. You are the one that
started the separate discussion as to whether it should be fixed or not.

By the way, Prothon is removing a lot of rules, not adding them, through its
simplification in almost all areas.
(Of course, I don't speak for the Python community or c.l.p, but I am
horrified nonetheless with what you're doing. Please forgive me if
I've been disagreeable while disagreeing.)

No problem. I am quite thick-skinned. I also apologize if I have been
harsh.

I am sorry you are so upset that someone is daring to make a changed Python.
I expected this reaction from many Python aficionados who may love the
warts as much as the beauty. I was surprised that Prothon was received as
warmly as it was here at c.l.p. All you have to do is ignore any postings
with subjects that end in [Prothon].
 
Marcin 'Qrczak' Kowalczyk

$ find /usr/lib/python2.3/ -name "*.py" -exec grep "def.*=\[\]" {} \; | wc

And see 67 instances just in the standard library.

I don't know how you counted them:

[qrczak ~/src/python/Python-2.3.4]$ egrep 'def.*= ?\[\]' **/*.py | wc -l
45
[qrczak ~/src/python/Python-2.3.4]$ egrep 'def.*= ?None' **/*.py | wc -l
1420

Now consider that many of the Nones are a workaround for the current
Python behavior.

I agree that it's probably impractical to change Python's rules because
some code relies on the current behavior. OTOH, evaluating the default
argument each time the function is applied is technically better.
This is one of the warts that is hard to fix because of compatibility.
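
The grep patterns above count matching lines, so they can miss multi-line definitions and spellings such as list() or {}. A hedged alternative is to count parameters from the syntax tree. The sketch below uses the standard ast module from a modern Python 3 (3.8 or later), so it can only parse Python 3 source, not the 2.3 tree used in this thread. The helper name count_defaults and the sample source are my own.

```python
import ast

MUTABLE_LITERALS = (ast.List, ast.Dict, ast.Set)


def count_defaults(source):
    """Count parameters defaulting to a mutable literal or to None."""
    mutable = none = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.Lambda)):
            # kw_defaults holds None for keyword-only params without a default.
            kw_defaults = [d for d in node.args.kw_defaults if d is not None]
            for default in node.args.defaults + kw_defaults:
                if isinstance(default, MUTABLE_LITERALS):
                    mutable += 1
                elif isinstance(default, ast.Constant) and default.value is None:
                    none += 1
    return mutable, none


sample = '''
def a(x, items=[]): pass
def b(x, cache={}, flag=None): pass
def c(x, y=None, *, z=None): pass
'''
print(count_defaults(sample))  # (2, 3)
```

As with the grep counts, this only says how often the pattern appears. It cannot tell which None defaults are workarounds or which mutable defaults are deliberate caches.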
 
Andrea Griffini

But python has static variables.

That's what I mean with "I'm new to python" :)
def another( x ):
    y = getattr( another, 'static', 10 )
    another.static = x
    return y

print another(1), another(2), another(4)

I like more as an example:

>>> def foo(x):
...     if not hasattr(foo,'list'):
...         foo.list = []
...     foo.list.append(x)
...     print foo.list
...
>>> foo(1)
[1]
>>> foo(2)
[1, 2]
>>> foo(3)
[1, 2, 3]
>>>

In C++ you get the "if" part for free, but the python
approach is still goodlooking enough.

C++:

static int y = 12;

Python:

if not hasattr(foo,'y'):
    foo.y = 12

The python "static" also worked as I expected when
they're inside locally defined functions returned as
callable objects.
It seems to me in python "everything is an object" leads to
"everything is a dictionary" (except when it isn't:).

This language looks better every day :) ...

But if there are those better-looking statics, why so
much use of that modifiable-default-of-a-fake-parameter
ugly trick ? Is this something that became legal
only recently ?

I found the description of that 'wart' and the corresponding
'trick' in some doc about python (don't remember which
one), and even started using it myself! shame on me!

Now that I see this other approach, the function object
attributes look way better IMO...
Why push the ugliness instead of the beauty?

Historical reasons, maybe? Those are behind a lot of
C++'s horrible parts...


Andrea
 
Mark Hahn

Andrea said:
That's what I mean with "I'm new to python" :)
def another( x ):
    y = getattr( another, 'static', 10 )
    another.static = x
    return y

print another(1), another(2), another(4)

I like more as an example:
>>> def foo(x):
...     if not hasattr(foo,'list'):
...         foo.list = []
...     foo.list.append(x)
...     print foo.list

In Prothon:

def foo(x):
    print foo.list.append!(x)
foo.list = []

(Sorry. I couldn't resist bragging.)
 
Peter Hansen

Mark said:
Andrea said:
I like more as an example:
>>> def foo(x):
...     if not hasattr(foo,'list'):
...         foo.list = []
...     foo.list.append(x)
...     print foo.list

In Prothon:
def foo(x):
    print foo.list.append!(x)
foo.list = []

(Sorry. I couldn't resist bragging.)

About what?

Python 2.3.3 (#51, Dec 18 2003, 20:22:39) [MSC v.1200 32 bit (Intel)]
>>> def foo(x):
...     foo.list.append(x)
...     print foo.list
...
>>> foo.list = []
>>> foo('test')
['test']

(Oh, did you mean bragging about how a hard-to-see exclamation
mark causes append() to return the sequence? I thought
maybe it was about the function attribute or something.)

-Peter
 
Mark Hahn

Peter said:
In Prothon:
def foo(x):
    print foo.list.append!(x)
foo.list = []

(Sorry. I couldn't resist bragging.)

About what?

Python 2.3.3 (#51, Dec 18 2003, 20:22:39) [MSC v.1200 32 bit (Intel)]
>>> def foo(x):
...     foo.list.append(x)
...     print foo.list
...
>>> foo.list = []
>>> foo('test')
['test']

(Oh, did you mean bragging about how a hard-to-see exclamation
mark causes append() to return the sequence? I thought
maybe it was about the function attribute or something.)

Actually, I didn't know if the function attribute assignment outside the
function would work in Python or not. I guess I'll know better than to try
to play one-upmanship with Python next time. I did say I was sorry :)

FYI: It's not that the exclamation mark causes append to return the
sequence. The exclamation mark is always there and the sequence is always
returned. The exclamation mark is the universal symbol for in-place
modification. This is straight from Ruby and solves the problem that caused
Guido to not allow sequences to be returned. And, yes, I do think that's
worth bragging about ;-)
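
For contrast, a small Python sketch of the behavior under discussion. Python's list.append mutates in place and returns None, so chaining needs a helper. append_inplace is a hypothetical function written for this example; it mimics the return-the-sequence behavior described above and is not a Python or Prothon built-in.

```python
def append_inplace(seq, item):
    # Hypothetical helper: mutate `seq` and also return it.
    seq.append(item)
    return seq


items = []
result = items.append(1)   # Python's list.append returns None
print(result)              # None
chained = append_inplace(items, 2)
print(chained)             # [1, 2]
print(chained is items)    # True
```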
 
Andrea Griffini

In Prothon:

def foo(x):
    print foo.list.append!(x)
foo.list = []

(Sorry. I couldn't resist bragging.)

The very first thing I tried was assigning foo.list
outside of the function (and, by the way, that works in
python too); this however doesn't mimic C++ static,
as initialization of the static local variable is done
in C++ when (and only IF) the function is entered.
The value used for initialization can for example
depend on local parameters or global state at *that time*.

Using "hasattr" seemed ugly to me at first, but after
all you need an additional flag anyway, so why not
checking the presence of a certain key in foo.__dict__ ?
That way both initialization and setting the flag are
done at the same time using just one clean statement.

The only (very small) syntax price is the if (that in
C++ is implicit in the overused keyword "static").
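
A short Python 3 sketch of that point: with hasattr, the "static" is initialized on first entry, and its initial value can depend on the argument of that first call. The names scaled and base are illustrative.

```python
def scaled(x):
    # Initialise the "static" on first entry only; the initial value
    # depends on the argument of that first call.
    if not hasattr(scaled, "base"):
        scaled.base = x * 10
    return scaled.base + x


print(scaled(1))  # 11 (base initialised to 10)
print(scaled(2))  # 12 (base unchanged)
```

Assigning scaled.base outside the function instead would fix the value before any call is made, which is the difference from a C++ local static described above.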

Andrea
 
David Bolen

Andrea Griffini said:
But if there are those better-looking statics, why so
much use of that modifiable-default-of-a-fake-parameter
ugly trick ? Is this something that became legal
only recently ?

Function attributes were in fact somewhat recently added (as of Python
2.1, so circa April of 2001).

-- David
 
Michele Simionato

Mark Hahn said:
FYI: It's not that the exclamation mark causes append to return the
sequence. The exclamation mark is always there and the sequence is always
returned. The exclamation mark is the universal symbol for in-place
modification. This is straight from Ruby and solves the problem that caused
Guido to not allow sequences to be returned. And, yes, I do think that's
worth bragging about ;-)

I think the exclamation mark comes from Scheme, if not from a more
ancient language. It is certainly not a new idea. OTOH, it is a good
idea, no question about that. Same for "?" in booleans.


Michele Simionato
 
