Tuple parameter unpacking in 3.x


Martin Geisler

 

bearophileHUGS

Martin Geisler:
ci.addCallback(lambda (ai, bi): ai * bi)
or
map(lambda (i, s): (field(i + 1), s), enumerate(si))
Rewriting these to
ci.addCallback(lambda abi: abi[0] * abi[1])
and
map(lambda is: (field(is[0] + 1), is[1]), enumerate(si))
makes the code much uglier! And slightly longer.

I agree a lot. I can show similar examples of my code with sort/sorted
that contain a lambda that de-structures sequences of 2 or 3 items, to
define a sorting key.
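For a sort key that destructures a pair, the 2.x lambda has two 3.x-compatible spellings: indexing, or operator.itemgetter. A minimal sketch (the data here is made up for illustration):

```python
from operator import itemgetter

pairs = [("b", 2), ("a", 3), ("c", 1)]

# Python 2 allowed: sorted(pairs, key=lambda (name, score): score)
# In 3.0 the key must index instead of destructure:
by_score = sorted(pairs, key=lambda ns: ns[1])

# itemgetter expresses the same key without a lambda at all:
assert sorted(pairs, key=itemgetter(1)) == by_score

print(by_score)  # [('c', 1), ('b', 2), ('a', 3)]
```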

As I've stated in the past, I'd like to see more support for pattern
matching in Python, not less. People coming from Mathematica,
Scala, OCaml, etc. know it can be useful, and Scala shows that
Python too may find some uses for it.

I think they have removed (part of, not fully) this Python feature
mostly to simplify the C implementation of CPython.

So far I think this removal, and not using {:} as an empty array literal,
are the only two mistakes made during the design of Python 3. If you
look at the really large number of design decisions taken during the
creation of Python 3 itself, I think this is an exceptionally good
result anyway.

Bye,
bearophile
 

Peter Otten

Nick said:
I don't think many people will miss tuple unpacking in def statements.

I think the warning is probably wrong anyway - you just need to remove
a few parens...


On Python 3.0rc1 (r30rc1:66499, Oct 4 2008, 11:04:33):

>>> f = lambda (ai, bi): ai * bi
  File "<stdin>", line 1
    f = lambda (ai, bi): ai * bi
               ^
SyntaxError: invalid syntax

But

>>> lambda (i, s): (field(i + 1), s)
  File "<stdin>", line 1
    lambda (i, s): (field(i + 1), s)
           ^
SyntaxError: invalid syntax

So just remove the parentheses and you'll be fine.

No, you change the function signature in the process.

f = lambda (a, b): a*b

is equivalent to

def f((a, b)):  # double parens
    return a*b

and called as f(arg) where arg is an iterable with two items.

In 3.0 it has to be rewritten as

def f(ab):
    a, b = ab
    return a*b

i. e. it needs a statement and an expression and is therefore no longer
suitable for a lambda.
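If the single-iterable signature must be kept (as with a callback), one workaround that stays within a lambda — a sketch, not something proposed in the thread — is to unpack inside a nested lambda:

```python
# f still takes one two-item iterable, as the 2.x version did,
# but the unpacking happens in the inner lambda's parameter list:
f = lambda ab: (lambda a, b: a * b)(*ab)

print(f((3, 4)))  # 12
```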

Peter
 
 

Terry Reedy

Martin said:
A somewhat related question: do I pay a performance penalty when I let a
function define an inner function like this:

def foo():
    def bar():
        ...
    bar()

Some. The *code* for the body of bar is compiled as part of compiling
the body of foo, but each call of foo creates a new *function* object.
compared to just defining it once outside:

def bar():
    ...

def foo():
    ...
    bar()

I'm thinking that each execution of the first foo could spend a little
time defining a new bar each time, or is that not how things work?

I realize that defining bar as an inner function has the advantage of
being able to see variables in the namespace of foo.

The alternative is to pass in the value(s) needed.
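The two styles, and the pass-the-value alternative, can be sketched like this (the names are made up for illustration):

```python
def foo_nested(n):
    # a new function object for bar is created on every call of
    # foo_nested, but bar can read n from the enclosing scope
    def bar():
        return n + 1
    return bar()

def bar_flat(n):
    # defined once at module level; the needed value is passed in
    return n + 1

def foo_flat(n):
    return bar_flat(n)

print(foo_nested(41), foo_flat(41))  # 42 42
```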

tjr
 
 

Steven D'Aprano

Dennis Lee Bieber said:
In 3.0 it has to be rewritten as

def f(ab):
    a, b = ab
    return a*b

i. e. it needs a statement and an expression and is therefore no
longer suitable for a lambda.
Given that most lambdas are rather short, is it really that
much of a pain to just use (for the above example) ab[0]*ab[1] without
unpacking?

Well -- it is because the code used to be short and sweet that it feels
ugly to me to change

lambda (a, b): a * b

into

lambda ab: ab[0] * ab[1]

The first looks perfect -- it matches the math behind the code that I am
writing. The second does not look so nice.

Here's another alternative. Compare:

>>> (lambda (a, b): a * b)((2, 3))
6

with this:

>>> (lambda a, b: a * b)(*(2, 3))
6


From reading the PEP-3113 I got the impression that the author thought
that this feature was unused and didn't matter. With this I wish to say
that it matters to me.

Alas, I think it's too late. I feel your pain.

The final release of Python 3.0 hasn't been made yet. If you really care
strongly about this, you could write to the python-dev mailing list and
ask if it is worth trying to change their mind about tuple unpacking.
They'll almost certainly say no, but there's a chance it might be
reverted in 3.1.

A tiny chance.
 

Steven D'Aprano

A somewhat related question: do I pay a performance penalty when I let a
function define an inner function like this:

def foo():
    def bar():
        ...
    bar()

compared to just defining it once outside:

def bar():
    ...

def foo():
    ...
    bar()

I'm thinking that each execution of the first foo could spend a little
time defining a new bar each time, or is that not how things work?

I realize that defining bar as an inner function has the advantage of
being able to see variables in the namespace of foo.

That is the main advantage, followed by reducing namespace pollution, but
yes there is a very small performance penalty.

>>> def outer(x):
...     return x+1
...
>>> def func(x):
...     return outer(x+1)
...
>>> def func2(x):
...     def inner(x):
...         return x+1
...     return inner(x+1)
...
>>> assert func(37) == func2(37)
>>> from timeit import Timer
>>> t1 = Timer('func(23)', 'from __main__ import func')
>>> t2 = Timer('func2(23)', 'from __main__ import func2')
>>> t1.repeat()
[1.5711719989776611, 0.82663798332214355, 0.82708191871643066]
>>> t2.repeat()
[1.8273210525512695, 1.1913230419158936, 1.1786220073699951]
 
 

Steven D'Aprano

PEP 3113 offers the following recommendation for refactoring tuple
arguments:

def fxn((a, (b, c))):
    pass

will be translated into:

def fxn(a_b_c):
    (a, (b, c)) = a_b_c
    pass

and similar renaming for lambdas.
http://www.python.org/dev/peps/pep-3113/


I'd like to suggest that this naming convention clashes with a very
common naming convention, lower_case_with_underscores. That's easy enough
to see if you replace the arguments a, b, c above with something more
realistic:

def function(vocab_list, (result, flag), max_value)

becomes:

def function(vocab_list, result_flag, max_value)

Function annotations may help here, but not everyone is going to use them
in the same way, or even in a way that is useful, and the 2to3 tool
doesn't add annotations.

It's probably impossible to avoid all naming convention clashes, but I'd
like to suggest an alternative which distinguishes between a renamed
tuple and an argument name with two words:

def function(vocab_list, (result, flag), max_value):
    pass

becomes:

def function(vocab_list, t__result_flag, max_value):
    result, flag = t__result_flag
    pass

The 't__' prefix clearly marks the tuple argument as different from the
others. The use of a double underscore is unusual in naming conventions,
and thus less likely to clash with other conventions. Python users are
already trained to distinguish single and double underscores. And while
it's three characters longer than the current 2to3 behaviour, the length
compares favorably with the original tuple form:

t__result_flag
(result, flag)

What do people think? Is it worth taking this to the python-dev list?
 

Aaron "Castironpi" Brady

Steven said:
[snip]
It's probably impossible to avoid all naming convention clashes, but I'd
like to suggest an alternative which distinguishes between a renamed
tuple and an argument name with two words:

def function(vocab_list, (result, flag), max_value):
    pass

becomes:

def function(vocab_list, t__result_flag, max_value):
    result, flag = t__result_flag
    pass

The 't__' prefix clearly marks the tuple argument as different from the
others.
[snip]
What do people think? Is it worth taking this to the python-dev list?

There's the possibility that the most important words should go first in
this case:

result_flag__t

But, I'll admit that other people could have learned different orders of
scanning words than I, especially depending on their spoken language
backgrounds. A poll of the newsgroup isn't exactly academically
impartial sampling, but there aren't any better ways to make decisions,
are there? (I think it would be easy to make either one a habit.)

Here's the other option in the same context:

def function(vocab_list, result_flag__t, max_value):
    result, flag = result_flag__t
    pass

To be thorough, there's also a trailing double underscore option.

def function(vocab_list, result_flag__, max_value):
    result, flag = result_flag__
    pass

Which I don't recognize from any other usages, but I defer. If there
aren't any, conditionally, I think this is my favorite.
 

Peter Otten

Steven said:
[snip]
It's probably impossible to avoid all naming convention clashes, but I'd
like to suggest an alternative which distinguishes between a renamed
tuple and an argument name with two words:
[snip]
The 't__' prefix clearly marks the tuple argument as different from the
others. And while it's three characters longer than the current 2to3
behaviour, the length compares favorably with the original tuple form:

t__result_flag
(result, flag)

Let's see what the conversion tool does:

$ cat tmp.py
g = lambda (a, b): a*b + a_b
$ 2to3 tmp.py
RefactoringTool: Skipping implicit fixer: buffer
RefactoringTool: Skipping implicit fixer: idioms
RefactoringTool: Skipping implicit fixer: ws_comma
--- tmp.py (original)
+++ tmp.py (refactored)
@@ -1,1 +1,1 @@
-g = lambda (a, b): a*b + a_b
+g = lambda a_b1: a_b1[0]*a_b1[1] + a_b
RefactoringTool: Files that need to be modified:
RefactoringTool: tmp.py

So the current strategy is to add a numerical suffix if a name clash occurs.
The fixer clearly isn't in its final state, as for functions (instead of
lambdas) it uses xxx_todo_changeme.
What do people think? Is it worth taking this to the python-dev list?

I suppose that actual clashes will be rare. If there is no clash a_b is the
best name and I prefer trying it before anything else.
I don't particularly care about what the fallback should be except that I
think it should stand out a bit more than the current numerical suffix.
xxx1_a_b, xxx2_a_b, ... maybe?

Peter
 
 

Terry Reedy

And that there were good alternatives, and that there were technical
reasons why maintaining the feature in the face of the other options
would be a nuisance.
Thanks! And I know it's too late, I should have found out about this
earlier :-(

For future reference, the time to have said something would have been
during the 3 month alpha phase, which is for testing feature and api
changes. I suspect, from reading the pydev discussion, that the answer
still would have been to either use a def statement and add the unpack
line or to subscript the tuple arg to stick with lambda expressions.
But who knows?

tjr
 

Harald Luessen

Aaron Brady said:
There's the possibility that the most important words should go first in
this case:

result_flag__t
[snip]
To be thorough, there's also a trailing double underscore option.

def function(vocab_list, result_flag__, max_value):
    result, flag = result_flag__
    pass
[snip]

t__result_flag and result_flag__t have the advantage that you can
search for t__ or __t as start or end of a name if you want to
find and change all these places in the source. You can compare
it with the decision to use reinterpret_cast<long>(...) as a cast
operator in C++. It is ugly but much easier to find than (long)...
A search for __ alone would have too many hits in Python.

Harald
 

Brett C.

And that there were good alternatives, and that there were technical
reasons why maintaining the feature in the face of the other options
would be a nuisance.

As the author of PEP 3113, I should probably say something (kudos to
Python-URL bringing this thread to my attention).

There are two things to realize about the tuple unpacking that acted
as motivation. One is supporting them in the compiler is a pain.
Granted that is a weak argument that only the core developers like
myself care about.

Second, tuple unpacking in parameters breaks introspection horribly.
All one has to do is look at the hoops 'inspect' has to jump through
in order to figure out what the heck is going on to see how messy this
is. And I realize some people will say, "but if 'inspect' can handle
it, then who cares about the complexity", but that doesn't work if
'inspect' is viewed purely as a simple wrapper so you don't have to
remember some attribute names and not having to actually perform some
major hoops. I personally prefer being able to introspect from the
interpreter prompt without having to reverse-engineer how the heck
code objects deal with tuple unpacking.
Martin said:
With this I wish to say that it matters to me.

Well, every feature matters to someone. Question is whether it matters
to enough people to warrant having a feature. I brought this up at
PyCon 2006 through a partially botched lightning talk and on python-dev
through the PEP, so this was not some snap decision on my part that I
rammed through python-dev; there was some arguing for keeping it from
some python-dev members, but Guido agreed with me in the end.

If you still hate me you can find me at PyCon 2009 and tar & feather
me *after* my talk. =)
For future reference, the time to have said something would have been
during the 3 month alpha phase, which is for testing feature and api
changes.  I suspect, from reading the pydev discussion, that the answer
still would have been to either use a def statement and add the unpack
line or to subscript the tuple arg to stick with lambda expressions.
But who knows?

I have not read this whole thread thoroughly, but it sounds like using
iterator unpacking at the call site (e.g., ``fxn(*args)`` when calling
your lambda) is out of the question because it is from a callback?

As Terry said, the alpha is one way you can give feedback if you don't
want to follow python-dev or python-3000 but still have your opinion
be heard. The other way is to subscribe to the PEP news feed (found
off of http://www.python.org/dev/peps/) to keep on top of PEPs as
practically all language changes have to result in a PEP at some
point. And of course the last option is to follow python-dev. =)

-Brett
 

bearophileHUGS

Brett C.:
There are two things to realize about the tuple unpacking that acted
as motivation. One is supporting them in the compiler is a pain.
...
Second, tuple unpacking in parameters breaks introspection horribly.

Are there ways to re-implement from scratch similar native
pattern-matching features in CPython, like the ones of Clojure:
http://items.sjbach.com/16/some-notes-about-clojure
Or better the ones of Scala (or even OCaml, but those are static), etc.,
that avoid that pain and the introspection troubles?
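Short of changing the compiler, a user-level decorator can emulate the removed feature without any compiler or introspection pain; this is only a sketch (the names takes_tuple and area are invented), and unlike the 2.x feature the wrapper keeps a plain one-argument signature:

```python
import functools

def takes_tuple(func):
    """Wrap func(a, b, ...) so it accepts a single iterable argument."""
    @functools.wraps(func)
    def wrapper(args):
        return func(*args)
    return wrapper

@takes_tuple
def area(width, height):
    return width * height

# area now behaves like the 2.x def area((width, height)):
print(area((3, 4)))  # 12
```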

Bye,
bearophile
 

Steven D'Aprano

Steven said:
[snip]
What do people think? Is it worth taking this to the python-dev list?

Peter Otten said:
I suppose that actual clashes will be rare. If there is no clash a_b is
the best name and I prefer trying it before anything else. I don't
particularly care about what the fallback should be except that I think
it should stand out a bit more than the current numerical suffix.
xxx1_a_b, xxx2_a_b,... maybe?


Possibly you have misunderstood me. I'm not concerned with a clash
between names, as in the following:

lambda a_b, (a, b):
maps to -> lambda a_b, a_b:

as I too expect they will be rare, and best handled by whatever mechanism
the fixer uses to fix any other naming clash.

I am talking about a clash between *conventions*, where there could be
many argument names of the form a_b which are not intended to be two item
tuples.

In Python 2.x, when you see the function signature

def spam(x, (a, b))

it is clear and obvious that you have to pass a two-item tuple as the
second argument. But after rewriting it to spam(x, a_b) there is no such
help. There is no convention in Python that says "when you see a function
argument of the form a_b, you need to pass two items" (nor should there
be).

But given the deafening silence on this question, clearly other people
don't care much about misleading argument names.

*wink*
 

MRAB

Steven D'Aprano wrote:
[snip]
In Python 2.x, when you see the function signature

def spam(x, (a, b))

it is clear and obvious that you have to pass a two-item tuple as the
second argument. But after rewriting it to spam(x, a_b) there is no such
help. There is no convention in Python that says "when you see a function
argument of the form a_b, you need to pass two items" (nor should there
be).
[snip]
Could it be a double underscore instead, e.g. a__b,
first_item__second_item?
 
