Pythonification of the asterisk-based collection packing/unpacking syntax



Eelco

This is a follow-up discussion on my earlier PEP suggestion. I've
integrated the insights collected during the previous discussion, and
tried to regroup my arguments for a second round of feedback. Thanks
to everybody who gave useful feedback the last time.

PEP Proposal: Pythonification of the asterisk-based collection packing/
unpacking syntax.

This proposal intends to expand upon the currently existing collection
packing and unpacking syntax. By that we mean the following related
python constructs:
head, *tail = somesequence
#pack the remainder of the unpacking of somesequence into a list called tail
def foo(*args): pass
#pack the unbound positional arguments into a tuple called args
def foo(**kwargs): pass
#pack the unbound keyword arguments into a dict called kwargs
foo(*args)
#unpack the sequence args into positional arguments
foo(**kwargs)
#unpack the mapping kwargs into keyword arguments
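For concreteness, these constructs can be exercised in a short session (a quick sketch; foo here merges the separate definitions above into one function):

```python
# Current behaviour of the constructs above (Python 3).
head, *tail = [1, 2, 3, 4]      # remainder packed into a list called tail

def foo(*args, **kwargs):       # args packed as a tuple, kwargs as a dict
    return args, kwargs

packed = foo(1, 2, x=3)         # -> ((1, 2), {'x': 3})
unpacked_seq = foo(*(4, 5))     # sequence unpacked into positional arguments
unpacked_map = foo(**{'y': 6})  # mapping unpacked into keyword arguments
```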

We suggest that these constructs have the following shortcomings that
could be remedied.
It is unnecessarily cryptic, and out of line with Python's preference
for explicit syntax. One cannot state in a single line what the
asterisk operator does; it is highly context dependent, and devoid of
that ‘for line in file’ pythonic obviousness. From the perspective of
a Python outsider, the only hint as to what *args means is a loose
analogy with the C way of handling variable arguments.
The current syntax, in its terseness, also leaves something to be
desired in terms of flexibility. While a tuple might be the logical
choice to pack positional arguments in the vast majority of cases, it
need not be true that a list is always the preferred choice to repack
an unpacked sequence, for instance.


Type constraints:

In case the asterisk is not used to signal unpacking, but rather to
signal packing, its semantics is essentially that of a type
constraint. The statement:

head, tail = sequence

Signifies regular unpacking. However, if we add an asterisk, as in:

head, *tail = sequence

We demand that tail not be just any python object, but rather a list.
This changes the semantics from normal unpacking, to unpacking and
then repacking all but the head into a list.
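This can be checked directly: whatever the type of the right-hand side, the starred target always comes out as a list.

```python
head, *tail = (1, 2, 3)        # a tuple on the right...
assert isinstance(tail, list)  # ...still packs the remainder into a list

head, *tail = "abc"            # even a string works as the source
assert tail == ['b', 'c']
```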

It may be somewhat counter-intuitive to think of this as a type
constraint, since python is after all a weakly-typed language. But the
current usage of asterisks is an exception to that rule. For those
who are unconvinced, please consider the analogy to the following
simple C# code:

var foo = 3;

An ‘untyped’ object foo is created (actually, its type will be
inferred from its rhs as an integer).

float foo = 3;

By giving foo a type-constraint of float instead, the semantics are
modified; foo is no longer the integer 3, but gets silently cast to
3.0. This is a simple example, but conceptually entirely analogous to
what happens when one places an asterisk before an lvalue in Python.
It means ‘be a list, and adjust your behavior accordingly’, versus ‘be
a float, and adjust your behavior accordingly’.

The aim of this PEP is that this type-constraint syntax be expanded
upon. We should be careful here to distinguish this from providing
optional type constraints throughout python as a whole; that is not our
aim. This concept has been considered before, but the costs have not
been found to outweigh the benefits. http://www.artima.com/weblogs/viewpost.jsp?thread=86641
Our primary aim is the niche of collection packing/unpacking, but if
further generalizations can be made without increasing the cost, those
are most welcome. To reiterate: what is proposed is nothing radical;
merely to replace the asterisk-based type constraints with a more
explicit type constraint.

Currently favored alternative syntax:

Both for the sake of explicitness and flexibility, we consider it
desirable that the name of the collection type is used directly in any
collection packing statement. Annotating a variable declaration with a
collection type name should signal collection packing. This
association between a collection type name and a variable declaration
can be accomplished in many ways; for now, we suggest
variablename::collectiontype for packing, and ::collectionname for
unpacking.

Examples of use:
head, tail::tuple = ::sequence
def foo(args::list, kwargs::dict): pass
foo(::args, ::kwargs)
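Since the proposed syntax is not valid Python, here is a hypothetical desugaring of the first example into current Python, to pin down the intended semantics (the names are illustrative only):

```python
# Intended meaning of: head, tail::tuple = ::sequence
sequence = [1, 2, 3, 4]
head, *rest = sequence   # the unpacking step, in today's spelling
tail = tuple(rest)       # the ::tuple annotation selects the packed type
```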

The central idea is to replace annotations with asterisks by
annotations with collection type names, but note that we have opted
for several other minor alterations of the existing syntax that seem
natural given the proposed changes.

First of all, explicitly mentioning the type of the collection
involved eliminates the need to have two symbols, * and **. Which
variable captures the positional arguments and which captures the
keyword arguments can be inferred from the collection type they model,
mapping or sequence. The rare case of collections that both model a
sequence and a mapping can either be excluded or handled by assigning
precedence for one type or the other.

A double semicolon before a collection signals unpacking. As with
declarations, there is no genuine need for a different operator for
sequence and mapping types, although if such a demand exists, it
would not be hard to accommodate. A double semicolon in front of the
collection is congruent with the asterisk syntax, and nicely
emphasizes that this unpacking operation is the symmetric counterpart
of the packing operation, which is signalled by the same symbols to the
right of the identifier. Since we are going to make the double
semicolon (or whatever symbol is chosen) a general collection packing/
unpacking marker, we feel it makes sense to allow it to be used to
explicitly signify unpacking, even when as much is implied by the
syntax on the left hand side, to preserve symmetry with the syntax
inside function calls.

Summarizing, what this syntax achieves, in loose order of perceived
importance:
Simplicity: we have reduced a set of rather arbitrary rules concerning
the syntax and semantics of the asterisk (does it construct a list or
a tuple?) to a single general symbol: the double semicolon is the
collection packing/unpacking annotation symbol, and that is all there
is to know about it.
Readability: the proposed syntax reads like a book: args-list and
kwargs-dict, unlike the more cryptic asterisk syntax. We avoid extra
lines of code in the event another sequence or mapping type than the
one returned by default is required.
Efficiency: by declaring the desired collection type, it can be
constructed in the optimal way from the given input, rather than
requiring a conversion after the default collection type is
constructed.
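The efficiency point can be illustrated in current Python, taking collections.deque as an example target type: today the default list must be built first and then copied, whereas declaring the type up front would allow constructing the deque directly.

```python
from collections import deque

head, *tail = range(6)  # an intermediate list is constructed here...
tail = deque(tail)      # ...and then copied into the desired collection
```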

A double semicolon is suggested, since the single colon is already
taken by the function annotation syntax in Python 3. This is somewhat
unfortunate: programming should come before meta-programming, yet here
it is rather the other way around. On the one hand, having both :
and :: as variable declaration annotation symbols is a nice
unification; on the other hand, a syntax more easily visually
distinguished from function annotations can be defended. For increased
backwards compatibility the asterisk could be used, but sandwiched
between two identifiers it looks like a multiplication. Many other
symbols would do, such as @ or !.
 
Chris Angelico

Import wildcarding?

That's not an operator, any more than it is when used in filename
globbing. The asterisk _character_ has many meanings beyond those of
the operators * and **.

ChrisA
 

Eelco

That's not an operator, any more than it is when used in filename
globbing. The asterisk _character_ has many meanings beyond those of
the operators * and **.

ChrisA

To cut short this line of discussion; I meant the asterisk symbol
purely in the context of collection packing/unpacking. Of course it
has other uses too.

Even that single use requires a whole paragraph to explain completely;
when does it result in a tuple or a list, when is unpacking implicit
and when not, why * versus **, and so on.
 

Steven D'Aprano

....
To cut short this line of discussion; I meant the asterisk symbol purely
in the context of collection packing/unpacking. Of course it has other
uses too.

Even that single use requires a whole paragraph to explain completely;
when does it result in a tuple or a list, when is unpacking implicit and
when not, why * versus **, and so on.

Do you think that this paragraph will become shorter if you change the
spelling * to something else?

It takes more than one line to explain list comprehensions, context
managers, iterators, range(), and import. Why should we care if * and **
also take more than one paragraph? Even if you could get it down to a
single line, what makes you think that such extreme brevity is a good
thing?

You might not be able to explain them in a single line, but you can
explain them pretty succinctly:

Varargs: Inside a function parameter list, * collects an arbitrary
number of positional arguments into a tuple. When calling functions,
* expands any iterator into positional arguments. In both cases, **
does the same thing for keyword arguments.

Extended iterator unpacking: On the left hand side of an assignment,
* collects multiple values from the right hand side into a list.


Let's see you do better with your suggested syntax. How concisely can you
explain the three functions?

Don't forget the new type coercions (not constraints, as you keep calling
them) you're introducing. It boggles my mind that you complain about the
complexity of existing functionality, and your solution involves
*increasing* the complexity with more functionality.
 
Steven D'Aprano

Type constraints:

In case the asterisk is not used to signal unpacking, but rather to
signal packing, its semantics is essentially that of a type constraint.

"Type constraint" normally refers to type restrictions on *input*: it is
a restriction on what types are accepted. When it refers to output, it is
not normally a restriction, therefore "constraint" is inappropriate.
Instead it is normally described as a coercion, cast or conversion.
Automatic type conversions are the opposite of a constraint: it is a
loosening of restrictions. "I don't have to use a list, I can use any
sequence or iterator".


In iterator unpacking, it is the *output* which is a list, not a
restriction on input: in the statement:

head, *tail = sequence

tail may not exist before the assignment, and so describing this as a
constraint on the type of tail is completely inappropriate.


The statement:

head, tail = sequence

Signifies regular unpacking. However, if we add an asterisk, as in:

head, *tail = sequence

We demand that tail not be just any python object, but rather a list.

We don't demand anything, any more than when we say:

for x in range(1, 100):

we "demand" that x is not just any python object, but rather an int.

Rather, we accept what we're given: in case of range and the for loop, we
are given an int. In the case of extended tuple unpacking, we are given a
list.


This changes the semantics from normal unpacking, to unpacking and then
repacking all but the head into a list.

Aside: iterator unpacking is more general than just head/tail unpacking.
>>> a, b, *c, d, e, f = range(10)
>>> print(a, b, d, e, f, c)
0 1 7 8 9 [2, 3, 4, 5, 6]


You are jumping to conclusions about implementation details which aren't
supported by the visible behaviour. What evidence do you have that
iterator unpacking creates a tuple first and then converts it to a list?

It may be somewhat counter-intuitive to think of this as a type
constraint, since python is after all a weakly-typed language.

The usual test of a weakly-typed language is that "1"+1 succeeds (and
usually gives 2), as in Perl but not Python. I believe you are confusing
weak typing with dynamic typing, a common mistake.
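The test is easy to run: Python rejects the mixed operation outright, while still allowing a name to be rebound to a value of another type, which is the dynamic part.

```python
try:
    "1" + 1                # mixed str/int concatenation...
    mixed_allowed = True
except TypeError:
    mixed_allowed = False  # ...is a TypeError: strong typing

x = 1        # the same name may later be bound...
x = "one"    # ...to a different type: dynamic typing
```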


[...]
The aim of this PEP is that this type-constraint syntax be expanded
upon. We should be careful here to distinguish this from providing
optional type constraints throughout python as a whole; that is not our aim.

Iterator unpacking is no more about type constraints than is len().
 

Chris Angelico

The usual test of a weakly-typed language is that "1"+1 succeeds (and
usually gives 2), as in Perl but not Python. I believe you are confusing
weak typing with dynamic typing, a common mistake.

I'd go stronger than "usually" there. If "1"+1 results in "11", then
that's not weak typing but rather a convenient syntax for
stringification - if every object can (or must) provide a to-string
method, and concatenating anything to a string causes it to be
stringified, then it's still strongly typed.

Or is a rich set of automated type-conversion functions evidence of
weak typing? And if so, then where is the line drawn - is upcasting of
int to float weak?

ChrisA
 

Evan Driscoll

I'd go stronger than "usually" there. If "1"+1 results in "11", then
that's not weak typing but rather a convenient syntax for
stringification - if every object can (or must) provide a to-string
method, and concatenating anything to a string causes it to be
stringified, then it's still strongly typed.

Or is a rich set of automated type-conversion functions evidence of
weak typing? And if so, then where is the line drawn - is upcasting of
int to float weak?

ChrisA
Sorry, I just subscribed to the list so am stepping in mid-conversation,
but "strong" vs "weak" typing does not have a particularly well-defined
meaning. There are at least three very different definitions you'll find
people use which are almost pairwise orthogonal in theory, if less so in
practice. There's a great mail to a Perl mailing list I've seen [1]
where someone lists *eight* definitions (albeit with a couple pairs of
definitions that are only slightly different).

I like to use it in the "automated conversion" sense, because I feel
like the other possible definitions are covered by other terms
(static/dynamic, and safe/unsafe). And in that sense, I think that
thinking of languages as "strong" *or* "weak" is a misnomer; it's a
spectrum. (Actually even a spectrum is simplifying things -- it's more
like a partial order.)

Something like ML or Haskell, which does not even allow integer to
double promotions, is very strong typing. Something like Java, which
allows some arithmetic conversion and also automatic stringification (a
la "1" + 1) is somewhere in the middle of the spectrum. Personally I'd
put Python even weaker on account of things such as '[1,2]*2' and '1 <
True' being allowed, but on the other hand it doesn't allow "1"+1.
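Both of the operations Evan cites are indeed allowed; note that 1 < True compares 1 < 1, since bool is a subclass of int.

```python
repeated = [1, 2] * 2    # sequence repetition, not numeric multiplication
comparison = 1 < True    # bool is an int subclass, so this is 1 < 1
```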

Evan

[1]
http://groups.google.com/group/comp.lang.perl.moderated/msg/89b5f256ea7bfadb
(though I don't think I've seen all of those)


 

Evan Driscoll

Personally I'd put Python even weaker on account of things such as
'[1,2]*2' and '1 < True' being allowed, but on the other hand it
doesn't allow "1"+1.

Not to mention duck typing, which under my definition I'd argue is
pretty much the weakest of typing that you can apply to structure-like
types which I can think of.

Evan



 

Steven D'Aprano

I'd go stronger than "usually" there. If "1"+1 results in "11", then
that's not weak typing but rather a convenient syntax for
stringification - if every object can (or must) provide a to-string
method, and concatenating anything to a string causes it to be
stringified, then it's still strongly typed.


For what it's worth, Wikipedia's article on type systems gives Javascript
as an example of weak typing because it converts "1"+1 to "11". I agree
with them.

http://en.wikipedia.org/wiki/Type_system

I think that weak and strong typing aren't dichotomies, but extremes in a
continuum. Assembly languages are entirely weak, since everything is
bytes and there are no types to check; some academic languages may be
entirely strong; but most real-world languages include elements of both,
most commonly coercing ints to floats.

Chris Smith's influential article "What To Know Before Debating Type
Systems" goes further, suggesting that weak and strong typing are
meaningless terms. I don't go that far, but you should read his article:

http://cdsmith.wordpress.com/2011/01/09/an-old-article-i-wrote/

Or is a rich set of automated type-conversion functions evidence of weak
typing? And if so, then where is the line drawn - is upcasting of int to
float weak?

To my mind, the distinction that should be drawn is that if two types are
in some sense the same *kind* of thing, then automatic conversions or
coercions are weak evidence of weak typing. Since we consider both ints
and floats to be kinds of numbers, mixed int/float arithmetic is not a
good example of weak typing. But since numbers and strings are quite
different kinds of things, mixed str/int operations is a good example of
weak typing.

But not *entirely* different: numbers can be considered strings of
digits; and non-digit strings can have numeric values. I don't know of
any language that allows 1 + "one" to return 2, but such a thing wouldn't
be impossible.
 
Chris Angelico

Sorry, I just subscribed to the list so am stepping in mid-conversation,

Welcome to the list! If you're curious as to what's happened, check
the archives:
http://mail.python.org/pipermail/python-list/
Something like ML or Haskell, which does not even allow integer to
double promotions, is very strong typing. Something like Java, which
allows some arithmetic conversion and also automatic stringification (a
la "1" + 1) is somewhere in the middle of the spectrum. Personally I'd
put Python even weaker on account of things such as '[1,2]*2' and '1 <
True' being allowed, but on the other hand it doesn't allow "1"+1.

But [1,2]*2 is operator overloading. The language doesn't quietly
convert [1,2] into a number and multiply that by 2, it keeps it as a
list and multiplies the list by 2.

Allowing 1 < True is weaker typing. It should be noted, however, that
"1 < True" is False, and "1 > True" is also False. The comparison
doesn't make much sense, but it's not an error.

ChrisA
 

Roy Smith

Steven D'Aprano said:
some academic languages may be
entire strong; but most real-world languages include elements of both.
Most commonly coercing ints to floats.

Early Fortran compilers did not automatically promote ints to floats.
But not *entirely* different: numbers can be considered strings of
digits; and non-digit strings can have numeric values. I don't know of
any language that allows 1 + "one" to return 2, but such a thing wouldn't
be impossible.

It is possible for 1 + "one" to be equal to 2 in C or C++. All it takes
is for the string literal to be located at memory location 1. Not
likely, but nothing in the language prevents it.
 

Chris Angelico

It is possible for 1 + "one" to be equal to 2 in C or C++.  All it takes
is for the string literal to be located at memory location 1.  Not
likely, but nothing in the language prevents it.

Not quite; 1 + "one" will be "ne", which might happen to be at memory
location 2. The data type is going to be char* (or, in a modern
compiler, const char*), not int. That said, though, I think that (in
this obscure circumstance) it would compare equal with 2. For what
that's worth. :)

ChrisA
 

Evan Driscoll

Welcome to the list! If you're curious as to what's happened, check
the archives:
http://mail.python.org/pipermail/python-list/
Thanks! Incidentally, is there a good way to respond to the original
post in this thread, considering it wasn't delivered to me?
But [1,2]*2 is operator overloading. The language doesn't quietly
convert [1,2] into a number and multiply that by 2, it keeps it as a
list and multiplies the list by 2.

Allowing 1 < True is weaker typing. It should be noted, however, that
"1 < True" is False, and "1 > True" is also False. The comparison
doesn't make much sense, but it's not an error.
I see where you're coming from, especially as I wouldn't consider
overloading a function on types (in a language where that phrase makes
sense) moving towards weak typing either. Or at least I wouldn't have
before this discussion... At the same time, it seems a bit
implementationy. I mean, suppose '1' were an object and implemented
__lt__. Does it suddenly become not weak typing because of that?

(The other thing is that I think strong vs weak is more subjective than
many of the other measures. Is '1 < True' or '"1"+1' weaker? I think it
has a lot to do with how the operations provided by the language play to
your expectations.)

Not quite; 1 + "one" will be "ne", which might happen to be at memory
location 2. The data type is going to be char* (or, in a modern
compiler, const char*), not int.
I'm not quite sure I'd say that it could be 2, exactly, but I definitely
disagree with this... after running 'int x = 5, *p = &x' would you say
that "p is 5"? (Assume &x != 5.) 1+"one" *points to* "ne", but it's
still a pointer.

Evan




 

buck

I like the spirit of this. Let's look at your examples.
Examples of use:
head, tail::tuple = ::sequence
def foo(args::list, kwargs::dict): pass
foo:):args, ::kwargs)

My initial reaction was "nonono!", but this is simply because of the ugliness. The double-colon is very visually busy.

I find that your second example is inconsistent with the others. If we say that the variable-name is always on the right-hand-side, we get:
def foo(list::args, dict::kwargs): pass

This nicely mirrors other languages (such as in your C# example: "float foo") as well as the old python behavior (prefixing variables with */** to modify the assignment).

As for the separator, let's examine the available ascii punctuation. Excluding valid variable characters, whitespace, and operators, we have:

! -- ok.
" -- can't use this. Would look like a string.
# -- no. Would look like a comment.
$ -- ok.
' -- no. Would look like a string.
( -- no. Would look like a function.
) -- no. Would look like ... bad syntax.
, -- no. Would indicate a separate item in the variable list.
. -- no. Would look like an attribute access.
: -- ok, maybe. Seems confusing in a colon-terminated statement.
; -- no, just no.
? -- ok.
@ -- ok.
[ -- no. Would look like indexing.
] -- no.
` -- no. Would look like a string?
{ -- too strange
} -- too strange
~ -- ok.

That leaves these. Which one looks least strange?

float ! x = 1
float $ x = 1
float ? x = 1
float @ x = 1

The last one looks decorator-ish, but maybe that's proper. The implementation of this would be quite decorator-like: take the "normal" value of x, pass it through the indicated function, assign that value back to x.

Try these on for size.

head, @tuple tail = sequence
def foo(@list args, @dict kwargs): pass
foo(@args, @kwargs)

For backward compatibility, we could say that the unary * is identical to @list and unary ** is identical to @dict.

-buck
 
Chris Angelico

Thanks! Incidentally, is there a good way to respond to the original
post in this thread, considering it wasn't delivered to me?

I don't know of a way, but this works. It's all part of the same thread.
I mean, suppose '1' were an object and implemented
__lt__. Does it suddenly become not weak typing because of that?

Is it weak typing to overload a function?

//C++ likes this a lot.
int foo(int x,int y) {return x*3+y;}
double foo(double x,double y) {return x*3+y;}

This is definitely making the term "strongly typed language" fairly useless.
(The other thing is that I think strong vs weak is more subjective than
many of the other measures. Is '1 < True' or '"1"+1' weaker? I think it
has a lot to do with how the operations provided by the language play to
your expectations.)

+1. My expectations are:
1) The Boolean value "True" might be the same as a nonzero integer, or
might not; it would make sense to use inequality comparisons with
zero, MAYBE, but not with 1. So I don't particularly care what the
language does with "1 < True", because it's not something that I would
normally do.
2) "1"+1, in any high level language with an actual string type, I
would expect to produce "11". It makes the most sense this way; having
it return 2 means there's a special case where the string happens to
look like a number - meaning that " 1"+1 is different from "1"+1. That
just feels wrong to me... but I'm fully aware that many other people
will disagree.

Yep, it's pretty subjective.
I'm not quite sure I'd say that it could be 2, exactly, but I definitely
disagree with this... after running 'int x = 5, *p = &x' would you say
that "p is 5"? (Assume &x != 5.) 1+"one" *points to* "ne", but it's
still a pointer.

Point. I stand corrected. I tend to think of a char* as "being" the
string, even though technically it only points to the beginning of it;
it's the nearest thing C has to a string type. (To be honest, it's
still a lot better than many high level languages' string types for
certain common operations - eg trimming leading whitespace is pretty
efficient on a PSZ.) In your example, p would be some integer value
that is the pointer, but *p is 5. However, there's really no syntax in
C to say what the "string value" is.

ChrisA
 

Chris Angelico

The last one looks decorator-ish, but maybe that's proper. The implementation of this would be quite decorator-like: take the "normal" value of x, pass it through the indicated function, assign that value back to x.

Try these on for size.

    head, @tuple tail = sequence
    def foo(@list args, @dict kwargs): pass
    foo(@args, @kwargs)

That's reasonably clean as a concept, but it's not really quite the
same. None of these examples is the way a decorator works; each of
them requires a fundamental change to the way Python handles the rest
of the statement.

head, @tuple tail = sequence
-- Does this mean "take the second element of a two-element sequence,
pass it through tuple(), and store the result in tail"? Because, while
that might be useful, and would make perfect sense as a decorator,
it's insufficient as a replacement for current "*tail" syntax.

ChrisA
 

Evan Driscoll

Try these on for size.

head, @tuple tail = sequence
def foo(@list args, @dict kwargs): pass
foo(@args, @kwargs)

For backward compatibility, we could say that the unary * is identical to @list and unary ** is identical to @dict.
I like this idea much more than the original one. In addition to the
arguments buck puts forth, which I find compelling, I have one more: you
go to great length to say "this isn't really type checking in any sense"
(which is true)... but then you go forth and pick a syntax that looks
almost exactly like how you name types in many languages! (In fact,
except for the fact that it's inline, the 'object :: type' syntax is
*exactly* how you name types in Haskell.)

buck's syntax still has some of the feel of "I wonder if this is type
checking" to a newbie, but much much less IMO.


I have a bigger objection with the general idea, however. It seems very
strange that you should have to specify types to use it. If the */**
syntax were removed, that would make the proposed syntax very, very
unusual for Python. I could be missing something, but I can't think of any
other place where you have to name a type except where the type is an
integral part of what you're trying to do. (I would not say that
choosing between tuples and lists is an integral part of dealing with
vararg functions.) If */** were to stick around, I could see 99% of
users continuing to use them. And then what has the new syntax achieved?

You can fix this if you don't require the types and just allow the user
to say "def foo(@args)" and "foo(@args)". Except... that's starting to
look pretty familiar... (Not to mention if you just omit the type from
the examples above you need another way to distinguish between args and
kwargs.)


I have one more suggestion.

I do have one more thing to point out, which is that currently the
Python vararg syntax is very difficult to Google for. In the first pages
of the four searches matching "python (function)? (star | asterisk)",
there was just one relevant hit on python.org which wasn't a bug report.
I certainly remember having a small amount of difficulty figuring out
what the heck * and ** did the first time I encountered them.

This would suggest perhaps some keywords might be called for instead of
operators. In the grand scheme of things the argument packing and
unpacking are not *all* that common, so I don't think the syntactic
burden would be immense. The bigger issue, of course, would be picking
good words.

This also helps with the issue above. Let's say we'll use 'varargs' and
'kwargs', though the latter is perhaps too well-ingrained in code to steal. (I
don't want to get too much into the debate over *what* word to choose.
Also these don't match the 'head, *tail = l' syntax very well.) Then we
could say:
def foo(varargs l, kwargs d):
bar(varargs l, kwargs d)
and varargs would be equivalent to * and kwargs would be equivalent to
**. But then you could also say
def foo(varargs(list) l, kwargs(dict) d)
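A hypothetical desugaring of these forms into current Python ('varargs' and 'kwargs' are not real keywords; the function bodies below are illustrative only):

```python
# def foo(varargs(list) l, kwargs(dict) d): the declared type is applied
# to the packed positional arguments.
def foo(*l, **d):
    return list(l), d

# bar(varargs l, kwargs d) at a call site is today's bar(*l, **d).
def bar(*l, **d):
    return l, d

result = bar(*[1, 2], **{'x': 3})
```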

Evan








 
Evan Driscoll

I do have one more thing to point out, which is that currently the
Python vararg syntax is very difficult to Google for. In the first pages
of the four searches matching "python (function)? (star | asterisk)",
there was just one relevant hit on python.org which wasn't a bug report.
Though in the interest of full disclosure, and I should have said this
before, the first hit for all of those searches except "python star"
*is* relevant and reasonably helpful; just from someone's blog post
instead of "from the horse's mouth", so to speak.

Evan


 
