Death to tuples!

M

Mike Meyer

It seems that the distinction between tuples and lists has slowly been
fading away. What we call "tuple unpacking" works fine with lists on
either side of the assignment, and iterators on the values side. IIRC,
"apply" used to require that the second argument be a tuple; it now
accepts sequences, and has been depreciated in favor of *args, which
accepts not only sequences but iterators.

Is there any place in the language that still requires tuples instead
of sequences, except for use as dictionary keys?

If not, then it's not clear that tuples as a distinct data type still
serves a purpose in the language. In which case, I think it's
appropriate to consider doing away with tuples. Well, not really: just
changing their intended use, changing the name to note that, and
tweaking the implementation to conform to this.

The new intended use is as an immutable sequence type, not a
"lightweight C struct". The new name to denote this new use -
following in the footsteps of the set type - is "frozenlist". The
changes to the implementation would be adding any non-mutating methods
of list to tuple, which appears to mean "index" and "count".

Removing the tuple type is clearly a Py3K action. Adding frozenlist
could be done now. Whehter or not we could make tuple an alias for
frozenlist before Py3K is an interesting question.

<mike
 
P

Paul Rubin

Mike Meyer said:
The new intended use is as an immutable sequence type, not a
"lightweight C struct". The new name to denote this new use -
following in the footsteps of the set type - is "frozenlist". The
changes to the implementation would be adding any non-mutating methods
of list to tuple, which appears to mean "index" and "count".

I like this. I also note that tuples predated classes. The
appropriate way to do a C struct these days is often as a class
instance with __slots__.
 
D

Dan Bishop

Mike said:
It seems that the distinction between tuples and lists has slowly been
fading away. What we call "tuple unpacking" works fine with lists on
either side of the assignment, and iterators on the values side. IIRC,
"apply" used to require that the second argument be a tuple; it now
accepts sequences, and has been depreciated in favor of *args, which
accepts not only sequences but iterators.

Is there any place in the language that still requires tuples instead
of sequences, except for use as dictionary keys?

The % operator for strings. And in argument lists.

def __setitem__(self, (row, column), value):
...
 
S

Sybren Stuvel

Mike Meyer enlightened us with:
Is there any place in the language that still requires tuples
instead of sequences, except for use as dictionary keys?

Anything that's an immutable sequence of numbers. For instance, a pair
of coordinates. Or a value and a weight for that value.
If not, then it's not clear that tuples as a distinct data type
still serves a purpose in the language. In which case, I think it's
appropriate to consider doing away with tuples.

I really disagree. There are countless examples where adding or
removing elements from a list just wouldn't be right.
The new intended use is as an immutable sequence type, not a
"lightweight C struct".

It's the same, really. A lightweight list of elements, where each
element has its own meaning, is both an immutable sequence as well as
a lightweight C struct.

Sybren
 
P

Peter Hansen

Mike said:
It seems that the distinction between tuples and lists has slowly been
fading away. What we call "tuple unpacking" works fine with lists on
either side of the assignment, and iterators on the values side. IIRC,
"apply" used to require that the second argument be a tuple; it now
accepts sequences, and has been depreciated in favor of *args, which
accepts not only sequences but iterators.

Is there any place in the language that still requires tuples instead
of sequences, except for use as dictionary keys?

Would it be possible to optimize your "frozenlist" so that the objects
would be created during compilation time and rather than only during
runtime? If not then tuples() have a distinct performance advantage in
code like the following where they are used as local constants:
.... if x in (1, 3, 5, 7, 8):
.... print 'x is really odd'
.........
3 20 LOAD_FAST 0 (x)
23 LOAD_CONST 8 ((1, 3, 5, 7, 8))
26 COMPARE_OP 6 (in)
.... if x in [1,3,5,7,8]:
.... print 'x is really odd'
........
3 20 LOAD_FAST 0 (x)
23 LOAD_CONST 2 (1)
26 LOAD_CONST 3 (3)
29 LOAD_CONST 4 (5)
32 LOAD_CONST 5 (7)
35 LOAD_CONST 6 (8)
38 BUILD_LIST 5
41 COMPARE_OP 6 (in)


-Peter
 
A

Antoon Pardon

Op 2005-11-28 said:
Mike said:
It seems that the distinction between tuples and lists has slowly been
fading away. What we call "tuple unpacking" works fine with lists on
either side of the assignment, and iterators on the values side. IIRC,
"apply" used to require that the second argument be a tuple; it now
accepts sequences, and has been depreciated in favor of *args, which
accepts not only sequences but iterators.

Is there any place in the language that still requires tuples instead
of sequences, except for use as dictionary keys?

Would it be possible to optimize your "frozenlist" so that the objects
would be created during compilation time and rather than only during
runtime? If not then tuples() have a distinct performance advantage in
code like the following where they are used as local constants:
... if x in (1, 3, 5, 7, 8):
... print 'x is really odd'
.......
3 20 LOAD_FAST 0 (x)
23 LOAD_CONST 8 ((1, 3, 5, 7, 8))
26 COMPARE_OP 6 (in)
... if x in [1,3,5,7,8]:
... print 'x is really odd'
......
3 20 LOAD_FAST 0 (x)
23 LOAD_CONST 2 (1)
26 LOAD_CONST 3 (3)
29 LOAD_CONST 4 (5)
32 LOAD_CONST 5 (7)
35 LOAD_CONST 6 (8)
38 BUILD_LIST 5
41 COMPARE_OP 6 (in)

I'm probably missing something, but what would be the problem if this
list was created during compile time?
 
D

Duncan Booth

Antoon said:
def func(x):
... if x in [1,3,5,7,8]:
... print 'x is really odd'
...
dis.dis(func)
...
3 20 LOAD_FAST 0 (x)
23 LOAD_CONST 2 (1)
26 LOAD_CONST 3 (3)
29 LOAD_CONST 4 (5)
32 LOAD_CONST 5 (7)
35 LOAD_CONST 6 (8)
38 BUILD_LIST 5
41 COMPARE_OP 6 (in)

I'm probably missing something, but what would be the problem if this
list was created during compile time?

Not much in this particular instance. 'x in aList' is implemented as
aList.__contains__(x), so there isn't any easy way to get hold of the
list[*] and keep a reference to it. On the other hand:

def func(x):
return x + [1, 3, 5, 7, 8]

we could pass in an object x with an add operator which gets hold of its
right hand operand and mutates it.

So the problem is that we can't just turn any list used as a constant into
a constant list, we need to be absolutely sure that the list is used only
in a few very restricted circumstances, and since there isn't actually any
benefit to using a list here rather than a tuple it hardly seems
worthwhile.

There might be some mileage in compiling the list as a constant and copying
it before use, but you would have to do a lot of timing tests to be sure.

[*] except through f.func_code.co_consts, but that doesn't count.
 
A

Aahz

Is there any place in the language that still requires tuples instead
of sequences, except for use as dictionary keys?

If not, then it's not clear that tuples as a distinct data type
still serves a purpose in the language. In which case, I think it's
appropriate to consider doing away with tuples. Well, not really:
just changing their intended use, changing the name to note that, and
tweaking the implementation to conform to this.

There's still the obstacle known as Guido. I suggest you write a PEP so
that whatever decision gets made, there's a document.
 
M

Mike Meyer

Steven Bethard said:
Dan said:
The % operator for strings. And in argument lists.
def __setitem__(self, (row, column), value):
...
Interesting that both of these two things[1][2] have recently been
suggested as candidates for removal in Python 3.0.
[1]http://www.python.org/dev/summary/2005-09-01_2005-09-15.html#string-formatting-in-python-3-0
[2]http://www.python.org/dev/summary/2005-09-16_2005-09-30.html#removing-nested-function-parameters

#2 I actually mentioned in passing, as it's part of the general
concept of tuple unpacking. When names are bound, you can use a
"tuple" for an lvalue, and the sequence on the rhs will be "unpacked"
into the various names in the lvalue:

for key, value = mydict.iteritems(): ...
a, (b, c) = (1, 2), (3, 4)

I think of the parameters of a function as just another case of
this; any solution that works for the above two should work for
function paremeters as well.

<mike
 
M

Mike Meyer

Peter Hansen said:
Would it be possible to optimize your "frozenlist" so that the objects
would be created during compilation time and rather than only during
runtime? If not then tuples() have a distinct performance advantage
in code like the following where they are used as local constants:

No, because I want to change the definition.

Tuples have the problem that they are immutable, except when they're
not (or for proper values of immutable, your choice). They're
hashable, except when they're not. Or equivalently, they can be used
as dictionary keys - or set elements - except when they can't.

I want frozenlists to *not* be that way. A frozenlist should *always*
be valid as a dictionary key. This changes the semantics significantly
- and means that creating a frozenlist from a list could be very
expensive.

Doing this requires extending the "frozen" concept, adding two new
types, a new magic method, and maybe a new builtin function.

Right now, a "frozenset" is always valid as a dictionary key. The concept
extension is to provide facilities to freeze nearly everything.

New types:
frozenlist: a tuple with count/index methods, and the constraint
that all the elements are frozen.
frozendict: A dictionary without __setitem__, and with the constraint
that all values stored in it are frozen.

New magic method: __freeze__(self): returns a frozen version of self.

Possibe new builtin:
freeze(obj) - returns the frozen form of obj. For lists, it
returns the equivalent frozenlist; for a set, it returns the
equivalent frozenset. For dicts, it returns the equivalent
frozendict. For other built-ins, it returns the original object.
For classes written in Python, if the instance has __freeze__, it
returns the result of calling that. Otherwise, if the instance
has __hash__, or does not have __cmp__ and __eq__ (i.e. - if
the class is hashable as is), it returns the instance. If all
of that fails, it throws an exception. While not strictly
required, it appears to be handy, and it makes thinking about
the new types much less simler.

You can't always create a frozenlist (or frozendict) at compile time. I.e.:

a = []
foo(a)
fa = frozenlist([a])

You don't know at compile time what a is going to be. Since you can't
change the frozenlist after creation, you have to wait until you know
the value of a to create fa.

This version is *not* suitable as an alias for tuple. It might work as
a replacement for tuple in Py3K, but a number of issues have to be
worked out - most notable tuple packing. % on strings is easy - you
make it accept either a list or a frozenlist (or, if we're going to
change it, maybe an arbitary sequence). But tuple unpacking would need
some real work. Do we blow off the comma syntax for creating tuples,
and force peole to write real lists? Doesn't seem like a good idea to
me.

<mike
 
S

skip

Mike> Tuples have the problem that they are immutable, except when
Mike> they're not (or for proper values of immutable, your
Mike> choice). They're hashable, except when they're not. Or
Mike> equivalently, they can be used as dictionary keys - or set
Mike> elements - except when they can't.

For those of us not following this thread closely, can you identify cases
where tuples are mutable, not hashable or can't be used as dictionary keys?
I've never encountered any such cases.

Skip
 
P

Paul Rubin

For those of us not following this thread closely, can you identify cases
where tuples are mutable, not hashable or can't be used as dictionary keys?
I've never encountered any such cases.

t = ([1,2], [3,4])
 
A

Aahz

For those of us not following this thread closely, can you identify
cases where tuples are mutable, not hashable or can't be used as
dictionary keys? I've never encountered any such cases.

t = ([1,2], [3,4])

Exactly. Technically, the tuple is still immutable, but because it
contains mutable objects, it is no longer hashable and cannot be used as
a dict key. One of the odder bits about this usage is that augmented
assignment breaks, but not completely:
Traceback (most recent call last):
([1, 2, 5], [3, 4])

(I'm pretty sure Skip has seen this before, but I figure it's a good
reminder.)
 
P

Paddy

I would consider
t = ([1,2], [3,4])
to be assigning a tuple with two list elements to t.
The inner lists will be mutable but I did not know you could change the
outer tuple and still have the same tuple object.

- Pad.
 
P

Paul Rubin

Paddy said:
I would consider
t = ([1,2], [3,4])
to be assigning a tuple with two list elements to t.
The inner lists will be mutable but I did not know you could change the
outer tuple and still have the same tuple object.

Whether t is mutable is a question of definitions. Either way, you
can't hash t or use it as a dictionary key.
 
M

Mike Meyer

Mike> Tuples have the problem that they are immutable, except when
Mike> they're not (or for proper values of immutable, your
Mike> choice). They're hashable, except when they're not. Or
Mike> equivalently, they can be used as dictionary keys - or set
Mike> elements - except when they can't.
For those of us not following this thread closely, can you identify cases
where tuples are mutable, not hashable or can't be used as dictionary keys?
I've never encountered any such cases.

Actually, that didn't come from this thread. But it happens if one of
the elements in the tuple is mutable. For instance:
Traceback (most recent call last):
File said:
t[0].append(1)
t == [], False
{t: 23}
Traceback (most recent call last):

For builtins, the three cases - hashable, immutable and usable as
dictionary keys - are all identical. Instances of Python classes with
proper __hash__ and __eq__ defintions will be hashable and usable as
dictionary keys. Immutable for them is a messy question.

<mike
 
S

skip

>>> For those of us not following this thread closely, can you identify
>>> cases where tuples are mutable, not hashable or can't be used as
>>> dictionary keys? I've never encountered any such cases.
>>
>> t = ([1,2], [3,4])

...
>>>> t = ([1,2], [3,4])
>>>> t[0] += [5]
aahz> Traceback (most recent call last):
aahz> ([1, 2, 5], [3, 4])

aahz> (I'm pretty sure Skip has seen this before, but I figure it's a
aahz> good reminder.)

Actually, no, I hadn't. I don't use tuples that way. It's rare when I have
a tuple whose elements are not all floats, strings or ints, and I never put
mutable containers in them.

Skip
 
P

Peter Hansen

Antoon said:
Op 2005-11-28 said:
Would it be possible to optimize your "frozenlist" so that the objects
would be created during compilation time and rather than only during
runtime? If not then tuples() have a distinct performance advantage in
code like the following where they are used as local constants:
[snip code]
I'm probably missing something, but what would be the problem if this
list was created during compile time?

?? I was *asking* if there would be a problem doing that, not saying
there would be a problem. I think you misread or misunderstood something.

-Peter
 
S

Steven Bethard

Mike said:
Steven Bethard said:
Dan said:
Mike Meyer wrote:


Is there any place in the language that still requires tuples instead
of sequences, except for use as dictionary keys?

The % operator for strings. And in argument lists.
def __setitem__(self, (row, column), value):
...

Interesting that both of these two things[1][2] have recently been
suggested as candidates for removal in Python 3.0.
[1]http://www.python.org/dev/summary/2005-09-01_2005-09-15.html#string-formatting-in-python-3-0
[2]http://www.python.org/dev/summary/2005-09-16_2005-09-30.html#removing-nested-function-parameters

#2 I actually mentioned in passing, as it's part of the general
concept of tuple unpacking. When names are bound, you can use a
"tuple" for an lvalue, and the sequence on the rhs will be "unpacked"
into the various names in the lvalue:

for key, value = mydict.iteritems(): ...
a, (b, c) = (1, 2), (3, 4)

I think of the parameters of a function as just another case of
this; any solution that works for the above two should work for
function paremeters as well.

The difference is that currently, you have to use tuple syntax in
functions, while you have your choice of syntaxes with normal unpacking::

py> def f(a, (b, c)):
.... pass
....
py> def f(a, [b, c]):
.... pass
....
Traceback ( File "<interactive input>", line 1
def f(a, [b, c]):
^
SyntaxError: invalid syntax
py> a, (b, c) = (1, 2), (3, 4)
py> a, [b, c] = (1, 2), (3, 4)
py> a, [b, c] = [1, 2], (3, 4)
py> a, [b, c] = [1, 2], [3, 4]

Of course, the result in either case is still a tuple. So I do agree
that Python doesn't actually require tuples in function definitions;
just their syntax.

STeVe
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,770
Messages
2,569,586
Members
45,089
Latest member
Ketologenic

Latest Threads

Top