lists v. tuples

MartinRinehart

What are the considerations in choosing between:

return [a, b, c]

and

return (a, b, c) # or return a, b, c

Why is the immutable form the default?
 
castironpi

What are the considerations in choosing between:

   return [a, b, c]

and

    return (a, b, c) # or return a, b, c

Why is the immutable form the default?

Using a house definition from some weeks ago, a tuple is a data
structure which cannot contain a reference to itself. Can a
single expression ever refer to itself?
 
Ninereeds

What are the considerations in choosing between:

return [a, b, c]

and

return (a, b, c) # or return a, b, c

Why is the immutable form the default?

My understanding is that the immutable form is not the default -
neither form is a default. The syntax for mutable lists requires
square brackets. The syntax for immutable tuples only requires commas.
However, commas are also used for other purposes. The parens for
tuples are the same parens that might wrap any subexpression, in this
case guarding against misinterpretation of commas. Since the parens
are normally included for consistency as well as disambiguation, they
end up being part of the tuple psychologically, but to Python itself
they are separate.
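
A small sketch of the comma-versus-parens point, in case it helps (the variable names are only for illustration):

```python
# The comma builds the tuple; the parentheses only group.
t1 = 1, 2, 3
t2 = (1, 2, 3)
same = (t1 == t2)

not_a_tuple = (1)    # just the integer 1 inside grouping parens
one_tuple = 1,       # the trailing comma makes a one-element tuple

# Parens become necessary where a bare comma already means something
# else, such as separating function arguments.
length = len((1, 2, 3))   # one tuple argument, not three arguments

# The empty tuple is the one case written with parens and no comma.
empty = ()
```

So `(1)` is an int, `1,` is a tuple, and the parens in `len((1, 2, 3))` are doing disambiguation work rather than tuple-building work.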

Personally, I'd make the parens compulsory so that mindsets and
language specification are better aligned. If, that is, I was
inventing a new language. But you can be sure that there's plenty of
code out there that would break if a change like that was made now.

As for choosing, if you never plan to modify the members of the
sequence, a tuple expresses that intent and allows Python to enforce
it. Some operations on tuples are probably also more efficient as a
result. That said, 90%+ of the time I use a list either way. After
all, as requirements change I might find I do need to modify after
all.

A change from tuple to list in any non-trivial case would require a
thorough test suite to ensure that all cases are updated correctly
(partly because Python is dynamically typed), and ashamed as I am to
admit it, my test suites are rarely that thorough. Without that
testing, there might be an obscure case where you still create a
tuple, and that tuple gets passed to code that expects a list and
tries to replace an element.
 
Ninereeds

Using a house definition from some weeks ago, a tuple is a data
structure which cannot contain a reference to itself. Can a
single expression ever refer to itself?

Can't imagine why that feature was highlighted in particular, but a
list can reference itself even though an expression can't.

The following example looks a bit self-referential, but isn't...

a = 2
a = [1, a, 3] # result [1, 2, 3]

The following additional line, however, does create a self-referencing
list...

a[1] = a

The result being [1, [...], 3]

It's nice to see that Python can handle the output for this without
going into an infinite recursion - which is exactly what it used to do
in the distant past.

A tuple cannot be made to reference itself because it cannot be
modified after creation. The key point is that lists are mutable,
whereas tuples are not.
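
Put together as something you can paste into an interpreter, a minimal sketch (the second half, a tuple taking part in a cycle through a mutable element, is my addition and not something claimed above):

```python
a = 2
a = [1, a, 3]      # the old value of a is read first, giving [1, 2, 3]
a[1] = a           # now the list really does contain itself
shown = repr(a)    # the cycle is displayed as [...]

t = (1, 2, 3)
try:
    t[1] = t       # a tuple slot cannot be re-assigned after creation
    error = None
except TypeError as exc:
    error = str(exc)

# A tuple can still sit in a cycle indirectly, via a mutable element.
u = ([],)
u[0].append(u)
indirect_cycle = u[0][0] is u
```

The tuple itself is never modified in the last step; only the list inside it is.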
 
Duncan Booth

What are the considerations in choosing between:

return [a, b, c]

and

return (a, b, c) # or return a, b, c

A common explanation for this is that lists are for homogeneous
collections, tuples are for when you have heterogeneous collections, i.e.
related but different things.

If you follow this line of argument then when you want to return some
values from a function, e.g. url, headers and data, a tuple would be the
appropriate thing to use.

If you really are returning a homogeneous collection (e.g. top 5 urls)
then a list would be more appropriate.

Another way to look at it is what you expect the caller to do with the
results. If they are just going to unpack a fixed set of values then a
tuple makes sense. If you write:

return url, headers, data

then the caller writing:

url, headers, data = frobozz()

has a certain symmetry.
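
As a rough sketch of both cases (the function bodies and values here are made-up placeholders, not a real fetcher):

```python
def frobozz():
    """Hypothetical fetch: a fixed set of related but different things."""
    url = "http://example.com/"
    headers = {"Content-Type": "text/plain"}
    data = b"hello"
    return url, headers, data      # a tuple, built by the commas


def top_urls():
    """Hypothetical: a variable-length collection of like things."""
    return ["http://example.com/a", "http://example.com/b"]


# Unpacking mirrors the return statement.
url, headers, data = frobozz()

# A homogeneous result is simply iterated over.
lengths = [len(u) for u in top_urls()]
```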

It isn't set in stone of course: use whatever you want or whatever feels
right at the time.

Why is the immutable form the default?

It isn't. Python uses whichever type you specify, but tuples may involve a
little less typing.
 
Robert Bossy

What are the considerations in choosing between:

return [a, b, c]

and

return (a, b, c) # or return a, b, c

Why is the immutable form the default?

Using a house definition from some weeks ago, a tuple is a data
structure which cannot contain a reference to itself. Can a
single expression ever refer to itself?

In some way, I think this answer will be more confusing than
enlightening to the original poster...

The difference is that lists are mutable, tuples are not. That means you
can do the following with a list:
- add element(s)
- remove element(s)
- re-assign element(s)
These operations are impossible on tuples. So, by default, I use lists
because they offer more functionality.
But if I want to make sure the sequence is not modified later, I
use tuples. The most frequent case is when a function (or method)
returns a sequence whose fate is to be unpacked, things like:

def connect(self, server):
    # try to connect to server
    return (handler, message,)

It is pretty obvious that the returned value will (almost) never be used
as is, the caller will most probably want to unpack the pair. Hence the
tuple instead of list.

There's a little caveat for beginners: the tuple is immutable, which
doesn't mean that each element of the tuple is necessarily immutable.
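
A minimal sketch of that caveat (the record contents are invented for the example):

```python
record = ("alice", [90, 85])   # a tuple holding a str and a list

record[1].append(77)           # fine: the list inside is still mutable

rebinding_failed = False
try:
    record[1] = []             # not fine: a tuple slot cannot be re-assigned
except TypeError:
    rebinding_failed = True

# A side effect: a tuple with a mutable element cannot be hashed,
# so it cannot be used as a dict key or set member.
unhashable = False
try:
    hash(record)
except TypeError:
    unhashable = True
```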

Also, I have read several times that tuples are more efficient than lists;
however, I haven't been able to measure a noticeable difference myself.
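
One way to look for a difference yourself, as a sketch only: the numbers depend on the interpreter and version, and the smaller tuple size is an assumption about CPython rather than a language guarantee.

```python
import sys
import timeit

items_list = [1, 2, 3]
items_tuple = (1, 2, 3)

# Memory footprint of the container itself (not of its elements).
list_size = sys.getsizeof(items_list)
tuple_size = sys.getsizeof(items_tuple)
print("bytes:", list_size, tuple_size)

# Time to build each literal many times; treat the result as a rough
# indication, not a benchmark.
list_time = timeit.timeit("[1, 2, 3]", number=100_000)
tuple_time = timeit.timeit("(1, 2, 3)", number=100_000)
print("seconds:", list_time, tuple_time)
```

In ordinary code the gap is usually small enough that intent (fixed record versus growing collection) is the better reason to pick one.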

Cheers,
RB
 
Ninereeds

A common explanation for this is that lists are for homogeneous
collections, tuples are for when you have heterogeneous collections, i.e.
related but different things.

I interpret this as meaning that in a data table, I should have a list
of records but each record should be a tuple of fields, since the
fields for a table usually have different forms whereas the records
usually all have the same record layout.

Makes sense, but not exactly *because* of the homogeneous/heterogeneous
thing but rather because a record is smaller than a table, and a
record's component fields are more closely bound together than the
records in a table.

In short, adding, removing and overwriting records are essential
operations (though Haskell programmers might disagree). Any
modifications to records themselves can practically be handled as
replacing one complete tuple with another.

As a final note, I tend to implement tables as either lists of
dictionaries, or lists of class instances. That way my fields are
named. It's better in maintenance terms if I need to add new fields to
the tables later on.
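
A sketch of both layouts, with an invented two-field record. The use of collections.namedtuple for the tuple version is my choice for illustration, since it gives tuple records with named fields:

```python
from collections import namedtuple

# Hypothetical record layout.
Person = namedtuple("Person", ["name", "age"])

# A table as a list of tuple records.
table = [Person("Ann", 31), Person("Bob", 27)]

# "Modifying" a record means replacing one complete tuple with another.
table[1] = table[1]._replace(age=28)

# Adding a record is an ordinary list operation.
table.append(Person("Cy", 45))

# The list-of-dictionaries alternative: fields are named and a new
# field can be added later without touching the other records.
dict_table = [{"name": "Ann", "age": 31}]
dict_table[0]["email"] = "ann@example.com"
```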
 
Duncan Booth

Ninereeds said:
I interpret this as meaning that in a data table, I should have a list
of records but each record should be a tuple of fields, since the
fields for a table usually have different forms whereas the records
usually all have the same record layout.

That is indeed what Python's Database API usually does (although it doesn't
mandate it):

.fetchmany([size=cursor.arraysize])

Fetch the next set of rows of a query result, returning a
sequence of sequences (e.g. a list of tuples). An empty
sequence is returned when no more rows are available.
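
For a concrete case, here is a sketch using the standard-library sqlite3 module with an in-memory database (the table and rows are invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE person (name TEXT, age INTEGER)")
cur.executemany(
    "INSERT INTO person VALUES (?, ?)",
    [("Ann", 31), ("Bob", 27), ("Cy", 45)],
)
cur.execute("SELECT name, age FROM person ORDER BY name")

first_batch = cur.fetchmany(2)   # a list of row tuples
rest = cur.fetchmany(2)          # whatever rows remain
done = cur.fetchmany(2)          # empty list once the rows run out
conn.close()
```

With sqlite3 each batch is a list and each row is a tuple, which matches the "list of tuples" example in the spec text.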
 
castironpi

I interpret this as meaning that in a data table, I should have a list
of records but each record should be a tuple of fields, since the
fields for a table usually have different forms whereas the records
usually all have the same record layout.

That is indeed what Python's Database API usually does (although it doesn't
mandate it):

        .fetchmany([size=cursor.arraysize])

            Fetch the next set of rows of a query result, returning a
            sequence of sequences (e.g. a list of tuples). An empty
            sequence is returned when no more rows are available.

Two things:
False

in S. D'Aprano's example, which somewhat obscures (casts a shadow on)
its intuitive meaning. "Hand me everything that's in there. > For x
in There: hand( me, x ).
For x in rchain( There ): hand( me, x ) < Hand me anything that's in anything that's in there, or is in there.

Closet = [ Box[ ball ] ] >
Box[ ball ]?
or Box, ball?

Despite that 'in' is a very common word [lacks cit.], you don't often
get ambiguities (read: questions) about containers in something. Too
custom. The word isn't defined at that extremity of combinations /
cartesian products of particulars (context). It doesn't have a common
sense, and computer science is a pretty unique field in that it finds
(or, invents) all-and-only definitions of words.

(Which reminds me, what about a classmethod __getitem__ constructor?)

However, from mathematics, which is at least older, if not more in
tune with the common senses and uses, (there exists an a and b such
that) Member( a, b ) != Exists x such that Member( a, b ) & Member( b,
x ). In other words, Python honors math's definition.

In the example, would you treat 'ball' separately unless explicitly
asked to? (But, keep sure not to alter the contents of any of the
containers that are in there.) There's a fine example of
miscommunication: neither behavior was the default between the pair:
one knew one default; the other knew another. (But, it's ok to alter
the contents of any containers.) Aside, do you think there are trends
among who and when uses what defaults? Such as math background,
social status, etc.?

Computers are very good at attention to detail: ("I" am.) "You did
not specify whether to alter containers in it." Of course, the
program that's following would have to have noticed two things: that
there's a short-cut if you can, and you might not want it to. It's
not only a problem of formal systems: No= { X, Y, Z }. Some (finite)
context hasn't determined the convention of wish-expression in every
combination of (future possible) cases.

while 1:
    print( "You made an assumption!" )
    print( "No, -you- made an assumption!" )
    break

while 1:
    print( "I made an assumption." )
    print( "I also made an assumption." )
But is it determined whether it's determined whether the person meant
to? That is, can you tell whether the person didn't specify? "I'm
sorry, we haven't spoken about containers in containers enough for me
to tell what your wishes are."

My worry is that popular judgment isn't best in science. What factors
in to the architect's decision? His feelings about use cases? His
calculations about use cases? His feelings about popularity of each?
His calculations about popularity? Maybe Python would have ten extra
speakers if __contains__ had been defined differently! X percent of
all shortest-form syntactic combinations can be written in fewer tokens
with one definition, but no matter how much I remind them, they won't
remember combination C, so only Y percent are actually, practically
ever written in 2008 Earth programming. Because we're stupid, no.
Because we have other priorities than programming. (But my bigger
house for me was more important.)

'rchain', recursive chain, might be the next function on the path that
itertools is on.
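
A rough guess at what such an 'rchain' could look like. It is purely hypothetical: nothing by that name exists in itertools, and the choice to descend only into lists and tuples is an assumption made for the sketch.

```python
def rchain(container, _seen=None):
    """Hypothetical 'recursive chain': yield every item in container,
    then every item inside nested lists or tuples, depth first."""
    if _seen is None:
        _seen = set()
    _seen.add(id(container))
    for item in container:
        yield item
        if isinstance(item, (list, tuple)) and id(item) not in _seen:
            # The id check keeps a self-referencing list from
            # recursing forever.
            yield from rchain(item, _seen)


ball = "ball"
box = [ball]
closet = [box]

one_level = ball in closet                           # plain 'in'
any_level = any(x is ball for x in rchain(closet))   # anything in anything
```

Plain `in` answers the "is in there" question; the recursive version answers "is in anything that's in there".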



Databases have come to use integers as autokeys in records (though they
don't have to keep doing so). That makes one way to refer to a record to refer
to its integer. However, if one's testing for identity between
records, that key is redundant. Just identify yourself, and refer
back to earlier in context. And incidentally, why namedtuples but no
namedlists? Or nameddicts? And could one define an anonymous type?

This is interesting:
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
NameError: name 'G' is not defined

A type went out of scope. Is there an abstraction of 'pickle' which
doesn't need the type defined in its target module?
pickle.PicklingError: Can't pickle <class '__main__.A'>: it's not
found as __main__.A

But all the information's there!
 
Duncan Booth

That's actually interesting.

Just for the avoidance of doubt, I didn't write the 'b in b' line:
castironpi is replying to himself without attribution.

P.S. I still don't see the relevance of any of castironpi's followup to my
post, but since none of it made any sense to me I guess it doesn't matter.
 
George Sakkis

Just for the avoidance of doubt, I didn't write the 'b in b' line:
castironpi is replying to himself without attribution.

P.S. I still don't see the relevance of any of castironpi's followup to my
post, but since none of it made any sense to me I guess it doesn't matter.


Plus, it does work fine over here:

Python 2.5.1 (r251:54863, May 8 2007, 14:46:30)
[GCC 3.4.6 20060404 (Red Hat 3.4.6-3)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> a = []
>>> a.append(a)
>>> a
[[...]]
>>> a in a
True


George
 
castironpi

b in b
Just for the avoidance of doubt, I didn't write the 'b in b' line:
castironpi is replying to himself without attribution.

P.S. I still don't see the relevance of any of castironpi's followup to my
post, but since none of it made any sense to me I guess it doesn't matter.

Well, it does show thought and it's an interesting anomaly. It's
related to the OP too (no less so than the one before it than any
other to the one before it): 'Why?' is pretty open-ended.

D'Aprano pointed out an ambiguity in the 'in' operator, which sparked
an idea that contributes to some other threads. I thought Booth and
Bossy, "homogeneous ... add+remove" pretty well summed it up. That's
my mistake for showing excitement to one thread that was actually
mostly prepared by another. I'm in A.I. if that helps any.

I am puzzled by the failure on 'a in a' for a=[a]. >>> a== [a] also
fails. Can we assume/surmise/deduce/infer it's intentional?
 
Duncan Booth

I am puzzled by the failure on 'a in a' for a=[a]. >>> a== [a] also
fails. Can we assume/surmise/deduce/infer it's intentional?

It may be less confusing if instead of an assignment followed by a test
you just consider doing the test at the same time as the assignment
(then again it changes the behaviour so it may just be more confusing).

There is a very limited set of values for which you would expect this
output:
>>> a==[a]
True
>>> a in [a]
True

The most obvious one is a simple recursive list:
>>> a = []
>>> a.append(a)
>>> a==[a]; a in [a]
True
True

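Pushing the recursion down a level breaks some of the comparisons:
>>> a = [[]]
>>> a[0].append(a)
>>> a==[a]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp
>>> a in [a]
True
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp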

which may be considered a bug or possibly it is just a limit of the
implementation.

This also produces an overflow (in fact I think it is just the same
situation as above, recursing on the same list is handled, recursing on
different but equivalent lists is not):
>>> a = []; a.append(a)
>>> b = []; b.append(b)
>>> a==b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp
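
For anyone repeating this on Python 3, a sketch of the same experiment. The sessions above are from Python 2.5; on Python 3.5 and later the error is RecursionError, which is a subclass of RuntimeError.

```python
a = []; a.append(a)
b = []; b.append(b)

# Comparing against the very same object succeeds, because element
# comparison checks identity before equality.
same_list_eq = (a == [a])
same_list_in = (a in [a])

# Two different but equivalent recursive lists have no identity
# shortcut, so the comparison keeps descending until the limit.
try:
    a == b
    overflowed = False
except RecursionError:
    overflowed = True
```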
 
castironpi

I am puzzled by the failure on 'a in a' for a=[a].  >>> a== [a] also
fails.  Can we assume/surmise/deduce/infer it's intentional?

It may be less confusing if instead of an assignment followed by a test
you just consider doing the test at the same time as the assignment
(then again it changes the behaviour so it may just be more confusing).

There is a very limited set of values for which you would expect this
output:
>>> a==[a]
True
>>> a in [a]
True

The most obvious one is a simple recursive list:
>>> a = []
>>> a.append(a)
>>> a==[a]; a in [a]
True
True

Pushing the recursion down a level breaks some of the comparisons:
>>> a = [[]]
>>> a[0].append(a)
>>> a==[a]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp
>>> a in [a]
True
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp

which may be considered a bug or possibly it is just a limit of the
implementation.

This also produces an overflow (in fact I think it is just the same
situation as above, recursing on the same list is handled, recursing on
different but equivalent lists is not):
>>> a = []; a.append(a)
>>> b = []; b.append(b)
>>> a==b
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
RuntimeError: maximum recursion depth exceeded in cmp

Pickle helps,
>>> pickle.dumps( a )
b'\x80\x02]q\x00h\x00a.'
>>> pickle.dumps( b )
b'\x80\x02]q\x00h\x00a.'
>>> pickle.dumps( a )== pickle.dumps( b )
True
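
The same idea as a self-contained sketch. Comparing pickles is a trick for this particular pair of lists, not a general equality test, and the exact bytes depend on the pickle protocol in use.

```python
import pickle

a = []; a.append(a)
b = []; b.append(b)

# pickle records the self-reference through its memo, so two lists
# with the same shape serialise to the same bytes.
same_bytes = pickle.dumps(a) == pickle.dumps(b)

# Round-tripping keeps the cycle: the copy contains itself, not a.
c = pickle.loads(pickle.dumps(a))
cycle_preserved = c[0] is c
```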

However,
>>> a= []
>>> b= []
>>> a.append( b )
>>> b.append( a )
>>> a
[[[...]]]
>>> b
[[[...]]]
>>> a== b
Traceback (most recent call last):
True

Liar's paradox, q.e.d.
 
