Tuples vs. variable-length argument lists

Spencer Pearson

Hi!

This might be more of a personal-preference question than anything,
but here goes: when is it appropriate for a function to take a list or
tuple as input, and when should it allow a varying number of
arguments? It seems as though the two are always interchangeable. For
a simple example...

def subtract(x, nums):
    return x - sum(nums)

... works equally well if you define it as "subtract(x, *nums)" and
put an asterisk in front of any lists/tuples you pass it. I can't
think of any situation where you couldn't convert from one form to the
other with just a star or a pair of parentheses.
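
To make the comparison concrete, here is a minimal sketch of the two forms side by side. The names subtract_seq and subtract_var are made up for this illustration so that both versions can coexist; they are not from any library.

```python
def subtract_seq(x, nums):
    # Takes the numbers as a single sequence argument.
    return x - sum(nums)


def subtract_var(x, *nums):
    # Takes the numbers as separate positional arguments;
    # inside the function, nums is a tuple.
    return x - sum(nums)


values = [1, 2, 3]

print(subtract_seq(10, values))     # pass the list as it is
print(subtract_var(10, *values))    # unpack the list with a star
print(subtract_var(10, 1, 2, 3))    # separate arguments
print(subtract_seq(10, (1, 2, 3)))  # wrap separate values in parentheses
```

All four calls compute 10 - (1 + 2 + 3), so each one prints 4.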

Is there a generally accepted convention for which method to use? Is
there ever actually a big difference between the two that I'm not
seeing?
 
Steven D'Aprano

Spencer said:
Hi!

This might be more of a personal-preference question than anything, but
here goes: when is it appropriate for a function to take a list or tuple
as input, and when should it allow a varying number of arguments?

That depends on the function as well as your personal preference.

Of course, you can also follow the lead of max and min and accept both:
>>> max(1, 2, 3, 4) == max([1, 2, 3, 4])
True

You need to consider which is more "natural" for the semantics of the
function.

Spencer said:
It seems as though the two are always interchangeable.

Not always. Consider sum(a, b, c, d).

Should that be "sum the sequence [a, b, c, d]" or "sum the sequence [a,
b, c] with initial value d"? You might ask what difference it makes, but
it may make a big difference:

* if a...d are floats, there may be differences in rounding errors
between a+b+c+d and d+a+b+c

* if a...d are lists, the order that you do the addition matters:
["a", "b", "c"] + ["d"] != ["d"] + ["a", "b", "c"]

* even if a...d are integers, it may make a big difference for
performance. Perhaps not for a mere four arguments, but watch:
>>> n = 10**1000000
>>> seq = [n] + list(range(10001))
>>> from timeit import Timer
>>> t1 = Timer("sum(seq)", "from __main__ import n, seq; seq.append(-n)")
>>> t2 = Timer("sum(seq, -n)", "from __main__ import n, seq")
>>> min(t1.repeat(number=1))
6.1270790100097656
>>> min(t2.repeat(number=1))
0.012988805770874023

In the first case, all the intermediate calculations are done on a
million-digit long int; in the second case they are not.
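
The first two bullets above can be checked with a short sketch. The values 1e16 and 1.0 are chosen for this illustration only, and the float case uses explicit + rather than sum(), since the point is the order of the additions and not how any particular sum() implementation handles floats.

```python
# Floats: the same four values added in two different orders.
a, b, c, d = 1e16, 1.0, 1.0, -1e16

left_to_right = a + b + c + d  # the 1.0s are lost next to 1e16
start_first = d + a + b + c    # d cancels a first, so the 1.0s survive

print(left_to_right)  # 0.0
print(start_first)    # 2.0

# Lists: the start value of sum() ends up at the front of the result.
lists = [["a"], ["b"], ["c"]]

appended = sum(lists + [["d"]], [])  # "d" treated as the last item
started = sum(lists, ["d"])          # "d" treated as the start value

print(appended)  # ['a', 'b', 'c', 'd']
print(started)   # ['d', 'a', 'b', 'c']
```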

Spencer said:
[...]
I can't think of any situation where you couldn't convert from one form
to the other with just a star or a pair of parentheses.

Of course you can convert, but consider that this is doing a conversion.
Perhaps that's unnecessary work for your specific function? You are
packing and unpacking a sequence into a tuple. Whether this is a good
thing or a bad thing depends on the function.
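
As a rough illustration of that conversion, the sketch below shows that a starred call builds a new tuple, while a plain sequence argument is passed through untouched. The function names as_varargs and as_sequence are hypothetical, used only for this example.

```python
def as_varargs(*nums):
    # Python packs the positional arguments into a new tuple.
    return nums


def as_sequence(nums):
    # The caller's object is passed through unchanged.
    return nums


data = [1, 2, 3]

packed = as_varargs(*data)  # unpack the list, then repack it as a tuple
same = as_sequence(data)    # no conversion at all

print(type(packed).__name__)  # tuple
print(same is data)           # True
print(packed is data)         # False
```

For a short list the extra copy is negligible; for a very large sequence it is work the single-argument form avoids.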

Spencer said:
Is there a generally accepted convention for which method to use? Is
there ever actually a big difference between the two that I'm not
seeing?

If you think of the function as operating on a single argument which is a
sequence of arbitrary length, then it is best written to take a single
sequence argument.

If you think of the function as operating on an arbitrary number of
arguments, then it is best written to take an arbitrary number of
arguments.

If you think it doesn't matter, then choose whichever model is less work
for you.

If neither seems better than the other, then choose arbitrarily.

If you don't like the idea of making an arbitrary choice, or if your
users complain, then support both models (if possible).
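
Supporting both models could look something like the sketch below, loosely in the spirit of max and min. This is only one possible approach, not how the built-ins are implemented: the rule "a single list or tuple argument is treated as the sequence" is an assumption made for this example.

```python
def subtract(x, *nums):
    # Assumed rule for this sketch: if exactly one extra argument was
    # given and it is a list or tuple, use it as the sequence of numbers.
    if len(nums) == 1 and isinstance(nums[0], (list, tuple)):
        nums = nums[0]
    return x - sum(nums)


print(subtract(10, 1, 2, 3))    # separate arguments
print(subtract(10, [1, 2, 3]))  # one sequence argument
print(subtract(10, 3))          # a single number is not a sequence
```

The first two calls print 4 and the third prints 7. The "if possible" caveat matters here: a check like this only works when a lone list or tuple can never be a legitimate single value for the function.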
 
Jean-Michel Pichavant

Spencer said:
Hi!

This might be more of a personal-preference question than anything,
but here goes: when is it appropriate for a function to take a list or
tuple as input, and when should it allow a varying number of
arguments? It seems as though the two are always interchangeable. For
a simple example...

def subtract(x, nums):
    return x - sum(nums)

... works equally well if you define it as "subtract( x, *nums )" and
put an asterisk in front of any lists/tuples you pass it. I can't
think of any situation where you couldn't convert from one form to the
other with just a star or a pair of parentheses.

Is there a generally accepted convention for which method to use? Is
there ever actually a big difference between the two that I'm not
seeing?

FYI, some linters report the use of * as bad practice, though I don't
know the reason. Pylint reports it as using 'magic'.
Anyway, the form without * is commonly used.

JM
 
