Steven D'Aprano
I would weaken that claim a tad... I'd say it is "usual" to write
something like this:
alist = []
for x in some_values:
    alist.append(something_from_x)
but it is not uncommon (at least not in my code) to write something
like this equivalent code instead:
alist = [None]*len(some_values)
for i, x in enumerate(some_values):
    alist[i] = something_from_x
I have never done this, except when I first started using Python, and --
maybe more importantly -- I've never seen this in others' code. It really
looks like a construct from someone who is still programming in some
other language(s).
It occurs at least twice in the 2.5 standard library, once in
sre_parse.py:
groups = []
groupsappend = groups.append
literals = [None] * len(p)
for c, s in p:
    if c is MARK:
        groupsappend((i, s))
        # literal[i] is already None
    else:
        literals[i] = s
and another time in xdrlib.py:
succeedlist = [1] * len(packtest)
count = 0
for method, args in packtest:
    print 'pack test', count,
    try:
        method(*args)
        print 'succeeded'
    except ConversionError, var:
        print 'ConversionError:', var.msg
        succeedlist[count] = 0
    count = count + 1
When will it be more natural to introduce an unnecessary index?
We can agree that the two idioms are functionally equivalent. Appending
is marginally less efficient, because the Python runtime engine has to
periodically resize the list as it grows, and that can in principle take
an arbitrary amount of time if it causes virtual memory paging. But
that's unlikely to be a significant factor for any but the biggest lists.
Just as any while-loop can be rewritten as a recursive function, and
vice versa, these two idioms can be trivially rewritten from one form to
the other. When should you use one or the other?
When the algorithm you have is conceptually about growing a list by
appending to the end, then you should grow the list by appending to the
end. And when the algorithm is conceptually about dropping values into
pre-existing pigeon holes, then you should initialize the list and then
walk it, modifying the values in place.
And if the algorithm is indifferent to which idiom you use, then you
should use whichever idiom you are most comfortable with, and not claim
there's Only One True Way to build a list.
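As a rough illustration of the distinction (not code from the thread; the
function names are invented, and it is written for Python 3 rather than
the 2.5 discussed here), the first two functions below build the same
list either way, while the third is a case where the positions exist up
front and values arrive out of order, so filling slots is the natural fit:

```python
def squares_by_append(values):
    # "Grow the list" idiom: start empty and append each result.
    result = []
    for x in values:
        result.append(x * x)
    return result


def squares_by_slots(values):
    # "Pigeon hole" idiom: pre-size the list, then fill each slot by index.
    result = [None] * len(values)
    for i, x in enumerate(values):
        result[i] = x * x
    return result


def place_by_position(pairs, size):
    # Each (position, value) pair names its own slot, so the values can
    # arrive in any order; appending would not put them in the right place.
    slots = [None] * size
    for position, value in pairs:
        slots[position] = value
    return slots
```

For the first two, the choice is a matter of taste; for the third,
appending would need an extra sort or lookup step.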
Everything acts by magic unless you know what it does. The Fortran
read(*,*)(a(i,j,k),j=1,3)
in the OP's first post looks like magic too.
It sure does. My memories of Fortran aren't good enough to remember what
that does.
But I think you do Python a disservice. One of my Perl coders was writing
some Python code the other day, and he was amazed at how guessable Python
was. You can often guess the right way to do something. He wanted a set
with all the elements of another set removed, so he guessed that s1-s2
would do the job -- and it did. A lot of Python is amazingly readable to
people with no Python experience at all. But not everything.
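A minimal sketch of that guess, with made-up example values (set literals
assume Python 2.7 or later): the `-` operator on sets gives the same
result as the `difference()` method and leaves both operands untouched.

```python
s1 = {1, 2, 3, 4}
s2 = {2, 4}

# Elements of s1 that are not in s2; neither input set is modified.
remaining = s1 - s2

# The spelled-out method form produces the same set.
remaining_via_method = s1.difference(s2)
```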
I admit that my code shows
off advanced Python features but I don't think ``with`` is one of them.
It makes it easier to write robust code and maybe even understandable
without documentation by just reading it as "English text".
The first problem with "with" is that it looks like the Pascal "with"
statement, but acts nothing like it. That may confuse anyone with Pascal
experience, and there are a lot of us out there.
The second difficulty is that:
with open('test.txt') as lines:
binds the result of open() to the name "lines". How is that different
from "lines = open('test.txt')"? I know the answer, but we shouldn't
expect newbies coming across it to be anything but perplexed.
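For readers who do not know the answer: both forms bind the same kind of
file object, but the `with` block also closes the file when the block is
left, even if an exception is raised inside it. A small sketch of that
difference, assuming Python 3 and using a throwaway temporary file (the
function name is invented for illustration):

```python
import os
import tempfile


def compare_open_styles():
    # Create a small scratch file to read back.
    fd, path = tempfile.mkstemp(suffix=".txt")
    with os.fdopen(fd, "w") as out:
        out.write("hello\n")
    try:
        # Plain assignment: the file stays open until close() is called.
        lines = open(path)
        plain_closed = lines.closed
        lines.close()

        # with-statement: same binding, but the file is closed on exit.
        with open(path) as lines:
            closed_inside = lines.closed
        closed_after = lines.closed
    finally:
        os.remove(path)
    return plain_closed, closed_inside, closed_after
```

Calling `compare_open_styles()` returns `(False, False, True)`: the name
is still bound after the block, but the file behind it has been closed.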
Now that the newbie has determined that lines is a file object, the very
next thing you do is assign something completely different to 'lines':
lines = (line for line in lines if line.strip())
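To make explicit what that one line packs in, here is a hedged sketch
(Python 3, with `io.StringIO` standing in for the opened file and a
hypothetical helper name) showing the parenthesised expression next to
the explicit loop it is roughly equivalent to:

```python
import io


def non_blank_lines(lines):
    # Explicit-loop version of: (line for line in lines if line.strip())
    for line in lines:
        if line.strip():
            yield line


source_text = "first\n\n   \nsecond\n"

# Iterating over a file-like object yields its lines, newline included.
from_genexp = list(line for line in io.StringIO(source_text) if line.strip())
from_loop = list(non_blank_lines(io.StringIO(source_text)))
```

Both lists come out as `['first\n', 'second\n']`; the blank and
whitespace-only lines are dropped because `line.strip()` is empty for them.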
So the reader needs to know that brackets aren't just for grouping like
in most other languages, but also that (x) can be equivalent to a for-
loop. They need to know, or guess, that iterating over a file object
returns lines of the file, and they have to keep the two different
bindings of "lines" straight in their head in a piece of code that uses
"lines" twice and "line" three times.
And then they hit the next line, which includes a function called
"partial", which has a technical meaning out of functional languages and
I am sure it will mean nothing whatsoever to anyone unfamiliar with it.
It's not something that is guessable, unlike open() or len() or append().
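Assuming the "partial" in question is `functools.partial` (added in
Python 2.5), a short sketch of what it does: it takes a function plus
some of its arguments and returns a new callable with those arguments
already filled in. The `power` function here is invented for illustration.

```python
from functools import partial


def power(base, exponent):
    return base ** exponent


# Freeze the keyword argument: square(x) calls power(x, exponent=2).
square = partial(power, exponent=2)

# Freeze the first positional argument: two_to_the(n) calls power(2, n).
two_to_the = partial(power, 2)
```

So `square(5)` gives 25 and `two_to_the(10)` gives 1024, with no lambda
or wrapper function written by hand.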
Do you mean `lines`? Then I disagree because the (duck) type is always
"iterable over lines". I just changed the content by filtering.
Nevertheless, for people coming from less dynamic languages than Python
(such as Fortran), it is a common idiom to never use the same variable
for two different things. It's not a bad choice really: imagine reading a
function where the name "lines" started off as an integer number of
lines, then became a template string, then was used for a list of
character positions...
Of course I'm not suggesting that your code was that bad. But rebinding a
name does make code harder to understand.