"Canonical" way of deleting elements from lists

R

Robert Latest

Hello,

From a list of strings I want to delete all empty ones. This works:

while '' in keywords: keywords.remove('')

However, to a long-term C programmer this looks like an awkward way of
accomplishing a simple goal, because the list will have to be re-evaluated
in each iteration. Is there a way to just walk the list once and throw out
unwanted elements as one goes along?

I started programming back when such little things were real performance
issues, so I have some sort of cringe reflex when something looks
inefficient.

robert
 
F

Fredrik Lundh

Robert said:
while '' in keywords: keywords.remove('')

However, to a long-term C programmer this looks like an awkward way of
accomplishing a simple goal, because the list will have to be re-evaluated
in each iteration.

you're using a quadratic algorihm ("in" is a linear search, and remove
has to move everything around for each call), and you're worried about
the time it takes Python to fetch a variable?
> Is there a way to just walk the list once and throw out unwanted
> elements as one goes along?

creating a new list is always almost the right way to do things like
this. in this specific case, filter() or list comprehensions are good
choices:

keywords = filter(None, keywords) # get "true" items only

keywords = [k for k in keywords if k]

also see:

http://effbot.org/zone/python-list.htm#modifying

</F>
 
H

Hrvoje Niksic

Robert Latest said:
From a list of strings I want to delete all empty ones. This works:

while '' in keywords: keywords.remove('')

If you're looking for a quick (no quadratic behavior) and convenient
way to do it, you can do it like this:

keywords = [s for s in keywords if s != '']

But that creates a new list, which might not be wanted for long lists
with few empty elements (or for shared lists). It also iterates over
every list element in a Python loop, which can take some time for long
lists.
Is there a way to just walk the list once and throw out unwanted
elements as one goes along?

I'd do it like this:

i = 0
while 1:
try:
i = keywords.index('', i)
except ValueError:
break
del keywords

Or even:

try:
i = 0
while 1:
i = keywords.index('', i) # throws ValueError and stops the loop
del keywords
except ValueError:
pass

In both cases the search for the empty string is done in efficient C
code, and you only loop in Python over the actual matches.
 
H

Hrvoje Niksic

Hrvoje Niksic said:
If you're looking for a quick (no quadratic behavior) and convenient
way to do it, you can do it like this:

keywords = [s for s in keywords if s != '']

It now occurred to me that a good compromise between convenience and
efficiency that retains the same list is:

keywords[:] = (s for s in keywords if s)
 
R

Robert Latest

Fredrik said:
keywords = filter(None, keywords) # get "true" items only

Makes seinse. BTW, where can I find all methods of the built-in types?
Section 3.6 only talks about strings and mentions the list append() method
only in an example. Am I too stupid to read the manual, or is this an
omission?

robert
 
R

Robert Latest

Hrvoje said:
keywords[:] = (s for s in keywords if s)

Looks good but is so far beyond my own comprehension that I don't dare
include it in my code ;-)

robert
 
B

bearophileHUGS

Robert Latest:
Makes seinse. BTW, where can I find all methods of the built-in types?
Section 3.6 only talks about strings and mentions the list append() method
only in an example. Am I too stupid to read the manual, or is this an
omission?

filter isn't a method of list, it's a free function.

For most situations that solution with filter(None,keywords) is good
enough (but it filters away all false items, so it filters away None,
0, empty things, etc, too).
The following one is an in-place version (it may be faster with
Psyco), but you usually don't need it:


def inplacefilter(pred, alist):
"""inplacefilter(pred, alist): filters the given list like
filter(),
but works inplace, minimizing the used memory. It returns None.
>>> pr = lambda x: x > 2
>>> l = []
>>> inplacefilter(pr, l)
>>> l []
>>> l = [1,2,2]
>>> inplacefilter(pr, l)
>>> l []
>>> l = [3]
>>> inplacefilter(pr, l)
>>> l [3]
>>> l = [1,2,3,1,5,1,6,0]
>>> r = filter(pr, l) # normal filter
>>> r [3, 5, 6]
>>> inplacefilter(pr, l)
>>> r == l
True
"""
slow = 0
for fast, item in enumerate(alist):
if pred(item):
if slow != fast:
alist[slow] = alist[fast]
slow += 1
del alist[slow:]


You may want to avoid the enumerate() too if you use Psyco and you
need max speed.

Bye,
bearophile
 
N

Nick Craig-Wood

Robert Latest said:
Hrvoje said:
keywords[:] = (s for s in keywords if s)

Looks good but is so far beyond my own comprehension that I don't dare
include it in my code ;-)

:) Worth understanding thought I think - here are some hints

keywords[:] = (s for s in keywords if s)

is equivalent to this (but without creating a temporary list)

keywords[:] = list(s for s in keywords if s)

which is equivalent to

keywords[:] = [s for s in keywords if s]

This

keywords[:] = ....

Replaces the contents of the keywords list rather than making a new
list.

Here is a demonstration of the fundamental difference
>>> a=[1,2,3,4]
>>> b=a
>>> a=[5,6,7]
>>> print a, b
[5, 6, 7] [1, 2, 3, 4]
>>> a=[1,2,3,4]
>>> b=a
>>> a[:]=[5,6,7]
>>> print a, b
[5, 6, 7] [5, 6, 7]

Using keywords[:] stops the creation of another temporary list. The
other behaviour may or may not be what you want!
 
F

Fredrik Lundh

Nick said:
Using keywords[:] stops the creation of another temporary list.

in CPython, "list[:] = iter" actually creates a temporary list object on
the inside, in case "iter" isn't already a list or a tuple.

(see the implementation of PySequence_Fast() for details).

</F>
 
S

Sion Arrowsmith

Robert Latest said:
BTW, where can I find all methods of the built-in types?
Section 3.6 only talks about strings and mentions the list append() method
only in an example. Am I too stupid to read the manual, or is this an
omission?

3.6 talks about features common to all "sequence" types. Strings
are discussed specifically in 3.6.1 ("String Methods"). Lists are
similarly discussed in 3.6.4 ("Mutable Sequence Types"). They are
certainly not omitted, although maybe the title of 3.6.4 could be
take a leaf from the Zen and be more explicit.
 
N

Nick Craig-Wood

Fredrik Lundh said:
Nick said:
Using keywords[:] stops the creation of another temporary list.

in CPython, "list[:] = iter" actually creates a temporary list object on
the inside, in case "iter" isn't already a list or a tuple.

(see the implementation of PySequence_Fast() for details).

Ah, OK, thanks for the correction!
 
R

Robert Latest

Sion said:
3.6 talks about features common to all "sequence" types. Strings
are discussed specifically in 3.6.1 ("String Methods"). Lists are
similarly discussed in 3.6.4 ("Mutable Sequence Types").

OK, the latter then. Too stupid. Thanks ;-)

robert
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,744
Messages
2,569,484
Members
44,903
Latest member
orderPeak8CBDGummies

Latest Threads

Top