How to del item of a list in loop?

Reinhold Birkenfeld · Jan 15, 2005

skull said:
Hi everybody, it is my first post in this newsgroup.
I am a newbie for python though I have several years development experience in c++.
recently, I was stumped when I tried to del item of a list when iteration.

here is the wrong way I did:

lst = [1, 2, 3]
for i in lst:
print i
if i == 2:
lst.remove(i)

the result is:

1
2
as you would see, '3' is missing. this problem is caused by 'lst.remove(i)'.
apparently, 'marked-and-sweep' is a solution to deal with this issue.
but I think there SHOULD BE more 'wise' trick. I want to get your help.

Quick solution:

for i in lst[:]

iterates over a copy.

Reinhold

Fredrik Lundh · Jan 15, 2005

skull said:
Hi everybody, it is my first post in this newsgroup.
I am a newbie for python though I have several years development experience in c++.
recently, I was stumped when I tried to del item of a list when iteration.

here is the wrong way I did:

lst = [1, 2, 3]
for i in lst:
print i
if i == 2:
lst.remove(i)

the result is:

1
2
as you would see, '3' is missing. this problem is caused by 'lst.remove(i)'.

the internal loop counter is incremented whether you remove stuff or not, so
when you're moving things around by removing items in the middle, the loop
will skip over items. this is basic for-in usage, and any decent tutorial should
explain this.

to fix this, you can build a new output list, loop backwards, or loop over a
copy. the last is easiest to implement:

lst = [1, 2, 3]
for i in lst[:]:
print i
if i == 2:
lst.remove(i)

but the others may have advantages in certain use cases.

if you want more details, see the language reference:

http://docs.python.org/ref/for.html

(note the warning)

</F>

Reinhold Birkenfeld · Jan 15, 2005

Reinhold said:
skull said:

Hi everybody, it is my first post in this newsgroup.
I am a newbie for python though I have several years development experience in c++.
recently, I was stumped when I tried to del item of a list when iteration.

here is the wrong way I did:

lst = [1, 2, 3]
for i in lst:
print i
if i == 2:
lst.remove(i)

the result is:

1
2
as you would see, '3' is missing. this problem is caused by 'lst.remove(i)'.
apparently, 'marked-and-sweep' is a solution to deal with this issue.
but I think there SHOULD BE more 'wise' trick. I want to get your help.

Click to expand...

Quick solution:

for i in lst[:]

iterates over a copy.

Addition: In Py2.4, I can't find a problem with

for i in reversed(lst)

Any objections?

Reinhold

Mitja · Jan 15, 2005

lst = [1, 2, 3]
for i in lst:
print i
if i == 2:
lst.remove(i)

the result is:

1
2

As others have suggested, you can use a copy of the list.
Alternatively and depending on what you're trying to accomplish (how
complicated it is), lst = [i for i in lst if i!=2] might look better.

Fredrik Lundh · Jan 15, 2005

skull said:
It makes a copy operation!

so? in python, shallow copies are cheap.

here is a faster and 'ugly' solution:

faster? did you try it, or are you combining a C++ mindset with an
urge to do premature optimizations? (hint: it's slower)

if you care about performance, you shouldn't use "remove", btw. building
a new list is more efficient, especially if you use a list comprehension:

lst = [i for i in lst if i != 2]

(if you have 2.4, try replacing [] with () and see what happens)

</F>

Michael Hoffman · Jan 15, 2005

skull said:
but I still have an other thing to worry about coming with this way: does
performance sucks when the list is big enough?
It makes a copy operation!

here is a faster and 'ugly' solution:

lst = [1, 2, 3]
i = 0
while i < len(lst):
if lst == 2:
lst.remove(i)
else:
i += 1

Actually, that is the slowest of the three methods proposed so far for
large lists on my system.

import timeit

def method_copy(lst):
for i in lst[:]:
if i == 2:
lst.remove(i)
return lst

def method_listcomp(lst):
return [i for i in lst if i!=2]

def method_while(lst):
i = 0
while i < len(lst):
if lst == 2:
lst.remove(i)
else:
i += 1
return lst

lsts = dict(three=range(3),
hundred=range(100),
thousand=range(1000),
million=range(1000000))

if __name__ == "__main__":
for lst_name, lst in lsts.iteritems():
print "%s" % lst_name
for method_name in ["method_copy", "method_listcomp", "method_while"]:
stmt = '%s(lsts["%s"])' % (method_name, lst_name)
setup = "from __main__ import %s, lsts" % method_name

times = 3000000 / len(lst) # reduce the number of times you repeat for big lists
print " %s: %s" % (method_name, timeit.Timer(stmt, setup).repeat(3, times))

RESULTS:

Michael@MINIMOO ~
$ python -V && uname -a
Python 2.4
CYGWIN_NT-5.1 MINIMOO 1.5.12(0.116/4/2) 2004-11-10 08:34 i686 unknown unknown Cygwin

Michael@MINIMOO ~
$ python testlist.py
thousand
method_copy: [1.0479998588562012, 1.0080001354217529, 1.0119998455047607]
method_listcomp: [1.5119998455047607, 1.5110001564025879, 1.503000020980835]
method_while: [3.8680000305175781, 3.8680000305175781, 3.872999906539917]
hundred
method_copy: [1.1269998550415039, 1.127000093460083, 1.190000057220459]
method_listcomp: [2.0360000133514404, 2.0480000972747803, 2.0069999694824219]
method_while: [3.5299999713897705, 3.5540001392364502, 3.5130000114440918]
three
method_copy: [6.1210000514984131, 6.1679999828338623, 6.1239998340606689]
method_listcomp: [9.7590000629425049, 9.5309998989105225, 9.5]
method_while: [6.6609997749328613, 6.625, 6.6800000667572021]
million
method_copy: [1.3420000076293945, 1.3029999732971191, 1.3389999866485596]
method_listcomp: [1.5409998893737793, 1.5500001907348633, 1.5289998054504395]
method_while: [3.9619998931884766, 3.9210000038146973, 3.9590001106262207]

Now your "while" method does use less memory, but it is not as fast as the
copy method.

If you want to get really hairy, you can compare the bytecode instructions
for these three methods:

$ python
Python 2.4 (#1, Dec 4 2004, 20:10:33)
[GCC 3.3.3 (cygwin special)] on cygwin
Type "help", "copyright", "credits" or "license" for more information. 13 0 LOAD_CONST 1 (0)
3 STORE_FAST 1 (i)

14 6 SETUP_LOOP 68 (to 77) 12 LOAD_GLOBAL 1 (len)
15 LOAD_FAST 0 (lst)
18 CALL_FUNCTION 1
21 COMPARE_OP 0 (<)
24 JUMP_IF_FALSE 48 (to 75)
27 POP_TOP

15 28 LOAD_FAST 0 (lst)
31 LOAD_FAST 1 (i)
34 BINARY_SUBSCR
35 LOAD_CONST 2 (2)
38 COMPARE_OP 2 (==)
41 JUMP_IF_FALSE 17 (to 61)
44 POP_TOP

16 45 LOAD_FAST 0 (lst)
48 LOAD_ATTR 3 (remove)
51 LOAD_FAST 1 (i)
54 CALL_FUNCTION 1
57 POP_TOP
58 JUMP_ABSOLUTE 9
18 62 LOAD_FAST 1 (i)
65 LOAD_CONST 3 (1)
68 INPLACE_ADD
69 STORE_FAST 1 (i)
72 JUMP_ABSOLUTE 9 76 POP_BLOCK

19 >> 77 LOAD_FAST 0 (lst)
80 RETURN_VALUE 4 0 SETUP_LOOP 45 (to 48)
3 LOAD_FAST 0 (lst)
6 SLICE+0
7 GET_ITER 11 STORE_FAST 1 (i)

5 14 LOAD_FAST 1 (i)
17 LOAD_CONST 1 (2)
20 COMPARE_OP 2 (==)
23 JUMP_IF_FALSE 17 (to 43)
26 POP_TOP

6 27 LOAD_FAST 0 (lst)
30 LOAD_ATTR 2 (remove)
33 LOAD_FAST 1 (i)
36 CALL_FUNCTION 1
39 POP_TOP
40 JUMP_ABSOLUTE 8
7 >> 48 LOAD_FAST 0 (lst)
51 RETURN_VALUE 10 0 BUILD_LIST 0
3 DUP_TOP
4 STORE_FAST 1 (_[1])
7 LOAD_FAST 0 (lst)
10 GET_ITER 14 STORE_FAST 2 (i)
17 LOAD_FAST 2 (i)
20 LOAD_CONST 1 (2)
23 COMPARE_OP 3 (!=)
26 JUMP_IF_FALSE 11 (to 40)
29 POP_TOP
30 LOAD_FAST 1 (_[1])
33 LOAD_FAST 2 (i)
36 LIST_APPEND
37 JUMP_ABSOLUTE 11

>> 40 POP_TOP 41 JUMP_ABSOLUTE 11
>> 44 DELETE_FAST 1 (_[1])

Click to expand...

Click to expand...

47 RETURN_VALUE

For method_while, almost all of the times the list runs through,
you still have to add 1 to i, which is a lot of instructions:

18 62 LOAD_FAST 1 (i)
65 LOAD_CONST 3 (1)
68 INPLACE_ADD
69 STORE_FAST 1 (i)
72 JUMP_ABSOLUTE 9

With the other methods, when you find a false result, all you have to
do is the JUMP_ABSOLUTE. That saves you some time over several
million repetitions.

Well, that was longer than I thought it would be. HTH.

Nick Coghlan · Jan 15, 2005

Reinhold said:
Addition: In Py2.4, I can't find a problem with

for i in reversed(lst)

Some poking around suggests it's fine - and it avoids copying the list. Don't
delete anything earlier in the list than the current element though (which
list.remove() will do quite happily when data is duplicated).

However, there will still be data movement costs when deleting elements from the
middle of the list. In addition, remove() itself has to do a linear search for
the value being removed.

An off-place solution based on a list comprehension is usually going to be your
best performer - it's an O(n) operation, based on the size of the original list.
The in-place mechanisms can turn out to be O(n**2) due to worst-case memory
movement effects (lists don't allow gaps, so deleting items will usually trigger
data movement).

I think this is about the best you can do for an in-place version:
for i, x in enumerate(reversed(lst)):
if x == 2:
del lst[-i]

The effbot's version is still going to be faster though:
lst = [x for x in lst if x != 2]

Cheers,
Nick.

skull · Jan 15, 2005

Hi everybody, it is my first post in this newsgroup.
I am a newbie for python though I have several years development experience in c++.
recently, I was stumped when I tried to del item of a list when iteration.

here is the wrong way I did:

lst = [1, 2, 3]
for i in lst:
print i
if i == 2:
lst.remove(i)

the result is:

1
2
as you would see, '3' is missing. this problem is caused by 'lst.remove(i)'.
apparently, 'marked-and-sweep' is a solution to deal with this issue.
but I think there SHOULD BE more 'wise' trick. I want to get your help.

Thanks in advance.

- skull

John Machin · Jan 16, 2005

Fredrik said:
lst = [i for i in lst if i != 2]

(if you have 2.4, try replacing [] with () and see what happens)

The result is a generator with a name ("lst") that's rather misleading
in the context. Achieving the same result as the list comprehension, by
doing lst = list(i for ... etc etc), appears to be slower.

John Machin · Jan 16, 2005

Nick said:
I think this is about the best you can do for an in-place version:
for i, x in enumerate(reversed(lst)):
if x == 2:
del lst[-i]

Don't think, implement and measure. You may be surprised. Compare these
two for example:

!def method_del_bkwds(lst, x):
! for inx in xrange(len(lst) - 1, -1, -1):
! if lst[inx] == x:
! del lst[inx]
! return lst

!def method_coghlan(lst, x):
! for i, obj in enumerate(reversed(lst)):
! if obj == x:
! del lst[-i]
! return lst

The effbot's version is still going to be faster though:
lst = [x for x in lst if x != 2]

Have you measured this?

skull · Jan 16, 2005

Thank you for your replys.
lst[:] is did a solution, it makes a copy of list specially for iteration and
removes items from the original one.

but I still have an other thing to worry about coming with this way: does
performance sucks when the list is big enough?
It makes a copy operation!

here is a faster and 'ugly' solution:

lst = [1, 2, 3]
i = 0
while i < len(lst):
if lst == 2:
lst.remove(i)
else:
i += 1

Hi everybody, it is my first post in this newsgroup.
I am a newbie for python though I have several years development experience in c++.
recently, I was stumped when I tried to del item of a list when iteration.

here is the wrong way I did:

lst = [1, 2, 3]
for i in lst:
print i
if i == 2:
lst.remove(i)

the result is:

1
2
as you would see, '3' is missing. this problem is caused by 'lst.remove(i)'.
apparently, 'marked-and-sweep' is a solution to deal with this issue.
but I think there SHOULD BE more 'wise' trick. I want to get your help.

Thanks in advance.

- skull

Click to expand...

John Machin · Jan 16, 2005

Michael said:
skull said:

but I still have an other thing to worry about coming with this way: does
performance sucks when the list is big enough?
It makes a copy operation!

here is a faster and 'ugly' solution:

lst = [1, 2, 3]
i = 0
while i < len(lst):
if lst == 2:
lst.remove(i)
else:
i += 1

Click to expand...

Actually, that is the slowest of the three methods proposed so far for
large lists on my system.

Assuming, as have other posters, that the requirement is to remove all
elements whose value is 2: it doesn't work. The result is [2, 3]
instead of the expected [1, 3].

method_while: [3.8680000305175781, 3.8680000305175781,

Click to expand...

3.872999906539917]

Three significant figures is plenty. Showing just the minimum of the
results might be better.

If you want to get really hairy, you can compare the bytecode instructions
for these three methods:

Click to expand...

Timing and bytecode-peeking a broken function are a little "premature".

skull · Jan 16, 2005

Reinhold Birkenfeld said:
Quick solution:

for i in lst[:]

iterates over a copy.

Click to expand...

Addition: In Py2.4, I can't find a problem with

for i in reversed(lst)

Any objections?

Reinhold

I just downloaded py2.4, and made a test using reversed.
it sure be no problem, I thought maybe the reversed holded a copy of list,
and eventually iterated through the copy.
but the truth is not as I thought so:

import sys

class Test:
pass

lst = [Test(),Test(),Test()]

E1: for i in lst[:]:
E2: for i in reversed(lst):
print sys.getrefcount(i)

###################
E1 outputs:
4
4
4

E2 outputs:
3
3
3

It looks that the reversed does not make a copy of list in contrast with lst[:].
so should we regard: reversed is faster than lst[:]?
I do not have any idea about why it is.

- skull

John Machin · Jan 16, 2005

Nick said:
I think this is about the best you can do for an in-place version:
for i, x in enumerate(reversed(lst)):
if x == 2:
del lst[-i]

I think del lst[-i-1] might be functionally better.

Michael Hoffman · Jan 16, 2005

John said:
Three significant figures is plenty. Showing just the minimum of the
results might be better.

It might be, but how much time do you want to spend on getting your
results for a benchmark that will be run once in the "better" format?

Next time you can run the benchmark yourself and it will be in exactly
the format you want. Give me a break.

John Machin · Jan 16, 2005

Michael said:
It might be, but how much time do you want to spend on getting your
results for a benchmark that will be run once in the "better" format?

About the same time as the "worse" format. The Mona Lisa was painted
once. The Taj Mahal was built once.

Next time you can run the benchmark yourself and it will be in exactly
the format you want.

I've done that already. I've taken your code and improved it along the
suggested lines, added timing for first, middle, and last elements,
added several more methods, and added a testing facility as well. Would
you like a copy?

Bengt Richter · Jan 16, 2005

Hi everybody, it is my first post in this newsgroup.
I am a newbie for python though I have several years development experience in c++.
recently, I was stumped when I tried to del item of a list when iteration.

here is the wrong way I did:

lst = [1, 2, 3]
for i in lst:
print i
if i == 2:
lst.remove(i)

the result is:

1
2
as you would see, '3' is missing. this problem is caused by 'lst.remove(i)'.
apparently, 'marked-and-sweep' is a solution to deal with this issue.
but I think there SHOULD BE more 'wise' trick. I want to get your help.

Thanks in advance.

No one seems to have suggested this in-place way yet,
so I'll trot it out once again ;-)

>>> lst = [1, 2, 3]
>>> i = 0
>>> for item in lst:

Click to expand...

Click to expand...

... if item !=2:
... lst = item
... i += 1
...

>>> del lst[i:]
>>> lst

Click to expand...

Click to expand...

Click to expand...

[1, 3]

Regards,
Bengt Richter

John Machin · Jan 16, 2005

Bengt said:
No one seems to have suggested this in-place way yet,
so I'll trot it out once again ;-)

lst = [1, 2, 3]
i = 0
for item in lst:

Click to expand...

Click to expand...

... if item !=2:
... lst = item
... i += 1
...

del lst[i:]
lst

Click to expand...

Click to expand...

[1, 3]

Works, but slowly. Here's another that appears to be the best on large
lists, at least for removing 1 element. It's O(len(list) *
number_to_be_removed).

!def method_try_remove(lst, remove_this):
! try:
! while 1:
! lst.remove(remove_this)
! except:
! pass

Bengt Richter · Jan 16, 2005

Bengt said:
Bengt said:

No one seems to have suggested this in-place way yet,
so I'll trot it out once again ;-)

lst = [1, 2, 3]
i = 0
for item in lst:

Click to expand...

... if item !=2:
... lst = item
... i += 1
...

del lst[i:]
lst

Click to expand...

[1, 3]

Click to expand...

Works, but slowly. Here's another that appears to be the best on large
lists, at least for removing 1 element. It's O(len(list) *
number_to_be_removed).

So my loop above should beat the while below pretty handily if deleting 2 from lst=[2]*10**7 ;-)
I.e., it should be O(len(lst)) period, though obviously lst.remove will be C-implemented,
and there will be a crossover point. Psyco ought to do well with the above, IWT.

!def method_try_remove(lst, remove_this):
! try:
! while 1:
! lst.remove(remove_this)
! except:
! pass

Maybe

Click to expand...

... i = 0
... for lst, item in itertools.izip(lst, lst): i += item!=vrem
... del lst[i:]
...

>>> lst = [1, 2, 3]
>>> foo(lst, 2)
>>> lst

Click to expand...

Click to expand...

Click to expand...

[1, 3]

I wonder if py2.4 generator expression as a souce of slice assignment, i.e.,
(did someone post this one? Someone must have ;-)

lst[:] = (item for item in lst if item!=remove_this)

would do what my loop is doing, in place. If the slice assignment accepts an
iterator, that might do it, if [:] assignment overwrites the old list
until it has to be extended or in this case possibly trimmed at the end.

If so, its hardly worth writing a function, since the generator expression should
have fast local access to its variables.

Regards,
Bengt Richter

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?= · Jan 16, 2005

Fredrik said:
faster? did you try it, or are you combining a C++ mindset with an
urge to do premature optimizations? (hint: it's slower)

Hmm. Given

import time

length = 2**15

L = range(length)

start = time.clock()
for i in L[:]:
if i%2 == 1:
L.remove(i)
print time.clock()-start

L = range(length)
start = time.clock()
i = 0
while i < len(L):
if L%2 == 1:
del L
else:
i += 1
print time.clock()-start

on my system (python 2.4, Linux 2.6, Pentium 4 3.2GHz), I get

4.85
0.29

So using an explicit index is *much* faster (the OP's version
was incorrect, of course, as it used .remove with an index).

if you care about performance, you shouldn't use "remove"

Click to expand...

Yes, but primarily because remove needs to search the entire
list for the item to be removed.

Whether del is more efficient than building a new list
depends on the number of items to be removed.

Regards,
Martin

Average of MultiMode of a list of a list	1	Oct 28, 2022
How to loop in folder through all excel files and all sheets using pandas?	0	Dec 1, 2022
For Loop in List	5	Jan 13, 2013
How does a HEAD pointer end up pointing to the first node in a linked list?	3	Jan 24, 2023
How to put loop result in csv file	1	Jan 3, 2023
Partially completed coding of loop script, need help finishing.	0	Oct 7, 2022
Insert item before each element of a list	22	Oct 8, 2012
Difference between using "let" in a "for" loop	0	Jul 3, 2022

How to del item of a list in loop?

Reinhold Birkenfeld

Fredrik Lundh

Reinhold Birkenfeld

Mitja

Fredrik Lundh

Michael Hoffman

Nick Coghlan

skull

John Machin

John Machin

skull

John Machin

skull

John Machin

Michael Hoffman

John Machin

Bengt Richter

John Machin

Bengt Richter

=?ISO-8859-1?Q?=22Martin_v=2E_L=F6wis=22?=

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads