For-each behavior while modifying a collection

Valentin Zahnd

Hello

For-each does not iterate over all entries of a collection if one
removes elements during the iteration.

Example code:

    def keepByValue(self, key=None, value=[]):
        for row in self.flows:
            if not row[key] in value:
                self.flows.remove(row)

It is clear why it behaves that way. Every time one removes an
element, the length of the collection decreases by one while the
counter of the for-each statement is not adjusted.
The questions are:
1. Why does the interpreter not use a copy of the collection to
iterate over it? Are there performance reasons?
2. Why is the counter for the iteration not modified?

Valentin
 
Steven D'Aprano

It is clear why it behaves that way. Every time one removes an
element, the length of the collection decreases by one while the counter
of the for-each statement is not adjusted. The questions are:
1. Why does the interpreter not use a copy of the collection to iterate
over it? Are there performance reasons?

Of course. Taking a copy of the loop sequence takes time, possibly a
*lot* of time depending on the size of the list, and that is a total
waste of both time and memory if you don't modify the loop sequence. And
Python cannot determine whether or not you modify the sequence. Consider
this:

    data = some_list_of_something
    for item in data:
        func(item)


Does func modify the global variable data? How can you tell? Without
whole-of-program semantic analysis, you cannot tell whether data is
modified or not. Consider this one:

    def func(obj):
        stuff = globals()['DA'.lower() + 'ta']
        eval("stuff.remove(obj)")


Do you expect Python to analyse the code in sufficient detail to realise
that in this case, it needs to make a copy of the loop sequence? I don't.
It is much better to have the basic principle that Python will not make a
copy of anything unless you ask it to. You, the programmer, are in the
best position to realise whether you are modifying the loop sequence and
can decide whether to make a shallow copy or a deep copy.
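
As a minimal sketch of asking for the copy yourself: the two functions below are hypothetical stand-ins for the keepByValue method, and they assume that flows is a plain list of dicts, which the original post does not actually say. The first loops over an explicit shallow copy and removes from the original; the second avoids mutation altogether by building a new list.

```python
def keep_by_value_copy(flows, key, values):
    """Remove unwanted rows in place while looping over a shallow copy."""
    for row in list(flows):          # list(flows) is the explicit shallow copy
        if row[key] not in values:
            flows.remove(row)        # mutates the original, not the copy
    return flows


def keep_by_value_rebuild(flows, key, values):
    """Build a new list instead of mutating the one being traversed."""
    return [row for row in flows if row[key] in values]


# Assumed sample data: each row is a dict.
rows = [{"proto": "tcp"}, {"proto": "udp"}, {"proto": "udp"}, {"proto": "tcp"}]
kept_in_place = keep_by_value_copy(list(rows), "proto", ["tcp"])
kept_rebuilt = keep_by_value_rebuild(rows, "proto", ["tcp"])
```

The rebuild version is usually the simpler choice when nothing else holds a reference to the original list; the copy version is for when the same list object must be updated.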

It is a basic principle in programming that you shouldn't modify objects
that you are traversing over unless you are very, very careful. Given
that, Python does the right thing here.

2. Why is the counter for the iteration not modified?

What counter? There is no counter. You are iterating over an iterator,
not running a C or Pascal "for i := 1 to 20" style loop.
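
A rough sketch of what the for statement does with an iterator, written out by hand with iter() and next(). The function name and the sample data are made up for illustration. For a built-in list, the iterator object itself remembers a position, but that is a detail of the list iterator rather than a counter owned by the for statement, and nothing adjusts it when the list shrinks.

```python
def visit_and_remove(items):
    """Remove each visited item from `items`; return the items actually visited."""
    visited = []
    it = iter(items)                 # what `for item in items` obtains first
    while True:
        try:
            item = next(it)          # what each loop step asks for
        except StopIteration:
            break                    # the for statement ends here
        visited.append(item)
        items.remove(item)           # shrinks the list under the iterator
    return visited


data = [1, 2, 3, 4, 5, 6]
visited = visit_and_remove(data)
# visited is [1, 3, 5]; data is left as [2, 4, 6]
```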

Even if there was a counter, how should it be modified? The code you show
was this:

    def keepByValue(self, key=None, value=[]):
        for row in self.flows:
            if not row[key] in value:
                self.flows.remove(row)


What exactly does the remove() method do? How do you know?

self.flows could be *any object at all*; it won't be known until
runtime. The remove method could do *anything*; that won't be known until
runtime either. Just because you, the programmer, expect that self.flows
will be a list, and that remove() will remove at most one item, doesn't
mean that Python can possibly know that. Perhaps self.flows returns a
subclass of list, and remove() will remove all of the matching items, not
just one. Perhaps it is some other object, and rather than removing
anything, in fact it actually inserts extra items in the middle of the
sequence. (There is no law that says that methods must do what they say
they do.)
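
To make the subclass case concrete, here is a small sketch. GreedyList is a hypothetical class invented for this illustration, not something from the original code; its remove() drops every matching item rather than just the first.

```python
class GreedyList(list):
    """Hypothetical list subclass whose remove() drops *every* matching item."""

    def remove(self, value):
        # Slice assignment keeps the same list object, so a running
        # iterator still sees the change.
        self[:] = [item for item in self if item != value]


flows = GreedyList(["a", "b", "b", "b", "c"])
seen = []
for row in flows:
    seen.append(row)
    if row == "b":
        flows.remove(row)            # removes all three "b" entries at once

# seen is ["a", "b"]; "c" is never visited, and flows is now ["a", "c"]
```

No fixed adjustment of a loop position would be right for both this class and a plain list, since the number of items removed per call differs.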

You are expecting Python to know more about your program than you do.
That is not the case.
 
Dennis Lee Bieber

Even if there was a counter, how should it be modified? The code you show
was this:

    def keepByValue(self, key=None, value=[]):
        for row in self.flows:
            if not row[key] in value:
                self.flows.remove(row)


What exactly does the remove() method do? How do you know?

self.flows could be *any object at all*; it won't be known until
runtime. The remove method could do *anything*; that won't be known until
runtime either. Just because you, the programmer, expect that self.flows
will be a list, and that remove() will remove at most one item, doesn't
mean that Python can possibly know that. Perhaps self.flows returns a
subclass of list, and remove() will remove all of the matching items, not
just one. Perhaps it is some other object, and rather than removing
anything, in fact it actually inserts extra items in the middle of the
sequence. (There is no law that says that methods must do what they say
they do.)

Let's really confuse matters...

Say "self.flows" is something derived from a database cursor/result
set...

Then "self.flows.remove(row)" could: 1) remove the row from the
cursor/result set [iterating over a cursor tends, in my experience, to use
up the items]; 2) execute a query to remove the matching row from the
database itself; 3) do both...
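
A small sketch of the "iterating uses up the items" part, using the standard-library sqlite3 module with an in-memory database. The choice of sqlite3, the table name and the column are all assumptions made for illustration; the thread does not say which database or driver is meant.

```python
import sqlite3

# Assumed setup: an in-memory table standing in for the flows data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE flows (proto TEXT)")
conn.executemany("INSERT INTO flows VALUES (?)", [("tcp",), ("udp",), ("tcp",)])

cursor = conn.execute("SELECT proto FROM flows ORDER BY rowid")
first_pass = [row[0] for row in cursor]    # consumes the result set
second_pass = [row[0] for row in cursor]   # nothing left to iterate over
conn.close()
```

Here the second loop sees no rows at all, even though nothing was removed from the table, so "the collection" and "what the loop can still see" are already two different things.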
 
