Strange behaviour with reversed()

S

Steven D'Aprano

I don't understand how reversed() is operating. I've read the description
in the docs:

reversed(seq)
Return a reverse iterator. seq must be an object which supports the
sequence protocol (the __len__() method and the __getitem__() method with
integer arguments starting at 0). New in version 2.4.

and help(reversed) but neither gives any insight to what happens when you
use reversed() on a sequence, then modify the sequence.


This works as I expected:
['c', 'b', 'a']

This suggests that reversed() makes a copy of the list:
['c', 'b', 'a']

This suggests that reversed() uses a reference to the original list:
RL = reversed(L)
L[0] = 'e'
list(RL)
['d', 'c', 'b', 'e']

And these examples suggests that reversed() is confused, or at least
confusing:
RL = reversed(L)
del L[2]
list(RL)
[]
L = list("abc")
RL = reversed(L)
L.insert(0, "d")
L ['d', 'a', 'b', 'c']
list(RL)
['b', 'a', 'd']


Does anyone know what reversed() is actually doing?
 
B

Ben Finney

Steven D'Aprano said:
and help(reversed) but neither gives any insight to what happens
when you use reversed() on a sequence, then modify the sequence.

I would think the answer is the same for any question about modifying
sequences while iterating: "undefined, therefore don't do that".

In other words, if you think you'll be modifying the sequence during
iteration (even if you're iterating indirectly via something like
reversed()), iterate over a copy instead.
 
G

Gabriel Genellina

En Thu, 18 Oct 2007 01:31:13 -0300, Steven D'Aprano
I don't understand how reversed() is operating. I've read the description
in the docs:

reversed(seq)
Return a reverse iterator. seq must be an object which supports the
sequence protocol (the __len__() method and the __getitem__() method with
integer arguments starting at 0). New in version 2.4.

and help(reversed) but neither gives any insight to what happens when you
use reversed() on a sequence, then modify the sequence.

A reversed object is rather simple: it stores the original sequence (a
reference, as usual, not a copy!) and the next index to use, starting at
len-1. Each time the next() method is called, the index is decremented
until it goes below 0.
This is more or less the equivalent Python code:

class Reversed:
def __init__(self, seq):
n = len(seq)
self.index = n-1
self.seq = seq

def __iter__(self):
return self

def next(self):
index = self.index
if index>=0:
try: item = self.seq[index]
except (IndexError,StopIteration): pass
else:
self.index -= 1
return item
self.index = -1
self.seq = None
raise StopIteration

Note that the starting index is determined at creation time, not when the
iteration begins. So, if you create a reversed object over a list
containing 3 elements, the first returned element will be seq[2], then
seq[1], then seq[0]. It doesn't matter if you modify the list after
creating the reversed object but before starting the iteration: it will
start at seq[2] even if it's not the last item, and will silently stop if
seq[2] is not a valid index anymore.
 
A

Andreas Kraemer

I don't understand how reversed() is operating. I've read the description
in the docs:

reversed(seq)
Return a reverse iterator. seq must be an object which supports the
sequence protocol (the __len__() method and the __getitem__() method with
integer arguments starting at 0). New in version 2.4.

and help(reversed) but neither gives any insight to what happens when you
use reversed() on a sequence, then modify the sequence.

This works as I expected:

['c', 'b', 'a']

This suggests that reversed() makes a copy of the list:

['c', 'b', 'a']

This suggests that reversed() uses a reference to the original list:
RL = reversed(L)
L[0] = 'e'
list(RL)

['d', 'c', 'b', 'e']

And these examples suggests that reversed() is confused, or at least
confusing:
RL = reversed(L)
del L[2]
list(RL)
[]
L = list("abc")
RL = reversed(L)
L.insert(0, "d")
L

['d', 'a', 'b', 'c']>>> list(RL)

['b', 'a', 'd']

Does anyone know what reversed() is actually doing?

Without knowing the internals, it seems reversed() does exactly the
same as the following class:


class Reversed(object):
def __init__(self,seq):
self.seq = seq
self.i = len(seq)
def __iter__(self):
return self
def next(self):
self.i -= 1
if self.i < 0:
raise StopIteration
else:
return self.seq[self.i]

so it doesn't copy anything, just book-keeping of indexes. I guess one
would call this kind of object a (special) "view" of the sequence.

Cheers,

Andreas
 
S

Steven D'Aprano

A reversed object is rather simple: it stores the original sequence (a
reference, as usual, not a copy!) and the next index to use, starting at
len-1. Each time the next() method is called, the index is decremented
until it goes below 0.
....

Note that the starting index is determined at creation time, not when
the iteration begins. So, if you create a reversed object over a list
containing 3 elements, the first returned element will be seq[2], then
seq[1], then seq[0]. It doesn't matter if you modify the list after
creating the reversed object but before starting the iteration: it will
start at seq[2] even if it's not the last item, and will silently stop
if seq[2] is not a valid index anymore.

You know, that's probably the *least* intuitive behaviour possible.

For reversed() to iterate over the input as it was at creation time is a
reasonable design choice; for it to iterate over the input as it is at
call time is also a reasonable design choice; but for it to iterate over
some unholy mutant melding of the sequence as-it-was and as-it-is is
simply bizarre. I hope it just fell out of the implementation and wasn't
a deliberate design choice.
 
S

Steven D'Aprano

I would think the answer is the same for any question about modifying
sequences while iterating: "undefined, therefore don't do that".

That's not correct. You can modify sequences while iterating, and the
results are often perfectly predictable (but watch out for off-by-one
errors):
.... print item, "- deleting last item =", L[-1]
.... del L[-1]
....
0 - deleting last item = 4
1 - deleting last item = 3
2 - deleting last item = 2


(BTW, I'm not implying that the above is good practice. Far from it.)

The result of modifying sequences while you iterate over them is often --
not always -- hard to predict, but it is rarely indeterminate or
undefined.

enumerate(), for example, iterates over the input sequence *as it is* at
call time, not creation time. The result of modifying the sequence is
well defined, but not necessarily what you want. For an exercise in
Sorcerer's Apprentice, try this:

L = range(10)
for i, x in enumerate(L):
print i, x
L.append("another!")

Remember, ctrl-C to interrupt Python.


But even if it is undefined in the case of reversed(), there should be
some sort of note in the docs. By analogy with sorted(), reversed() looks
like it should make a copy. And if not, it looks like it should iterate
over the input sequence as it exists when called. In fact, it does a
little of both.

In other words, if you think you'll be modifying the sequence during
iteration (even if you're iterating indirectly via something like
reversed()), iterate over a copy instead.

By analogy with sorted(), reversed() looks like it should be iterating
over a copy. In any case, it isn't clear from the docs that reversed() is
actually a type of view. I like views, it's just that I like to know when
I'm working with one.
 
D

Duncan Booth

Steven D'Aprano said:
Note that the starting index is determined at creation time, not when
the iteration begins. So, if you create a reversed object over a list
containing 3 elements, the first returned element will be seq[2], then
seq[1], then seq[0]. It doesn't matter if you modify the list after
creating the reversed object but before starting the iteration: it will
start at seq[2] even if it's not the last item, and will silently stop
if seq[2] is not a valid index anymore.

You know, that's probably the *least* intuitive behaviour possible.

For reversed() to iterate over the input as it was at creation time is
a
reasonable design choice; for it to iterate over the input as it is at
call time is also a reasonable design choice; but for it to iterate over
some unholy mutant melding of the sequence as-it-was and as-it-is is
simply bizarre. I hope it just fell out of the implementation and wasn't
a deliberate design choice.

You mean that you don't find it intuitive that it iterates through the
indices that existed at creation time and returns the values as they are
now? I'd have said that was the *most* intuitive behaviour.

The only other behaviours I would regard as intuitive for iteration over
a mutating sequence would be to throw an exception either for mutating
the sequence while the iterator exists or for using the iterator after a
mutation.
 
A

Andreas Kraemer

Steven D'Aprano said:
Note that the starting index is determined at creation time, not when
the iteration begins. So, if you create a reversed object over a list
containing 3 elements, the first returned element will be seq[2], then
seq[1], then seq[0]. It doesn't matter if you modify the list after
creating the reversed object but before starting the iteration: it will
start at seq[2] even if it's not the last item, and will silently stop
if seq[2] is not a valid index anymore.
You know, that's probably the *least* intuitive behaviour possible.
For reversed() to iterate over the input as it was at creation time is a
reasonable design choice; for it to iterate over the input as it is at
call time is also a reasonable design choice; but for it to iterate over
some unholy mutant melding of the sequence as-it-was and as-it-is is
simply bizarre. I hope it just fell out of the implementation and wasn't
a deliberate design choice.

You mean that you don't find it intuitive that it iterates through the
indices that existed at creation time and returns the values as they are
now? I'd have said that was the *most* intuitive behaviour.

The only other behaviours I would regard as intuitive for iteration over
a mutating sequence would be to throw an exception either for mutating
the sequence while the iterator exists or for using the iterator after a
mutation.

Maybe it would have been slightly more intuitive if reversed() had
been implemented like this,

def Reversed(seq):
for i in xrange(len(seq)-1,-1,-1):
yield seq

so that the length of the sequence is determined when the iteration
starts, not when the iterator is created?

The unfortunate naming may also have added to the confusion since its
similarity to "sorted" suggests that a copy of the list is returned.
However,
<type 'listreverseiterator'>

and for comparison
<type 'listiterator'>

reveals what it really is.

-Andreas
 
T

Terry Reedy

|I don't understand how reversed() is operating. I've read the description
| in the docs:
|
| reversed(seq)
| Return a reverse iterator. seq must be an object which supports the
| sequence protocol (the __len__() method and the __getitem__() method with
| integer arguments starting at 0). New in version 2.4.

The input specification strongly suggests that rev.next() successively
yields seq[len(seq)-1], ..., seq[0]

The main question is when len(seq) is called -- on creation as it is, or
immediately before the first yield as you appear to expect, and as it would
be in this generator (which does NOT match the actual implementation):

def rev(seq):
n = len(seq)
while n:
n =-1
yield seq[n]

If len(seq) were called repeatedly (after the yield, for instance), then
termination would no longer guaranteed (see below).

I suppose the doc could be augmented with "The iterator is intialized once
with len(sequence) when it is created, rather than when first used or
anytime thereafter." But I wonder whether that would confuse those not
thinking about corner case nuances.

| and help(reversed) but neither gives any insight to what happens when you
| use reversed() on a sequence, then modify the sequence.

The sequence can potentially be modified between all calls to RL.next, and
not just before the first as in your examples.

abcs = list('abc')
for a in reversed(abcs):
print a
abcs.append(a)

The 'reverse' of a changing sequence, especially one changing in length, is
a poorly defined concept.

| >>> L = list("abc")
| >>> RL = reversed(L)
| >>> del L
| >>> list(RL)
| ['c', 'b', 'a']
|
| This suggests that reversed() makes a copy of the list:

Nope. 'del L' merely removes the association between 'L' and the list,
leaving the internal association between RL and the list and hence the list
itself. So the above is consistent with storing a reference (and an index
initialized to len-1).

| >>> L = list("abc")
| >>> RL = reversed(L)
| >>> L.append("d")
| >>> list(RL)
| ['c', 'b', 'a']
|
| This suggests that reversed() uses a reference to the original list:

It suggests that it uses a reference and an index initialized to len-1 when
reversed is called (rather than when RL.next is first called).

| >>> RL = reversed(L)
| >>> L[0] = 'e'
| >>> list(RL)
| ['d', 'c', 'b', 'e']
|
| And these examples suggests that reversed() is confused, or at least
| confusing:

This is completely consist with iterating down via reference and index.

| >>> RL = reversed(L)
| >>> del L[2]
| >>> list(RL)
| []

In the internal loop of list, RL first tries to return L[2]. But that
raises an exception, so RL quits

Terry Jan Reedy
 
D

Dustan

That's not correct. You can modify sequences while iterating, and the
results are often perfectly predictable (but watch out for off-by-one
errors):

Perhaps a better word to use here would have been "undocumented", and
therefore unsafe to use, as it might differ in different versions, or
in different data-types. Dictionaries cannot continue iteration once
they have changed size.
 
D

Duncan Booth

Andreas Kraemer said:
The only other behaviours I would regard as intuitive for iteration over
a mutating sequence would be to throw an exception either for mutating
the sequence while the iterator exists or for using the iterator after a
mutation.

Maybe it would have been slightly more intuitive if reversed() had
been implemented like this,

def Reversed(seq):
for i in xrange(len(seq)-1,-1,-1):
yield seq

so that the length of the sequence is determined when the iteration
starts, not when the iterator is created?


Perhaps, but either way it comes down to "don't modify the sequence while
iterating".
 
A

Andreas Kraemer

Maybe it would have been slightly more intuitive if reversed() had
been implemented like this,
def Reversed(seq):
for i in xrange(len(seq)-1,-1,-1):
yield seq

so that the length of the sequence is determined when the iteration
starts, not when the iterator is created?

Perhaps, but either way it comes down to "don't modify the sequence while
iterating".


Definitely agreed.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,767
Messages
2,569,570
Members
45,045
Latest member
DRCM

Latest Threads

Top