# Iteration over Lists and Strings

Discussion in 'Python' started by DeepBleu, Aug 28, 2004.

1. ### DeepBleuGuest

I noticed the following:
>>a = 'abba'
>>for n in a:
>> print n, a.index(n)

a 0
b 1
b 1
a 0

(expected result:
a 0
b 1
b 2
a 3)

The same with lists:
>>List = [1, 0 , 1]
>>for n in List:
>> print n, List.index(n)

1 0
0 1
1 0

(expected result:
1 0
0 1
1 2)

What is going on? Can someone clarify this to me? And how can I ensure
that the iteration produces the absolute index number **without** doing
something like:
>>a = 'abba'
>>k = len(a)
>>for m in range(0, k):
>> print a[m], m

a 0
b 1
b 2
a 3

Thanks -

DeepBleu, Aug 28, 2004

2. ### Andrew DurdinGuest

On Sat, 28 Aug 2004 03:22:48 GMT, DeepBleu <> wrote:
> What is going on? Can someone clarify this to me? And how can I ensure
> that the iteration produces the absolute index number **without** doing
> something like:
> >>a = 'abba'
> >>k = len(a)
> >>for m in range(0, k):
> >> print a[m], m

The index() method looks up the first index of the given object in the
list/sequence. What you want is to use the enumerate() builtin:

>>>a = 'abba'
>>>for i, c in enumerate(a):

.... print c, i
....
a 0
b 1
b 2
a 3

Andrew Durdin, Aug 28, 2004

3. ### Michel Claveau - abstraction méta-galactique non tGuest

and enumerate is more fast than index.

Michel Claveau - abstraction méta-galactique non t, Aug 28, 2004
4. ### Alex MartelliGuest

DeepBleu <> wrote:

> I noticed the following:
> >>a = 'abba'
> >>for n in a:
> >> print n, a.index(n)

> a 0
> b 1
> b 1
> a 0

a.index(n) returns the FIRST index x of a such thatn a[x]==n.

> (expected result:
> a 0
> b 1
> b 2
> a 3)

Misfounded expectation. The first 'b' and the second 'b' are equal, for
example, so a.index can never possibly return different results for
them.

> What is going on? Can someone clarify this to me? And how can I ensure
> that the iteration produces the absolute index number **without** doing
> something like:
> >>a = 'abba'
> >>k = len(a)
> >>for m in range(0, k):
> >> print a[m], m

> a 0
> b 1
> b 2
> a 3

That's what the enumerate built-in is for:

for m, n in enumerate(a):
print n, m

Alex

Alex Martelli, Aug 28, 2004
5. ### Alex MartelliGuest

Michel Claveau - abstraction méta-galactique non triviale en fuite
perpétuelle. <> wrote:

> and enumerate is more fast than index.

Oh, absolutely. sequence.index(anitem) takes time proportional to
len(sequence), for the average item. If you repeat that operation for
all items in sequence, you end up with total time proportional to the
SQUARE of len(sequence) -- a LOT, for long sequences, enumerate itself
takes constant time, and looping over all items that enumerate yields
takes time proportional to the number of items (costant time per item).

If you're familiar with big-O notation, we're talking O(N) vs O(N
square)... not the kind of performance issue one can ignore, for long
sequences, because the difference in performance keeps going up and up
without bounds as the sequence grows longer.

Alex

Alex Martelli, Aug 28, 2004
6. ### Terry ReedyGuest

"DeepBleu" <> wrote in message
news:cUSXc.56768\$...
> (expected result:

When something does not act as expected, look in the manual or try the
builtin help facility, which only takes seconds.

>>> help(list.index)

Help on method_descriptor:

index(...)
L.index(value) -> integer -- return index of first occurrence of value

dir(object) will give you a list of attributes and methods. You can look
up new ones the same way. In particular, dir(list) and dir(__builtins__).

Terry J. Reedy

Terry Reedy, Aug 28, 2004
7. ### DeepBleuGuest

Thanks all.

DeepBleu, Aug 28, 2004
8. ### Brent W. HughesGuest

"Alex Martelli" <> wrote in message
news:1gj80xs.1cfownoqz4m9N%...
> Michel Claveau - abstraction méta-galactique non triviale en fuite
> perpétuelle. <> wrote:
>
> > and enumerate is more fast than index.

>
> Oh, absolutely. sequence.index(anitem) takes time proportional to
> len(sequence), for the average item. If you repeat that operation for
> all items in sequence, you end up with total time proportional to the
> SQUARE of len(sequence) -- a LOT, for long sequences, enumerate itself
> takes constant time, and looping over all items that enumerate yields
> takes time proportional to the number of items (costant time per item).
>
> If you're familiar with big-O notation, we're talking O(N) vs O(N
> square)... not the kind of performance issue one can ignore, for long
> sequences, because the difference in performance keeps going up and up
> without bounds as the sequence grows longer.
>
>
> Alex

Did you say enumerate(seq) takes constant time? I would have thought it was
proportional to len(seq).

Brent W. Hughes, Aug 28, 2004
9. ### Marc 'BlackJack' RintschGuest

In <5M4Yc.326126\$a24.276550@attbi_s03>, Brent W. Hughes wrote:

> Did you say enumerate(seq) takes constant time? I would have thought it was
> proportional to len(seq).

>>> seq = [1,2,3]
>>> enumerate(seq)

<enumerate object at 0x4039dcec>

I guess creating a enumerate object (generator) takes constant time. ;-)

Ciao,
Marc 'BlackJack' Rintsch

Marc 'BlackJack' Rintsch, Aug 28, 2004
10. ### Andrew DurdinGuest

On Sat, 28 Aug 2004 19:09:53 GMT, Brent W. Hughes
<> wrote:
>
> Did you say enumerate(seq) takes constant time? I would have thought it was
> proportional to len(seq).

enumerate(seq) is O(1)
index(seq) is O(N)

And thus:

for i, val in enumerate(seq): print i is O(N)
for val in seq: print seq.index(val) is O(N*N)

Andrew Durdin, Aug 28, 2004
11. ### Alex MartelliGuest

Brent W. Hughes <> wrote:
...
> > SQUARE of len(sequence) -- a LOT, for long sequences, enumerate itself
> > takes constant time, and looping over all items that enumerate yields
> > takes time proportional to the number of items (costant time per item).

...
> Did you say enumerate(seq) takes constant time? I would have thought it was
> proportional to len(seq).

Please look carefully at my quoted text:
the call to enumerate(seq) itself takes constant time
looping over [N items] takes O(N)
In particular, looping over O(len(seq)) items does take O(len(seq)),
whether you call enumerate somewhere or not

The distinction matters, because some loops on enumerate(seq) may be
able to bail out earlier, which may reduce their big-O behavior. A
trivial example:
for i, item in enumerate(seq):
if i >= 3: break
print item
This prints the first few items of seq, but no more than three of them
in any case; and it takes O(1), constant time (assuming that so do
iter(seq)'s next(), and str(item) too) -- no matter how long seq is.

Don't take this for granted, because it was not necessarily true before
Python refined its looping protocol. Consider two recipes for
enumerate that you'll find in the Cookbook's first edition...:

[a]

class enumerate:
def __init__(self, seq):
self.seq = seq
def __getitem__(self, i):
return i, self.seq

def enumerate(seq): return zip(xrange(sys.maxint), seq)

Recipe [a] does satisfy the same big-O performance constraints as
today's enumerate ([a] is less general than today's enumerate, of
course, because at that time the looping protocol did require the
ability to index into the object, so [a] requires that from seq). But
recipe does not -- just calling enumerate in this version is O(N) in
both time and space. ((Nevertheless, was faster than [a] in many
practically important cases, which is why I presented it anyway;-)).

Alex

Alex Martelli, Aug 30, 2004