Iteration over Lists and Strings

D

DeepBleu

I noticed the following:a 0
b 1
b 1
a 0

(expected result:
a 0
b 1
b 2
a 3)

The same with lists:
List = [1, 0 , 1]
for n in List:
print n, List.index(n)
1 0
0 1
1 0

(expected result:
1 0
0 1
1 2)

What is going on? Can someone clarify this to me? And how can I ensure
that the iteration produces the absolute index number **without** doing
something like:
a = 'abba'
k = len(a)
for m in range(0, k):
print a[m], m
a 0
b 1
b 2
a 3

Thanks -
 
A

Andrew Durdin

What is going on? Can someone clarify this to me? And how can I ensure
that the iteration produces the absolute index number **without** doing
something like:
a = 'abba'
k = len(a)
for m in range(0, k):
print a[m], m

The index() method looks up the first index of the given object in the
list/sequence. What you want is to use the enumerate() builtin:
.... print c, i
....
a 0
b 1
b 2
a 3
 
A

Alex Martelli

DeepBleu said:
I noticed the following:
a 0
b 1
b 1
a 0

a.index(n) returns the FIRST index x of a such thatn a[x]==n.
(expected result:
a 0
b 1
b 2
a 3)

Misfounded expectation. The first 'b' and the second 'b' are equal, for
example, so a.index can never possibly return different results for
them.

What is going on? Can someone clarify this to me? And how can I ensure
that the iteration produces the absolute index number **without** doing
something like:
a = 'abba'
k = len(a)
for m in range(0, k):
print a[m], m
a 0
b 1
b 2
a 3

That's what the enumerate built-in is for:

for m, n in enumerate(a):
print n, m


Alex
 
A

Alex Martelli

Michel Claveau - abstraction méta-galactique non triviale en fuite
perpétuelle. said:
and enumerate is more fast than index.

Oh, absolutely. sequence.index(anitem) takes time proportional to
len(sequence), for the average item. If you repeat that operation for
all items in sequence, you end up with total time proportional to the
SQUARE of len(sequence) -- a LOT, for long sequences, enumerate itself
takes constant time, and looping over all items that enumerate yields
takes time proportional to the number of items (costant time per item).

If you're familiar with big-O notation, we're talking O(N) vs O(N
square)... not the kind of performance issue one can ignore, for long
sequences, because the difference in performance keeps going up and up
without bounds as the sequence grows longer.


Alex
 
T

Terry Reedy

DeepBleu said:
(expected result:

When something does not act as expected, look in the manual or try the
builtin help facility, which only takes seconds.
Help on method_descriptor:

index(...)
L.index(value) -> integer -- return index of first occurrence of value

dir(object) will give you a list of attributes and methods. You can look
up new ones the same way. In particular, dir(list) and dir(__builtins__).

Terry J. Reedy
 
B

Brent W. Hughes

Alex Martelli said:
Michel Claveau - abstraction méta-galactique non triviale en fuite


Oh, absolutely. sequence.index(anitem) takes time proportional to
len(sequence), for the average item. If you repeat that operation for
all items in sequence, you end up with total time proportional to the
SQUARE of len(sequence) -- a LOT, for long sequences, enumerate itself
takes constant time, and looping over all items that enumerate yields
takes time proportional to the number of items (costant time per item).

If you're familiar with big-O notation, we're talking O(N) vs O(N
square)... not the kind of performance issue one can ignore, for long
sequences, because the difference in performance keeps going up and up
without bounds as the sequence grows longer.


Alex


Did you say enumerate(seq) takes constant time? I would have thought it was
proportional to len(seq).
 
A

Andrew Durdin

Did you say enumerate(seq) takes constant time? I would have thought it was
proportional to len(seq).

enumerate(seq) is O(1)
index(seq) is O(N)

And thus:

for i, val in enumerate(seq): print i is O(N)
for val in seq: print seq.index(val) is O(N*N)
 
A

Alex Martelli

Brent W. Hughes said:
...
Did you say enumerate(seq) takes constant time? I would have thought it was
proportional to len(seq).

Please look carefully at my quoted text:
the call to enumerate(seq) itself takes constant time
looping over [N items] takes O(N)
In particular, looping over O(len(seq)) items does take O(len(seq)),
whether you call enumerate somewhere or not

The distinction matters, because some loops on enumerate(seq) may be
able to bail out earlier, which may reduce their big-O behavior. A
trivial example:
for i, item in enumerate(seq):
if i >= 3: break
print item
This prints the first few items of seq, but no more than three of them
in any case; and it takes O(1), constant time (assuming that so do
iter(seq)'s next(), and str(item) too) -- no matter how long seq is.


Don't take this for granted, because it was not necessarily true before
Python refined its looping protocol. Consider two recipes for
enumerate that you'll find in the Cookbook's first edition...:

[a]

class enumerate:
def __init__(self, seq):
self.seq = seq
def __getitem__(self, i):
return i, self.seq



def enumerate(seq): return zip(xrange(sys.maxint), seq)

Recipe [a] does satisfy the same big-O performance constraints as
today's enumerate ([a] is less general than today's enumerate, of
course, because at that time the looping protocol did require the
ability to index into the object, so [a] requires that from seq). But
recipe does not -- just calling enumerate in this version is O(N) in
both time and space. ((Nevertheless, was faster than [a] in many
practically important cases, which is why I presented it anyway;-)).


Alex
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,769
Messages
2,569,582
Members
45,057
Latest member
KetoBeezACVGummies

Latest Threads

Top