docs on for-loop with no __iter__?


Steven Bethard

Can someone point me to the documentation on what's supposed to happen
when you use the "for x in X:" syntax when X does not have an __iter__
method? I know that the code:
>>> class S:
...     def __len__(self): return 42
...     def __getitem__(self, i): return i
...
>>> for x in S():
...     print x

tries to print all the non-negative integers, starting with 0. (Note
that the __len__ method doesn't stop it at 42.) Obviously, the right
way to do this is with __iter__, but presumably this behavior is
documented somewhere...
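A self-contained sketch of that runaway behavior (modern print syntax, and capped with itertools.islice so it actually terminates):

```python
from itertools import islice

class S:
    def __len__(self):
        return 42

    def __getitem__(self, i):
        return i          # never raises IndexError

# With no __iter__, the for loop falls back to calling __getitem__
# with 0, 1, 2, ...; since IndexError is never raised it never stops,
# and __len__ is ignored entirely.
print(list(islice(S(), 45)))   # 0..44, sailing straight past len() == 42
```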

Steve
 

Andrew Dalke

Steven said:
Can someone point me to the documentation on what's supposed to happen
when you use the "for x in X:" syntax when X does not have an __iter__
method?

You need to raise an IndexError.

... def __len__(self): return 42
... def __getitem__(self, i): return i

Make that say

def __getitem__(self, i):
    if i >= 42: raise IndexError, i
    return i
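Putting the pieces together (modern syntax), the loop now stops exactly where the IndexError says:

```python
class S:
    def __len__(self):
        return 42

    def __getitem__(self, i):
        # IndexError is what signals the end of the sequence
        # to the for loop; __len__ plays no part in it.
        if i >= 42:
            raise IndexError(i)
        return i

items = list(S())                        # legacy __getitem__ iteration
print(len(items), items[0], items[-1])   # 42 0 41
```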

Obviously, the right
way to do this is with __iter__, but presumably this behavior is
documented somewhere...

http://docs.python.org/ref/sequence-types.html

] Note: for loops expect that an IndexError will be
] raised for illegal indexes to allow proper detection
] of the end of the sequence.

That's all I can find in the docs (searched docs.python.org
for __getitem__ and IndexError)

Looking at the language reference from CVS, I found
http://www.python.org/dev/doc/devel/ref/for.html

It states

] The suite is then executed once for each item in
] the sequence, in the order of ascending indices.

That implies the sequence is indexed, yes? But if
the sequence implements __iter__ then there's
possibly no underlying idea of 'index'.

Should this be fixed?

Andrew
(e-mail address removed)
 

Steven Bethard

Andrew Dalke said:
http://docs.python.org/ref/sequence-types.html

] Note: for loops expect that an IndexError will be
] raised for illegal indexes to allow proper detection
] of the end of the sequence.

Thanks, that's what I was looking for. I knew it had to be around there
somewhere. =) Presumably there was a reason not to use len() to determine
the end of the sequence?

Steve
 

Paul McGuire

Looking at the language reference from CVS, I found
http://www.python.org/dev/doc/devel/ref/for.html

It states

] The suite is then executed once for each item in
] the sequence, in the order of ascending indices.

That implies the sequence is indexed, yes? But if
the sequence implements __iter__ then there's
possibly no underlying idea of 'index'.

Should this be fixed?

Andrew
(e-mail address removed)
Section 7.3 (from the link given above) gives the syntax for "for" as:

for_stmt ::= "for" target_list "in" expression_list ":" suite
             ["else" ":" suite]

and then begins describing the component syntax elements as, "The expression
list is evaluated once; it should yield a sequence." This seems to be a bit
dated, since expression_list could also be a generator or iterator.

Additionally, "for" uses an adaptive method to try to simulate an iterator
if no __iter__ method is provided, by successively calling __getitem__ until
IndexError is raised (this error gets silently absorbed within this
pseudo-iterator).

Here is a simple test class: (I also implemented __len__ thinking that it
would be used to limit the calls to __getitem__, but we can see from the
output that it is never called - instead __getitem__ gets called one time
too many, telling the pseudo-iterator that there are no more entries).

class A(object):
    def __init__(self, lst):
        self.list_ = lst[:]

    def __len__(self):
        print self.__class__.__name__ + ".__len__"
        return len(self.list_)

    def __getitem__(self, i):
        print self.__class__.__name__ + ".__getitem__"
        return self.list_[i]

class A_with_iter(A):
    def __iter__(self):
        print self.__class__.__name__ + ".__iter__"
        return iter(self.list_)


for cls in (A, A_with_iter):
    a = cls([1, 2, 3])
    print "iterate over %s" % cls.__name__
    for i in a:
        print i
    print

output:
iterate over A
A.__getitem__
1
A.__getitem__
2
A.__getitem__
3
A.__getitem__

iterate over A_with_iter
A_with_iter.__iter__
1
2
3


Note that this is the basis for the recently reported bugs in email.Message
and cgi.FieldStorage. email.Message does not implement __iter__, and its
__getitem__ method assumes that items will be retrieved like keyed
dictionary lookups, not using integer sequence access. So when __getitem__
calls .lower() on its argument, Python throws an error - you can't call
lower() on an int.
>>> for i in msg:    # msg: an email.Message instance
...     print i
...
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
  File "/usr/lib/python2.3/email/Message.py", line 304, in __getitem__
  File "/usr/lib/python2.3/email/Message.py", line 370, in get
AttributeError: 'int' object has no attribute 'lower'

-- Paul
 

Andrew Dalke

Steven said:
Presumably there was a reason not to use len() to determine
the end of the sequence?

Because that allows iteration over things where
you don't yet know the size. For example, reading
lines from a file. Here's how it could have been
done using Python 1.x

class FileReader:
    def __init__(self, infile):
        self.infile = infile
        self._n = 0

    def next(self):
        line = self.infile.readline()
        if not line:
            return None
        self._n = self._n + 1
        return line

    def __getitem__(self, i):
        assert i == self._n, "forward iteration only"
        x = self.next()
        if x is None:
            raise IndexError, i
        return x

I used the _n to double check that people don't
think it's a random access container.

Support for Python 2.x iterators was easy,

def __iter__(self):
    return iter(self.next, None)
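The two-argument iter(callable, sentinel) used there keeps calling the callable until it returns the sentinel. A tiny self-contained illustration (fake_next is a stand-in for self.next / readline):

```python
values = iter([3, 1, 4, None, 5])

def fake_next():
    # Stand-in for FileReader.next: returns the sentinel (None)
    # when the data runs out, like readline() returning ''.
    return next(values)

# iter(callable, sentinel) calls fake_next() until it returns None,
# then raises StopIteration -- the 5 after the None is never reached.
print(list(iter(fake_next, None)))   # [3, 1, 4]
```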

Andrew
(e-mail address removed)
 

Steven Bethard

Andrew Dalke said:
Because that allows iteration over things where
you don't yet know the size.

I'm trying to imagine a situation where it makes sense to want the x[i]
behavior from __getitem__ but where you don't know the final index. x[i]
suggests random access, so if you don't really have random access, shouldn't
you be defining __iter__ instead? Are there some good examples of classes
that allow the x[i] indexing but don't support random access (e.g. you can
only do x[2] after you do x[1])?

(Obviously your FileReader class was an example of this, but I've assumed this
was just an example of how things *could* have been written, not how they
actually were. Please correct me if I'm wrong!)

Steve
 

Just

Steven Bethard said:
Andrew Dalke said:
Because that allows iteration over things where
you don't yet know the size.

I'm trying to imagine a situation where it makes sense to want the x[i]
behavior from __getitem__ but where you don't know the final index. x[i]
suggests random access, so if you don't really have random access, shouldn't
you be defining __iter__ instead? Are there some good examples of classes
that allow the x[i] indexing but don't support random access (e.g. you can
only do x[2] after you do x[1])?


You're looking at it from a totally wrong direction. The iterator
protocol did not exist prior to Python 2.2, and "__getitem__ iteration"
was the only way to implement custom iteration behavior. Its
limitations were (I think) a major incentive to create something
better. In fact, the "new" iteration protocol is so brilliant, obvious
and simple, that it's hard to see how we ever did without it, or even
why no one invented it earlier...

In other words: you're looking at a legacy protocol.

Just
 

Andrew Dalke

Steven said:
I'm trying to imagine a situation where it makes sense to want the x[i]
behavior from __getitem__ but where you don't know the final index. x[i]
suggests random access, so if you don't really have random access, shouldn't
you be defining __iter__ instead?



The example I gave was for Python 1.x, which didn't support __iter__.
What I showed was how to support basic forward iteration (for use
in a for loop) in the old-days.

I used the assert check precisely so that the suggestion of
random access breaks when someone tries to use it that way.
Are there some good examples of classes
that allow the x[i] indexing but don't support random access (e.g. you can
only do x[2] after you do x[1])?

(Obviously your FileReader class was an example of this, but I've assumed this
was just an example of how things *could* have been written, not how they
actually were. Please correct me if I'm wrong!)


You say "x[i] indexing" but I don't think that's the right
way to frame the question. What I wanted was forward iteration
in Python 1.x. It happened that forward iteration was
implemented only on top of indexing, so I had to hijack the
indexing mechanism to get what I wanted. But I never thought
of it as "x[i] indexing", only "the hack needed to get forward
iteration working correctly."

I wrote several classes using that technique. One was a
way to read records from a file, another to get records
from a database. They worked, in that I could do things like

for record in FastaReader(open(filename)):
    print record.description

The new-style __iter__ is in all ways better.



Andrew
(e-mail address removed)
 

Steven Bethard

Andrew Dalke said:
What I wanted was forward iteration
in Python 1.x. It happened that forward iteration was
implemented only on top of indexing, so I had to hijack the
indexing mechanism to get what I wanted. But I never thought
of it as "x[i] indexing", only "the hack needed to get forward
iteration working correctly."


Good, that reaffirms my intuition that you didn't really want the __getitem__
behavior (e.g. x[i] access) -- it was just the only way to get the __iter__
behavior too.

Would it break old code if the __getitem__ iterator checked for a __len__
method and tried to use it if it was there? It just seems like if you already
know you're creating a sequence type and you have a __len__ and a __getitem__,
then you've already provided all the necessary information for iteration. Why
should you have to define an __iter__ or throw IndexErrors in your __getitem__?

Steve
 

Alex Martelli

Steven Bethard said:
Andrew Dalke said:
What I wanted was forward iteration
in Python 1.x. It happened that forward iteration was
implemented only on top of indexing, so I had to hijack the
indexing mechanism to get what I wanted. But I never thought
of it as "x[i] indexing", only "the hack needed to get forward
iteration working correctly."


Good, that reaffirms my intuition that you didn't really want the __getitem__
behavior (e.g. x[i] access) -- it was just the only way to get the __iter__
behavior too.


Yes, it used to be.

Would it break old code if the __getitem__ iterator checked for a __len__
method and tried to use it if it was there? It just seems like if you already

Yes, it would break old code wherever the old code had a __len__ method
returning a value not congruent with the index value for which
__getitem__ raises IndexError. That's possibly weird old code, but why
should it get broken?

Consider __len__ used to be a popular way to let your instances be
usable in a boolean context -- I believe __nonzero__ was introduced
later. So, take a class which only knows whether it's empty or not: it
could have a __len__ that only returns 0 (==empty) or 1 (==nonempty),
and still allow proper iteration by only raising in __getitem__ when all
items have been iterated on. If loops took account of __len__, suddenly
all that old code would break. Maybe there's much and maybe there's
little, but why break ANY of it?!
know you're creating a sequence type and you have a __len__ and a __getitem__,
then you've already provided all the necessary information for iteration. Why
should you have to define an __iter__ or throw IndexErrors in your
__getitem__?

Because you used to be allowed to return from __len__ a value not
congruent with the index value for which __getitem__ raises IndexError,
and changing that might break old code. As to whether that legacy
protocol was optimal, I think not -- today's __iter__ is clearly better,
simpler and faster. If you have many classes which define __getitem__
and __len__ and want to iterate on them all, make a mixin class:

class MixinLenwiseIterator:
    def __iter__(self):
        return LenwiseIterator(self)

class LenwiseIterator(object):
    def __iter__(self):
        return self
    def __init__(self, seq):
        self.seq = seq
        self.i = 0
    def next(self):
        if self.i >= len(self.seq):
            raise StopIteration
        result = self.seq[self.i]
        self.i += 1
        return result

Just add MixinLenwiseIterator to your sequence classes' bases and be
happy.
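A usage sketch, restating the classes compactly so it runs on its own (Words is a made-up example class, and next is spelled __next__ in modern Python):

```python
class LenwiseIterator(object):
    def __init__(self, seq):
        self.seq = seq
        self.i = 0
    def __iter__(self):
        return self
    def __next__(self):                 # 'next' in Python 2
        if self.i >= len(self.seq):
            raise StopIteration
        result = self.seq[self.i]
        self.i += 1
        return result

class MixinLenwiseIterator:
    def __iter__(self):
        return LenwiseIterator(self)

class Words(MixinLenwiseIterator):      # hypothetical sequence class
    def __init__(self, words):
        self.words = list(words)
    def __len__(self):
        return len(self.words)
    def __getitem__(self, i):
        return self.words[i]

# Iteration is now driven by __len__, not by IndexError.
print(list(Words(["spam", "eggs", "ham"])))   # ['spam', 'eggs', 'ham']
```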


Alex
 

Steven Bethard

Alex Martelli said:
Consider __len__ used to be a popular way to let your instances be
usable in a boolean context -- I believe __nonzero__ was introduced
later. So, take a class which only knows whether it's empty or not: it
could have a __len__ that only returns 0 (==empty) or 1 (==nonempty),
and still allow proper iteration by only raising in __getitem__ when all
items have been iterated on. If loops took account of __len__, suddenly
all that old code would break. Maybe there's much and maybe there's
little, but why break ANY of it?!

Same reasons Python always breaks old code -- ease and clarity of coding.
Hence the 'yield' keyword for generator functions, the change of method
resolution order, etc. Still, you make a good case -- I can imagine a fair
number of classes might run into these bugs (as compared to a very few classes
that would have changed from the introduction of 'yield' or the changed method
resolution order).
Just add MixinLenwiseIterator to your sequence classes' bases and be
happy.

If you read my posts from the beginning, I was clearly never asking for the
workaround -- I was asking for why the protocol was the way it was and why it
hadn't been updated after __iter__ was introduced. Despite a few snide
remarks ;) you did answer my question though, thanks!

Anyone know if the __getitem__ protocol will still be supported in Python 3000?

Steve
 

Alex Martelli

Steven Bethard said:
Same reasons Python always breaks old code -- ease and clarity of coding.

This is done extremely rarely. I have some unmaintained applications
from 1.5.2 days that still run just fine on the rare cases I need them.
(Luckily, they didn't happen to use 'yield').

In this case, if you're going to maintain the old app you should
obviously do that by adding an __iter__, so the issue doesn't arise.
Hence the 'yield' keyword for generator functions, the change of method
resolution order, etc.

MRO didn't change for classic classes, thus unmaintained apps can't be
affected by that. The new yield keyword (and 'as' becoming a kw) are
the kind of changes that could break old unmaintained but good code.
Still, you make a good case -- I can imagine a fair
number of classes might run into these bugs (as compared to a very few classes
that would have changed from the introduction of 'yield' or the changed method
resolution order).

No classic class can possibly have been affected by a change that
specifically didn't affect such classes, so that number is surely 0.
Apart from the new keywords, and occasionally fixing up some sloppiness
that had used to work while not being intended to, I have a hard time
thinking of what breakage you think has been wrought on old code -- we
still have classic classes around, and the DEFAULT, natch, just to avoid
such breakage.

And when there's breakage it's as far as possible obvious and easy to
fix if the old app is even _minimally_ maintained... I've done such
minimal maintenance to a lot of old apps of mine I have around and it's
generally an issue of wanting to take some advantage of new
opportunities, not _having_ to, new keywords apart.

If you read my posts from the beginning, I was clearly never asking for the
workaround -- I was asking for why the protocol was the way it was and why it
hadn't been updated after __iter__ was introduced. Despite a few snide
remarks ;) you did answer my question though, thanks!

You're welcome, and I do think that the second part of the question was
pretty weird -- with all the trouble we go to, to keep backwards
compatibility with most old unmaintained apps, just imagining we'd go
around breaking them to no good purpose seems weird to me. Unless one
_is_ maintaining old code or has the rare problem of needing to support
a wide range of Python versions (exceedingly rare for applications,
although libraries and tools may of course well have it), I can't really
see why these historical quibbles would be of much interest, anyway.
Anyone know if the __getitem__ protocol will still be supported in Python
3000?

Nobody can know for sure, but knowing the design philosophy that's
already declared as being behind Python 3.0, I'd be amazed if the old
protocol wasn't among the most obvious candidates for the axe. Since
3.0 _will_ break a lot of unmaintained old code, there will probably be
2.something releases continuing in parallel with it for a while, btw --
I first heard of that part of the plan at Europython this summer, but it
does seem to make sense.


Alex
 

Steven Bethard

Alex Martelli said:
MRO didn't change for classic classes, thus unmaintained apps can't be
affected by that.

I may be mistaken, but I thought MRO did change for new classes... I read in
http://www.python.org/2.3/mro.html:

"In his post, Samuele showed that the Python 2.2 method resolution order is
not monotonic and he proposed to replace it with the C3 method resolution
order. Guido agreed with his arguments and therefore now Python 2.3 uses C3."

And the docs seemed to indicate that new-style classes were available in
Python 2.2. Did the new-style classes and the C3 MRO actually both come in
2.2?

Anyway, if you read the docs the way I did above, you could imagine that any
new-style class that was written in Python 2.2 could potentially have been
broken by the new MRO in Python 2.3. So the change in MRO would have been a
change that could break old code.
You're welcome, and I do think that the second part of the question was
pretty weird -- with all the trouble we go to, to keep backwards
compatibility with most old unmaintained apps, just imagining we'd go
around breaking them to no good purpose seems weird to me.

Clearly we're talking past each other. When I asked the question, I didn't
know what code would be broken. That's why I asked the question. (Go back
and read some of the thread where I ask questions like "Would it break old
code if the __getitem__ iterator checked for a __len__ method...?" if you
don't believe me.)

If no code would have been broken, I don't see why it would be unreasonable to
use a more intuitive protocol. I understand the history here is probably old
hat to you, but it's not to me, and that's why I was asking. Suggesting that
I want to "go around breaking [apps] to no good purpose" is just being
inflammatory.
I can't really see why these historical quibbles would be of much interest,
anyway.

Well, it's an interesting design decision that resulted from an interesting
set of facilities that were in the language at the time. Of course, if you
were already using Python at the time these decisions were made, you already
knew all about it. I'm a relative newcomer to Python though, and its history
and evolution is still interesting to me.

If there's a better forum to ask questions about Python's history, I'd be glad
if you'd redirect me to it.

Steve
 

Alex Martelli

Steven Bethard said:
I may be mistaken, but I thought MRO did change for new classes... I read in
http://www.python.org/2.3/mro.html:

"In his post, Samuele showed that the Python 2.2 method resolution order is
not monotonic and he proposed to replace it with the C3 method resolution
order. Guido agreed with his arguments and therefore now Python 2.3 uses C3."

And the docs seemed to indicate that new-style classes were available in
Python 2.2. Did the new-style classes and the C3 MRO actually both come in
2.2?

Anyway, if you read the docs the way I did above, you could imagine that any
new-style class that was written in Python 2.2 could potentially have been
broken by the new MRO in Python 2.3. So the change in MRO would have been a
change that could break old code.

Ah, sorry, you're right -- I'm not used to thinking of 2.2 applications as
"old code" yet, but of course there might theoretically be unmaintained
apps around written for 2.2. Bug fixes can indeed "break old code"
which somehow managed to rely on the bug (the "is not monotonic" part in
your quote above being a bug with 2.2's MRO) -- Python is not quite as
fossilized yet as to guarantee bug-to-bug compatibility between
versions.

I just don't see fixing bugs as comparable in any way with other
maintenance activity, based on dropping or modifying existing features
that were never bugs, just because there are now better ways to perform
certain tasks that the existing features existed for.

Clearly we're talking past each other. When I asked the question, I didn't
know what code would be broken. That's why I asked the question. (Go back
and read some of the thread where I ask questions like "Would it break old
code if the __getitem__ iterator checked for a __len__ method...?" if you
don't believe me.)

You asked that question in the post I was answering, yes. Then, when I
responded, you claimed that "Python always breaks old code", which
seemed and still seems like a weird claim to me.
If no code would have been broken, I don't see why it would be unreasonable to
use a more intuitive protocol.

It may seem more intuitive to programmers trained in languages where
exceptions are to be used as little as feasible, but to a
dyed-in-the-wool Pythonista it doesn't. Your proposed protocol would be
more complicated than the historical one, since it would have to cover
both classes that expose __len__ and ones that don't, in different ways,
while the historical one is simpler because it does not need to draw
that distinction, and "simpler is better than complicated" is a very
important design principle in Python.
I understand the history here is probably old
hat to you, but it's not to me, and that's why I was asking. Suggesting that
I want to "go around breaking [apps] to no good purpose" is just being
inflammatory.

My text you're quoting doesn't say you want to break old code: rather,
it's based on your saying that _Python_ (thus, presumably, the python
developers/committers, which is who I mean by "we" in my text you quote
above) "always breaks old code", and I believe THAT text of yours, at
least when read at face value, is the part which is "just being
inflammatory". If you have used other programming platforms in the
past, I think it's reasonable to expect you to perceive that Python, far
from _always_ "breaking old code", goes to pretty great lengths to avoid
doing so, albeit only within the realm of the "reasonable" (an
application which relied on a bug happening to cause a certain specific
behavior was, in a sense, broken at birth...).

At this point, I suspect you didn't mean that "always" in the way I read
it, but I don't think my reading of it was unreasonable. Starting from
the default assumption that miscommunications are most often due to both
speaker and listener, I'm quite willing to take my half share of
responsibility for this one if you're willing to take yours, and we can
call it quits.

Well, it's an interesting design decision that resulted from an interesting
set of facilities that were in the language at the time. Of course, if you
were already using Python at the time these decisions were made, you already
knew all about it. I'm a relative newcomer to Python though, and its history
and evolution is still interesting to me.

If there's a better forum to ask questions about Python's history, I'd be glad
if you'd redirect me to it.

No, I think this is the right forum, and I apologize if I seemed to
indicate otherwise. History is of interest to few people, but, being
one of those few, I should be encouraging such interests, surely not
discouraging them.


Alex
 

Steven Bethard

Alex Martelli said:
At this point, I suspect you didn't mean that "always" in the way I read
it, but I don't think my reading of it was unreasonable. Starting from
the default assumption that miscommunications are most often due to both
speaker and listener, I'm quite willing to take my half share of
responsibility for this one if you're willing to take yours, and we can
call it quits.

Ahh, yeah, I'm sorry, I didn't even realize that reading of "always" was
possible. When I wrote "Same reasons Python always breaks old code" I
intended the reading "For the only reasons that Python ever breaks old code",
but I can see the other reading now... Yup, I'll take my half. =)

Steve
 

Steven Bethard

Alex Martelli said:
It may seem more intuitive to programmers trained in languages where
exceptions are to be used as little as feasible, but to a
dyed-in-the-wool Pythonista it doesn't. Your proposed protocol would be
more complicated than the historical one, since it would have to cover
both classes that expose __len__ and ones that don't, in different ways,
while the historical one is simpler because it does not need to draw
that distinction, and "simpler is better than complicated" is a very
important design principle in Python.

Well, my logic here was something along the lines of:

If you've provided a __len__ method and a __getitem__ that returns items for
each index, then you've already provided all the necessary information for
iteration. If the __getitem__ also has to raise an IndexError when the index
exceeds the length, in some sense, you're duplicating information -- both
__len__ and the IndexError tell you the length of the sequence. So it's not
an exception fear, but a duplication of code fear, which I hope is somewhat
more Pythonic. =)

It would have made describing the protocol somewhat more complex, but it would
have made using the protocol in a class simpler. Moot point of course, since
I'm fully convinced that changing the protocol is infeasible. =)

Steve
 

Just

Steven Bethard said:
It would have made describing the protocol somewhat more complex, but it
would
have made using the protocol in a class simpler. Moot point of course, since
I'm fully convinced that changing the protocol is infeasible. =)

Why on earth would you want to improve a protocol that's only there for
legacy reasons, and has been replaced by something vastly better?

Just
 

Alex Martelli

Steven Bethard said:
If you've provided a __len__ method and a __getitem__ that returns items for
each index, then you've already provided all the necessary information for
iteration. If the __getitem__ also has to raise an IndexError when the index
exceeds the length, in some sense, you're duplicating information -- both
__len__ and the IndexError tell you the length of the sequence. So it's not
an exception fear, but a duplication of code fear, which I hope is somewhat
more Pythonic. =)

It would have made describing the protocol somewhat more complex, but it would
have made using the protocol in a class simpler. Moot point of course, since
I'm fully convinced that changing the protocol is infeasible. =)

We do agree the point is moot, but we deeply disagree on the point
itself.

__getitem__ has to raise IndexError for invalid indices -- that's part
of its job. Now we're not talking just about iteration anymore, but any
kind of indexing. Having the existence of __len__ interfere with how or
whether __getitem__ gets called would just substantially complicate
things, particularly considering that both exist for mappings as well as
for sequences. It may look superficially "convenient" to lighten your
__getitem__'s burden by wishing it would only be called for "good"
indices, but it's really an optical illusion.

As for having __getitem__ sometimes called unconditionally (whether
__len__ is there or not) and sometimes conditionally (when either
__len__ is absent, or, if present, then only for indices that appear to
be correct depending on __len__'s return value) -- this way madness
lies. Attempts to make things "convenient" in this way are behind the
almost inevitable bloating of all languages whose design principles
don't put simplicity high enough on the list. "Special cases are not
special enough to break the rules". When moving to Python from other
languages, this kind of dynamic tension between mere convenience and
conceptual simplicity is quite an important thing to keep in mind, if
one is keen about understanding in depth various aspects of its design
(and you do appear to be just the kind of person who values such
in-depth understanding -- I'm also very much like that, myself) -- which
is why I'm trying to explain my mental model for why that particular
design aspect, even though not all that relevant today, was indeed an
excellent choice (note that I have no bias in the matter -- that
protocol was in Python well before I knew what Python was!-).

It's partly a matter of "look before you leap" versus "easier to ask
forgiveness than permission", a conceptual distinction that IS quite a
hobby-horse of mine (although "practicality beats purity", mind you;-).


Alex
 

Alex Martelli

Just said:
Why on earth would you want to improve a protocol that's only there for
legacy reasons, and has been replaced by something vastly better?

I think he wants to understand more than he actually wants any change.

Sometimes it helps to understand X to consider what alternatives there
could be to X, even though it's not practically feasible to implement
such alternatives. At first I had a reaction quite similar to yours,
but I am currently holding this working hypothesis -- that it's about
understanding more than anything else. And I do believe that
understanding why that now-obsolete protocol was indeed optimal, most
Pythonic, in its time, can enhance one's understanding of what Pythonic
means. Of course, you're unfairly and genetically advantaged in that
understanding, but most of us have to work for it!-)


Alex
 

Steven Bethard

Alex Martelli said:
It's partly a matter of "look before you leap" versus "easier to ask
forgiveness than permission", a conceptual distinction that IS quite a
hobby-horse of mine (although "practicality beats purity", mind you;-).

Well, in either case, someone is looking before they leap. If __len__ is
checked in the protocol, then the protocol (i.e. the Python interpreter) has
to do the looking. Otherwise, the programmer has to do the looking to
determine when to raise the IndexError.

Hmm... Though I guess it kinda depends what you do in your __getitem__...
The example we've been looking at was something like:

class S:
    def __len__(self): return 42
    def __getitem__(self, i):
        if 0 <= i < len(self):
            return i
        raise IndexError, i

So in this case, the programmer has to "look before they leap" (hence the if
statement). But in a more realistic situation, I can see that maybe you could
just "ask forgiveness instead of permission":

class T:
    def __init__(self, data): self.data = data
    def __len__(self): return len(self.data)
    def __getitem__(self, i):
        try:
            return self.data[i]
        except (IndexError, KeyError):
            raise IndexError, i

No look-ahead here -- assume you'll usually get valid indices and catch the
exception if you don't.
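And indeed, in modern syntax, the EAFP version drives the legacy for-loop protocol correctly without any explicit bounds check:

```python
class T:
    def __init__(self, data):
        self.data = data
    def __len__(self):
        return len(self.data)
    def __getitem__(self, i):
        try:
            return self.data[i]
        except (IndexError, KeyError):
            raise IndexError(i)

# The for loop probes indices 0, 1, 2, ... and stops at the re-raised
# IndexError; __len__ is never consulted.
print([x * 10 for x in T([1, 2, 3])])   # [10, 20, 30]
```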
 
