The Perils of PyContract (and Generators)

Nick Daly

So, just in case anybody else runs into this strange, strange
happening, I thought I might as well document it somewhere Google can
find it. Contracts for Python[0] and generators don't necessarily play
well together. This mail comes in three parts: first, the example code
that didn't work at all; second, a more in-depth view of the situation
while you mull over the solution; and third, the answer.

The trouble with this example is that even with all the (2 lines of)
boilerplate code required to make PyContract work, the entire example is
all of about a dozen lines long. The lines after the "post:" directive
are executed every time the function returns, and if any one of them is
false, PyContract raises an exception, preventing the calling code from
acting on bad data:

import os

def find_files(name, directory="."):
    """Finds files with the sub-string name in the name.

    post:
        forall(__return__, lambda filename: name in filename)

    """
    for root, dirs, files in os.walk(directory):
        for the_file in files:
            if name in the_file:
                yield the_file

import contract
contract.checkmod(__name__)

That's it. We're just walking the directory and returning the next
matching item in the generator when it's called. However, if we try
executing this simple function in ipy (Interactive Python, not Iron
Python), nothing works as you'd expect:

In a directory containing 4 files:

["one fish", "two fish", "red fish", "blue fish"]
StopIteration: ...


Apparently our generator object is empty whenever it's returned. When
adding a print statement right before the yield, we see:
"one fish"
"two fish"
StopIteration: ...

Why are they printing during the function? Why is everything printing
before the function even returns?? Has my python croaked? (How actors
and serpents could both behave like amphibians is beyond me)

The trouble is that when the yield statement is replaced with a return
statement, everything works exactly as you might expect. It's perfect.
Unit tests don't fail, doctests are happy, and dependent code works
exactly as advertised. When you turn it back into a generator though,
well, generating empty lists for everything isn't helpful.

Getting irritated at it, I eventually just decided to comment out and
remove as many lines as possible (11) without actually breaking the
example, at which point it started working... What?

The problem actually lies in the contract. Generally, PyContract
shouldn't affect the return values or in any way modify the code, which
it doesn't, as long as the function returns a list of values (the way
the code had in fact originally been written). However, the contract
mentioned above is actually quite wrong for a generator. PyContract's
forall function checks every value in the sequence before the result is
handed back to the caller, and it operates on the actual return value,
not a copy. Thus, when forall verifies all the values, it consumes
every value the generator produces, emptying the generator before it's
returned.
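The effect is easy to reproduce without PyContract at all. The sketch below uses a made-up forall_like helper as a stand-in for a forall-style check (it is an assumption for illustration, not PyContract's actual implementation) and a hard-coded fish_names generator in place of the directory walk:

```python
def forall_like(iterable, predicate):
    """Hypothetical stand-in for a forall-style check (not PyContract's code)."""
    return all(predicate(item) for item in iterable)


def fish_names():
    """Yield the four example file names instead of walking a directory."""
    for name in ["one fish", "two fish", "red fish", "blue fish"]:
        yield name


# Checking a list leaves the list untouched.
as_list = list(fish_names())
list_ok = forall_like(as_list, lambda n: "fish" in n)
list_after = list(as_list)

# Checking a generator consumes it: nothing is left for the caller.
gen = fish_names()
gen_ok = forall_like(gen, lambda n: "fish" in n)
gen_after = list(gen)

print(list_ok, len(list_after))  # True 4
print(gen_ok, gen_after)         # True []
```

Both checks pass, but only the list survives being checked.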

Correcting the above example involves doing nothing more than
simplifying the contract:

post:
    name in __return__

So, in conclusion, generators and PyContract's forall() function don't
mix, and PyContract doesn't operate off of a copy of your parameters
unless you explicitly tell it so (I don't think it ever operates off a
copy of your return value).
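If you still want every yielded item checked, one option is to do the checking lazily, as each item goes by, rather than on the generator object as a whole. What follows is only a rough sketch of that idea: check_each, matching and buggy are names made up for this illustration, and none of it is part of PyContract.

```python
import functools


def check_each(predicate):
    """Hypothetical decorator: validate each yielded item as it is produced."""
    def decorate(gen_func):
        @functools.wraps(gen_func)
        def wrapper(*args, **kwargs):
            for item in gen_func(*args, **kwargs):
                # The predicate sees the item plus the original arguments.
                if not predicate(item, *args, **kwargs):
                    raise AssertionError("postcondition failed for %r" % (item,))
                yield item
        return wrapper
    return decorate


@check_each(lambda item, name, *rest: name in item)
def matching(name, candidates):
    """Yield only the candidates that contain name."""
    for candidate in candidates:
        if name in candidate:
            yield candidate


@check_each(lambda item, name, *rest: name in item)
def buggy(name, candidates):
    """Deliberately broken: forgets to filter."""
    for candidate in candidates:
        yield candidate


results = list(matching("fish", ["one fish", "two fish", "red herring"]))

bad = buggy("fish", ["one fish", "red herring"])
first = next(bad)
try:
    next(bad)
    raised = False
except AssertionError:
    raised = True
```

The caller still receives every item, and a bad item raises at the moment it would have been yielded instead of before the generator is handed back.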

Nick

0: http://www.wayforward.net/pycontract/
 
Steven D'Aprano

> The problem actually lies in the contract. Generally, PyContract
> shouldn't affect the return values or in any way modify the code, which
> it doesn't, as long as the function returns a list of values (the way
> the code had in fact originally been written). However, the contract
> mentioned above is actually quite wrong for a generator.

Yes, because you are conflating the items yielded by the generator with
the generator object returned by the generator function "find_files". You
can't look inside the generator object without using up whichever items
you look at.

[...]
> Correcting the above example involves doing nothing more than
> simplifying the contract:
>
> post:
>     name in __return__

That can't be right, not unless PyContract is doing something I don't
expect. I expect that would be the equivalent of:

'fish' in <generator object>

which should fail:
['one fish', 'two fish', 'red fish', 'blue fish']

Of course, I may be mistaking what PyContract is doing.
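For what it's worth, here is how plain Python treats "in" on a generator, leaving PyContract out of it entirely. Membership compares each yielded item for equality with the left-hand side, so a substring never matches a whole file name, and the test consumes items as it goes. The fish_names generator is made up for this illustration.

```python
def fish_names():
    """Yield the four example file names."""
    for name in ["one fish", "two fish", "red fish", "blue fish"]:
        yield name


# A substring is not equal to any whole item, so the test is False,
# and the generator is run to exhaustion while looking.
gen = fish_names()
substring_found = "fish" in gen
leftover_after_miss = list(gen)

# An exact match stops at the matching item; everything up to and
# including it has been consumed.
gen2 = fish_names()
exact_found = "two fish" in gen2
leftover_after_hit = list(gen2)

print(substring_found, leftover_after_miss)  # False []
print(exact_found, leftover_after_hit)       # True ['red fish', 'blue fish']
```

So unless PyContract special-cases generators somehow, a bare "name in __return__" would both evaluate to false and use up the generator.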
 
