genexp surprise (wart?)


Paul Rubin

I tried to code the Sieve of Eratosthenes with generators:

def sieve_all(n = 100):
    # yield all primes up to n
    stream = iter(xrange(2, n))
    while True:
        p = stream.next()
        yield p
        # filter out all multiples of p from stream
        stream = (q for q in stream if q%p != 0)

# print primes up to 100
print list(sieve_all(100))

but it didn't work. I had to replace

stream = (q for q in stream if q%p != 0)

with

def s1(p):
    return (q for q in stream if q%p != 0)
stream = s1(p)

or alternatively

stream = (lambda p, stream:
              (q for q in stream if q%p != 0))(p, stream)
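A third workaround, not in the original post, is to freeze the current value of p with a default argument. A sketch in Python 3 spellings (next(), range(), the lazy filter()); the name sieve_all2 is made up here, and the try/except is only needed because modern Python turns a stray StopIteration inside a generator into a RuntimeError (PEP 479):

```python
# Hypothetical variant, not from the thread: freeze p with a
# default argument instead of a helper function.
def sieve_all2(n=100):
    stream = iter(range(2, n))
    while True:
        try:
            p = next(stream)
        except StopIteration:
            return
        yield p
        # p=p binds the *current* value of p when the lambda is
        # created, so later rebindings don't affect this filter.
        stream = filter(lambda q, p=p: q % p != 0, stream)

print(list(sieve_all2(30)))
```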

I had thought that genexps worked like that automatically, i.e. the
stuff inside the genexp was in its own scope. If it's not real
obvious what's happening instead, that's a sign that the current
behavior is a wart. (The problem is that p in my first genexp comes
from the outer scope, and changes as the sieve iterates through the
stream)
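The late binding described here can be reproduced in a couple of lines; a minimal sketch in Python 3 syntax:

```python
# The genexp's condition looks up p each time a value is
# requested, not when the genexp is created.
p = 2
g = (q for q in range(10) if q % p != 0)
p = 3  # rebinding p before consuming g changes the filter
print(list(g))  # filtered by 3, not by 2
```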

Oh well.
 

Ben Cartwright

Paul said:
I tried to code the Sieve of Eratosthenes with generators:

def sieve_all(n = 100):
    # yield all primes up to n
    stream = iter(xrange(2, n))
    while True:
        p = stream.next()
        yield p
        # filter out all multiples of p from stream
        stream = (q for q in stream if q%p != 0)

# print primes up to 100
print list(sieve_all(100))

but it didn't work. I had to replace

stream = (q for q in stream if q%p != 0)

with

def s1(p):
    return (q for q in stream if q%p != 0)
stream = s1(p)

or alternatively

stream = (lambda p, stream:
              (q for q in stream if q%p != 0))(p, stream)


You do realize that you're creating a new level of generator nesting
with each iteration of the while loop, right? You will quickly hit the
maximum recursion limit. Try generating the first 1000 primes.
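Ben's recursion-limit point is easy to verify; a sketch in Python 3, where nested() is a made-up helper that stacks trivial generator expressions:

```python
import sys

def nested(depth):
    # Build a chain of `depth` generator expressions
    # over a one-item list.
    it = iter([1])
    for _ in range(depth):
        it = (x for x in it)
    return it

# Creating the chain is cheap; pulling a value has to resume
# every generator frame in the chain, which exhausts the stack.
try:
    next(nested(sys.getrecursionlimit() + 100))
except RecursionError:
    print("hit the recursion limit")
```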

I had thought that genexps worked like that automatically, i.e. the
stuff inside the genexp was in its own scope. If it's not real
obvious what's happening instead, that's a sign that the current
behavior is a wart. (The problem is that p in my first genexp comes
from the outer scope, and changes as the sieve iterates through the
stream)


I don't see how it's a wart. p is accessed (i.e., not set) by the
genexp. Consistent with the function scoping rules in
http://www.python.org/doc/faq/progr...ules-for-local-and-global-variables-in-python
Python treats p in the genexp as a non-local variable.

--Ben
 

Paul Rubin

Ben Cartwright said:
You do realize that you're creating a new level of generator nesting
with each iteration of the while loop, right? You will quickly hit the
maximum recursion limit. Try generating the first 1000 primes.

Yes, I know about the generator nesting, that was the point. The
Sieve of Eraswhatsisname takes linear space by definition. It's
usually done with an array but this version uses nested generators
instead. I didn't bother to test whether they counted against the
recursion limit, so it's informative to know that they do.
I don't see how it's a wart. p is accessed (i.e., not set) by the
genexp. Consistent with the function scoping rules in
http://www.python.org/doc/faq/progr...ules-for-local-and-global-variables-in-python
Python treats p in the genexp as a non-local variable.

Yeah, it's just counterintuitive is all. I guess the natural way to
express this would have been with tail recursion instead of a while
loop.
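That tail-recursive shape is straightforward to write in Python 3 with yield from; a sketch, not code from the thread (the try/except is needed because PEP 479 forbids a stray StopIteration escaping a generator):

```python
def sieve(stream):
    # Each recursive call binds its own p, so the genexp's free
    # variable can no longer be rebound out from under it.
    try:
        p = next(stream)
    except StopIteration:
        return
    yield p
    yield from sieve(q for q in stream if q % p != 0)

print(list(sieve(iter(range(2, 30)))))
```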
 

Paul Du Bois

The generator is in its own scope. For proof, try accessing q outside
the generator.

There are two possibilities. The first is that you don't know what
closures are and are complaining that Python has them. That would be
amusingly ironic, but I'm guessing you do know (if you don't, google
"make_adder" and be enlightened).

The second is that you don't like the late-binding behavior of
generator expressions. PEP 289 has this to say:
After much discussion, it was decided that the first (outermost)
for-expression should be evaluated immediately and that the remaining
expressions be evaluated when the generator is executed.
and:

After exploring many possibilities, a consensus emerged that [...] [for]
more complex applications, full generator definitions are always superior
in terms of being obvious about scope, lifetime, and binding

And as an illustration of that last point, consider:

def sieve_all(n = 100):
    # generate all primes up to n

    def filter_multiples(input, m):
        for q in input:
            if q % m != 0:
                yield q

    stream = iter(xrange(2, n))
    while True:
        p = stream.next()
        yield p
        # this is now a redundant comment.
        # filter out all multiples of p from stream
        stream = filter_multiples(stream, p)
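The PEP rule quoted above, that only the first (outermost) for-expression is evaluated immediately, can be checked directly; a small Python 3 sketch with made-up names (source, divisor):

```python
log = []

def source():
    log.append("source evaluated")
    return [1, 2, 3]

# The outermost iterable runs at creation time...
g = (x for x in source() if x % divisor == 0)
print(log)  # source() has already run

# ...but the condition runs only during iteration, so divisor
# doesn't even need to exist until the genexp is consumed.
divisor = 2
print(list(g))  # [2]
```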


p
 

Paul Rubin

Paul Rubin said:
Yeah, it's just counterintuitive is all. I guess the natural way to
express this would have been with tail recursion instead of a while
loop.

FWIW, here's a listcomp version:

def s2(n=100):
    stream = range(2,n)
    while stream:
        p = stream[0]
        yield p
        stream = [s for s in stream if s%p != 0]

print list(s2(100))

This should have the same space consumption and approx. running time
as the classic sieve. The genexp version actually used O(n/log(n))
space instead of linear space, I think.
 

Michele Simionato

Paul said:
I tried to code the Sieve of Eratosthenes with generators:

def sieve_all(n = 100):
    # yield all primes up to n
    stream = iter(xrange(2, n))
    while True:
        p = stream.next()
        yield p
        # filter out all multiples of p from stream
        stream = (q for q in stream if q%p != 0)

# print primes up to 100
print list(sieve_all(100))

but it didn't work.

This is a known issue with the scope rules in loops, which has bitten
many (including myself; there is a long thread involving me and Jacek
Generowitz debating this to death, which you may find if you google the
newsgroup).

I would be curious to know whether your code would work the way you
expect in Haskell (I know it would for a 'for' loop; I don't know about
'while' loops, as I am completely ignorant of Haskell).

Michele Simionato
 

Paul Rubin

Paul Du Bois said:
The second is that you don't like the late-binding behavior of
generator expressions. PEP 289 has this to say:

Thanks. That PEP is informative. It shows that I stumbled into
something that other people found confusing too, and that alternatives
were considered that turned out not to be better.
 

Mel Wilson

Paul said:
Thanks. That PEP is informative. It shows that I stumbled into
something that other people found confusing too, and that alternatives
were considered that turned out not to be better.

The net effect of the bad version seems to be that you're not getting a
new generator from

stream = (q for q in stream if q%p != 0)

(* disassembly shown below.)

Disassembly seems to show the program getting a pre-compiled
generator expression code block, and binding the new value
of p to it, then getting iter(stream), which is stream. As
Paul Du Bois says, the outer loop is fixed, but the
subsequent if-expression is liable to change.

So 3 passes because 3%2 != 0
4 passes because 4%3 != 0
5 passes because 5%4 != 0
and so on.


Interesting.
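The pass-through effect Mel describes shows up immediately if the original version is run; a Python 3 transcription (the try/except is only there for PEP 479):

```python
def broken_sieve(n=10):
    stream = iter(range(2, n))
    while True:
        try:
            p = next(stream)
        except StopIteration:
            return
        yield p
        # All the stacked filters share the *same* p, so every q
        # gets through: q % p == q % (q - 1) == 1 for q >= 3.
        stream = (q for q in stream if q % p != 0)

print(list(broken_sieve(10)))  # [2, 3, 4, 5, 6, 7, 8, 9]
```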

* Disassembly of stream = ...:

  6      47 LOAD_CLOSURE     0 (p)
         50 LOAD_CONST       2 (<code object <generator expression>
                                at 009D65E0, file "try_eratos.py", line 6>)
         53 MAKE_CLOSURE     0
         56 LOAD_FAST        1 (stream)
         59 GET_ITER
         60 CALL_FUNCTION    1
         63 STORE_FAST       1 (stream)
 
