Mixing generators and recursion

Gerson Kurz

I stumbled upon a behaviour that would be somewhat nice to have, but
seemingly is not possible (in 2.3 anyway):

import os, stat

def walkem(directory):
    print "Read files in %s" % directory
    for filename in os.listdir(directory):
        pathname = os.path.join(directory, filename)
        if os.path.isdir(pathname):
            print "Before walkem: %s" % pathname
            walkem(pathname)
            print "After walkem: %s" % pathname
        else:
            yield pathname

if __name__ == "__main__":
    for filename in walkem("<your directory here>"):
        print filename

This code will not recurse subdirectories - because the recursive
walkem() is not called (or so is my interpretation). It is of course
easy to rewrite this; I was just wondering if someone can enlighten me
why this happens (the yield documentation doesn't give me any clues),
and whether this might be a future feature?
 
Peter Hansen

Is "directory" an absolute or a relative path?

You don't use any abspath() calls, nor any os.chdir() calls, so I would
think if "directory" is relative the code won't work properly.

-Peter
 
Peter Otten

Gerson said:
I stumbled upon a behaviour that would be somewhat nice to have, but
seemingly is not possible (in 2.3 anyway):

import os, stat

def walkem(directory):
    print "Read files in %s" % directory
    for filename in os.listdir(directory):
        pathname = os.path.join(directory, filename)
        if os.path.isdir(pathname):
            print "Before walkem: %s" % pathname
            walkem(pathname)

You are creating a generator here, but never use it. Change the above line
to

            for subpn in walkem(pathname):
                yield subpn
            print "After walkem: %s" % pathname
        else:
            yield pathname

if __name__ == "__main__":
    for filename in walkem("<your directory here>"):
        print filename

This code will not recurse subdirectories - because the recursive
walkem() is not called (or so is my interpretation). It is of course
easy to rewrite this; I was just wondering if someone can enlighten me
why this happens (the yield documentation doesn't give me any clues),
and whether this might be a future feature?

Calling a generator function just creates the generator. Invoke its next()
method to draw the actual items (the for loop does this implicitly).
Sometimes it may be instructive to have a look at the library code: in your
case os.walk() may be worth investigating since it performs a similar task.

Peter
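
To make the difference between creating a generator and iterating it concrete, here is a minimal sketch. It is written in Python 3 syntax, not the 2.3 used in this thread (there the call is spelled gen.next() rather than next(gen)). The names countdown and log are invented for illustration; log only records when the generator body actually runs.

```python
log = []  # records when the generator body actually executes

def countdown(n):
    """Hypothetical generator: yields n, n-1, ..., 1."""
    log.append("body started")
    while n > 0:
        yield n
        n -= 1

gen = countdown(3)      # creates a generator object; the body has NOT run yet
assert log == []        # nothing was logged, so no code inside countdown ran

first = next(gen)       # the first next() runs the body up to the first yield
assert log == ["body started"]

rest = list(gen)        # list() (like a for loop) keeps calling next() for us
print(first, rest)      # prints: 3 [2, 1]
```

A bare call such as countdown(3) on a line of its own therefore does nothing visible: the generator object is created and thrown away before its body ever starts.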
 
Gerson Kurz

No, it has nothing to do with the pathname. Try this:

def f(nested):
    if not nested:
        f(True)
    else:
        yield 42

for k in f(False): print k

Step through it in the mind:
- f(False) gets called
- it is not nested, so f(True) gets called
- it is nested, so should yield 42 - but it doesn't!
- instead, you get an empty list.
 
Francis Avila

Computers can't read our minds, though....

def f(nested):
    if not nested:

nested is False, so...

        f(True)

....call function f(True), which returns a *generator*...
....which is never bound to anything, so the refcount goes to 0.

    else:
        yield 42

Above never gets executed by anything at all.

Look:

>>> def g():
...     yield 'Hello there!'
...
>>> g.next()
Traceback (most recent call last):
  File "<stdin>", line 1, in ?
    g.next()
AttributeError: 'function' object has no attribute 'next'

See the difference? g is only the function; it is the object returned by
calling g() that is a generator and has a next() method.

What you want is something like:

def f(nested):
    if not nested:
        for v in f(True): yield v
    else:
        yield 42

for k in f(False): print k # 42

I.e., you must explicitly iterate the generator. No 'yield' ever yields
without a next() asking first. (That next() call can be implicit or
explicit, but it's still always there.)
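
On the question of a future feature: Python 3.3 later added `yield from` (PEP 380), which delegates to a sub-generator and replaces the explicit inner for loop. The sketch below applies it to the directory walk from the first post. It assumes Python 3.3 or newer and is not available in 2.3; the sorted() call and the temporary directory tree are additions made only so the example is self-contained and its output is predictable.

```python
import os
import tempfile

def walkem(directory):
    """Yield the path of every non-directory entry below directory."""
    for filename in sorted(os.listdir(directory)):
        pathname = os.path.join(directory, filename)
        if os.path.isdir(pathname):
            # Delegate to the recursive generator. For plain iteration this
            # behaves like: for subpn in walkem(pathname): yield subpn
            yield from walkem(pathname)
        else:
            yield pathname

# Build a small throwaway tree: top/a.txt and top/sub/b.txt
with tempfile.TemporaryDirectory() as top:
    os.mkdir(os.path.join(top, "sub"))
    for rel in ("a.txt", os.path.join("sub", "b.txt")):
        with open(os.path.join(top, rel), "w") as fh:
            fh.write("x")
    # Consume the generator while the temporary tree still exists
    found = [os.path.relpath(p, top) for p in walkem(top)]

print(found)
```

The rule from this thread still holds: something has to iterate the inner generator. `yield from` just states that in one line.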
 
