Recursive generator

B

Ben C

Suppose I have an object containing an array called children. I can
therefore build a tree out of such objects.

I thought it might be useful to have a descendent generator, so I could
write:

for thing in self.genDescendents():
foo(thing)

expecting foo to be called for each descendent in pre-order.

The best I came up with so far is :

def genDescendents(self):
for child in self.children:
yield child
for grandChild in child.genDescendents():
yield grandChild

I think this works OK, but it seems a bit odd. Is there something more
"Pythonic" I should be doing?
 
P

Paul Hankin

Suppose I have an object containing an array called children. I can
therefore build a tree out of such objects.
The best I came up with so far is :

    def genDescendents(self):
        for child in self.children:
            yield child
            for grandChild in child.genDescendents():
                yield grandChild

Looks fine, although I'd include self in the generator because I think
that's more logical, (and spell descendant correctly :).

def genDescendants(self):
yield self
for child in self.children:
for grandchild in child.genDescendants():
yield grandchild


Often generators can be written more concisely with itertools at the
expense of some readability, and that's true here.

from itertools import chain

def genDescendants(self):
return chain([self], *[child.genDescendants()
for child in self.children])
 
B

Ben C

Looks fine, although I'd include self in the generator because I think
that's more logical, (and spell descendant correctly :).

I actually prefer descendent. It may be more American English since it's
closer to Latin. "Descendant" is basically French. But anyway, never
mind :)
def genDescendants(self):
yield self
for child in self.children:
for grandchild in child.genDescendants():
yield grandchild


Often generators can be written more concisely with itertools at the
expense of some readability, and that's true here.

from itertools import chain

def genDescendants(self):
return chain([self], *[child.genDescendants()
for child in self.children])

Thanks for that, I was wondering if there might be something in
itertools to do this.

I think the first version is probably more readable though anyway.
 
P

Paul Rubin

Paul Hankin said:
def genDescendants(self):
return chain([self], *[child.genDescendants()
for child in self.children])

That is scary. It generates an in-memory list the size of the
whole subtree, at every level. Total memory consumption is maybe
even quadratic, depending on the tree shape, but even if it's
only linear, it's way ugly.
 
B

Ben C

Paul Hankin said:
def genDescendants(self):
return chain([self], *[child.genDescendants()
for child in self.children])

That is scary. It generates an in-memory list the size of the
whole subtree, at every level. Total memory consumption is maybe
even quadratic, depending on the tree shape, but even if it's
only linear, it's way ugly.

This would probably be better (in terms of memory if not beauty):

def genDescendants(self):
return chain([self], *(child.genDescendants()
for child in self.children))
 
P

Paul Hankin

Paul Hankin said:
def genDescendants(self):
    return chain([self], *[child.genDescendants()
        for child in self.children])
That is scary.  It generates an in-memory list the size of the
whole subtree, at every level.  Total memory consumption is maybe
even quadratic, depending on the tree shape, but even if it's
only linear, it's way ugly.

This would probably be better (in terms of memory if not beauty):

    def genDescendants(self):
        return chain([self], *(child.genDescendants()
            for child in self.children))

No, it's the same I think. When I wrote the original function, I
forgot that a generator using yield isn't the same as the same
function that returns an equivalent generator, because one is lazy and
the other isn't. To get the lazy behaviour back, you'd have to
write....

def genDescendants(self):
for g in chain([self], *(child.genDescendants()
for child in self.children)):
yield g

But that's really ugly, so it looks like the simple version which
doesn't use itertools is the best.
 
B

Ben C

def genDescendants(self):
    return chain([self], *[child.genDescendants()
        for child in self.children])
That is scary.  It generates an in-memory list the size of the
whole subtree, at every level.  Total memory consumption is maybe
even quadratic, depending on the tree shape, but even if it's
only linear, it's way ugly.

This would probably be better (in terms of memory if not beauty):

    def genDescendants(self):
        return chain([self], *(child.genDescendants()
            for child in self.children))

No, it's the same I think.

Yes I think you're right. The problem is really the *. The generator
comprehension (child.genDescendants() ...) will have to be run to
completion to build the arguments to pass to chain. So no laziness
there.
forgot that a generator using yield isn't the same as the same
function that returns an equivalent generator, because one is lazy and
the other isn't. To get the lazy behaviour back, you'd have to
write....

def genDescendants(self):
for g in chain([self], *(child.genDescendants()
for child in self.children)):
yield g

Is that lazy? It still seems like (child.genDescendants() ...) would
have to be run all the way to the end before chain could be called.

Unless * is lazy. I don't think it is, but I don't know for sure.
 
E

Erich

I think this works OK, but it seems a bit odd. Is there something more
"Pythonic" I should be doing?

I have a similar tree to the one you describe here at work. I have a
visit function that is very similar to yours, but takes function
arguments for doing pre- and/or post- order operations on the node
(code below). It also yeilds the nodes, but in a postorder manner. The
flexibility is quite useful.

Regards,
Erich

def visit(self,prefunc = None, postfunc = None):
if prefunc:
prefunc(self)

for child in self.children:
for y in child.visit(prefunc, postfunc):
yield y

if postfunc:
postfunc(self)

yield self
 
B

Ben C

I have a similar tree to the one you describe here at work. I have a
visit function that is very similar to yours, but takes function
arguments for doing pre- and/or post- order operations on the node
(code below). It also yeilds the nodes, but in a postorder manner. The
flexibility is quite useful.

Yes that's pretty good too, although usually I want either a callback or
to yield a result, but not both, and often a function of a node passed
in to say whether to prune at that point (i.e. not visit further
descendents).

For various other reasons I've now gone to a tree where each node has a
parent, sibling and first child reference. No children array. The
generator yields tuples of (DOWN, node), (RIGHT, node) and (UP, node),
and is not itself recursive-- it uses the parent references instead.

If the caller wants pre-order it just ignores UP visits. If it wants
post-order it ignores DOWN. The distinction between DOWN and RIGHT is
important if the caller wants to push and pop stacks as it walks the
tree.

The generator itself is not so pretty, but the caller can more easily do
everything with it that it would be able to do if it was recursive
itself.

def genDescendents(self, prune = None):
node = self
dir = DOWN
while 1:
if prune and prune(node):
if dir == DOWN: dir = RIGHT
else:
yield dir, node

# Go down if we can, unless we've been there already, else
# right, or as a last resort, up.
if dir != UP:
if node.firstChild:
node = node.firstChild
dir = DOWN
else:
# Back up through leaf nodes-- so we change dir but not
# node. Sort of a U-turn.
dir = UP
elif node.sibling:
node = node.sibling
dir = RIGHT
elif node.parent:
node = node.parent
dir = UP
else: break
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,539
Members
45,024
Latest member
ARDU_PROgrammER

Latest Threads

Top