yield_all needed in Python

Douglas Alan · Feb 28, 2005

While writing a generator, I was just thinking how Python needs a
"yield_all" statement. With the help of Google, I found a
pre-existing discussion on this from a while back in the Lightweight
Languages mailing list. I'll repost it here in order to improve the
chances of this enhancement actually happening someday. The original
poster from the LL mailing list seems mostly concerned with
algorithmic efficiency, while I'm concerned more about making my
programs shorter and easier to read. The ensuing discussion on the LL
list talks about how yield_all would be somewhat difficult to
implement if you want to get the efficiency gain desired, but I don't
think it would be very difficult to implement if that goal weren't
required, and the goal were limited to just the expressive elegance:

A Problem with Python's 'yield'

* To: LL1 Mailing List <address@hidden>
* Subject: A Problem with Python's 'yield'
* From: Eric Kidd <address@hidden>
* Date: 27 May 2003 11:15:20 -0400
* Organization:
* Sender: address@hidden

I'm going to pick on Python here, but only because the example code will
be short and sweet.

I believe several other implementations of
generators have the same problem.

Python's generator system, used naively, turns an O(N) tree traversal
into an O(N log N) tree traversal:

class Tree:
def __init__(self, value, left=None, right=None):
self.value = value
self.left = left
self.right = right

def in_order(self):
if self.left is not None:
for v in self.left.in_order():
yield v
yield self.value
if self.right is not None:
for v in self.right.in_order():
yield v

t=Tree(2, Tree(1), Tree(3))
for v in yield_bug.t.in_order():
print v

This prints:
1
2
3

Unfortunately, this snippet calls 'yield' 5 times, because the leaf
values must be yielded twice on their way back up the tree.

We can shorten the code--and make it run in O(N) time--by adding a new
keyword to replace the "for v in ...: yield v" pattern:

def in_order(self):
if self.left is not None:
yield_all self.left.in_order():
yield self.value
if self.right is not None:
yield_all self.right.in_order():

Interestingly enough, this allows you define notions such as
"tail-recursive generation", and apply the usual bag of
recursion-optimization techniques.

Cheers,
Eric

|>oug

Andrew Dalke · Mar 1, 2005

While writing a generator, I was just thinking how Python needs a
"yield_all" statement. With the help of Google, I found a pre-existing
discussion on this from a while back in the Lightweight Languages
mailing list. I'll repost it here in order to improve the chances of
this enhancement actually happening someday.

You should also have looked for the responses to that. Tim Peter's
response is available from
http://aspn.activestate.com/ASPN/Mail/Message/624273
as linked from
http://aspn.activestate.com/ASPN/Mail/Message/python-dev/758572
Here is the most relevant parts.

I'm not bothered -- this comes with the territory. If/when
full-fledged coroutines make it in too, people worried about that can
use them instead. Curious fact: I *was* worried about the worst-case
time aspects of "simple generators" in Icon years ago, but in practice
never ever got burned by it. And rewriting stuff to use Icon
co-expressions instead invariably resulted in messier code that ran
significantly slower in virtually all cases, except for the ones I
*contrived* to prove the O() difference.

BTW, Python almost never worries about worst-case behavior, and people
using Python dicts instead of, e.g., balanced trees, get to carry their
shame home with them hours earlier each day <wink> .

Andrew
(e-mail address removed)

Terry Reedy · Mar 1, 2005

Douglas Alan said:
We can shorten the code--and make it run in O(N) time--by adding a
new
keyword to replace the "for v in ...: yield v" pattern:

Maybe. Until you define the semantics of yield_all and at least outline an
implementation, I am not convinced of 'run in o(n) time'. There was once a
several-post discussion of a related idea of having yield somehow,
magically, skip intermediate generators that only yielded value on up,
without tranformation. But it was never clear how to do this practically
without negatively impacting all generators. Cetainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything is

def in_order(self):
if self.left is not None:
yield_all self.left.in_order():
yield self.value
if self.right is not None:
yield_all self.right.in_order():

If and when I write a text-based double-recursion to iteration transformer,
a pseudokeyword might be be an idea for indicating that stacked yields are
identify functions and therefore bypassable.

Terry J. Reedy

Steve Holden · Mar 1, 2005

Terry said:
Maybe. Until you define the semantics of yield_all and at least outline an
implementation, I am not convinced of 'run in o(n) time'. There was once a
several-post discussion of a related idea of having yield somehow,
magically, skip intermediate generators that only yielded value on up,
without tranformation. But it was never clear how to do this practically
without negatively impacting all generators. Cetainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything is
gained except for a few keystrokes. If <yield_all iterator> == <yield
list(i for i in iterator)> then the replacement is a semantic change.

La plus ca change, la plus c'est la meme chose (I trust native French
speakers will excuse the laziness that led to the absence of accent).

This is very reminiscent of discussions several years ago about tail
recursion and how it would be a great thing to optimise the edge cases.
Of course we didn't have generators then, so we couldn't complain about
*their* inefficiencies then.

If and when I write a text-based double-recursion to iteration transformer,
a pseudokeyword might be be an idea for indicating that stacked yields are
identify functions and therefore bypassable.

The key words in the above being "use" and "case", I suspect.

python:-always-something-new-to-bitch-about-ly y'rs - steve

Antoon Pardon · Mar 1, 2005

Op 2005-03-01 said:
La plus ca change, la plus c'est la meme chose (I trust native French
speakers will excuse the laziness that led to the absence of accent).

Small diversion:

You weren't lazy enough because you added words. The idiom AFAIK is:

Plus ça change, plus ça reste la même chose.

You shouldn't add the "la", I think that came from translating
too literally, adding an article to a comparative in french
turns it into a superlative. So instead of writing:

The more it changes, the more is stays the same

You wrote something like:

Most it changes, most it is the same thing.

Douglas Alan · Mar 1, 2005

You should also have looked for the responses to that. Tim Peter's
response is available from

http://aspn.activestate.com/ASPN/Mail/Message/624273

Click to expand...

[...]

Here is the most relevant parts.
[...]

BTW, Python almost never worries about worst-case behavior, and people
using Python dicts instead of, e.g., balanced trees, get to carry their
shame home with them hours earlier each day <wink> .

If you'll reread what I wrote, you'll see that I'm not concerned with
performance, but rather my concern is that I want the syntactic sugar.
I'm tired of writing code that looks like

def foogen(arg1):

def foogen1(arg2):
# Some code here

# Some code here
for e in foogen1(arg3): yield e
# Some code here
for e in foogen1(arg4): yield e
# Some code here
for e in foogen1(arg5): yield e
# Some code here
for e in foogen1(arg6): yield e

when it would be much prettier and easier to read if it looked like:

def foogen(arg1):

def foogen1(arg2):
# Some code here

# Some code here
yield_all foogen1(arg3)
# Some code here
yield_all foogen1(arg4)
# Some code here
yield_all foogen1(arg5)
# Some code here
yield_all foogen1(arg6)

|>oug

Douglas Alan · Mar 1, 2005

Terry Reedy said:
Cetainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything
is gained except for a few keystrokes.

What's gained is making one's code more readable and maintainable,
which is the one of the primary reasons that I use Python.

|>oug

Duncan Booth · Mar 1, 2005

Douglas said:
What's gained is making one's code more readable and maintainable,
which is the one of the primary reasons that I use Python.

On of the reasons why Python is readable is that the core language is
comparatively small. Adding a new reserved word simply to save a few
characters is a difficult choice, and each case has to be judged on its
merits, but it seems to me that in this case the extra syntax is a burden
that would have to be learned by all Python programmers with very little
benefit.

Remember that many generators will want to do slightly more than just yield
from another iterator, and the for loop allows you to put in additional
processing easily whereas 'yield_all' has very limited application e.g.

for tok in tokenstream():
if tok.type != COMMENT:
yield tok

I just scanned a random collection of my Python files: out of 50 yield
statements I found only 3 which could be rewritten using yield_all.

Douglas Alan · Mar 1, 2005

Duncan Booth said:
Douglas Alan wrote:

On of the reasons why Python is readable is that the core language is
comparatively small.

It's not that small anymore. What it *is* is relatively conceptually
simple and readily comprehensible (i.e. "lightweight"), unlike
languages like C++ and Perl.

Adding a new reserved word simply to save a few
characters

It's not to "save a few characters". It's to make it immediately
clear what is happening.

is a difficult choice, and each case has to be judged on its merits,
but it seems to me that in this case the extra syntax is a burden
that would have to be learned by all Python programmers with very
little benefit.

The amount of effort to learn what "yield_all" does compared to the
amount of effort to understand generators in general is so miniscule,
as to be negligible. Besides, by this argument, the standard library
should be kept as small as possible too, since people have to learn
all that stuff in order to understand someone else's code.

Remember that many generators will want to do slightly more than just yield
from another iterator, and the for loop allows you to put in additional
processing easily whereas 'yield_all' has very limited application e.g.

for tok in tokenstream():
if tok.type != COMMENT:
yield tok

I just scanned a random collection of my Python files: out of 50 yield
statements I found only 3 which could be rewritten using yield_all.

For me, it's a matter of providing the ability to implement
subroutines elegantly within generators. Without yield_all, it is not
elegent at all to use subroutines to do some of the yielding, since
the calls to the subroutines are complex, verbose statements, rather
than simple ones.

I vote for the ability to have elegant, readable subroutining,
regardless of how much you in particular would use it.

|>oug

Skip Montanaro · Mar 1, 2005

Doug> def foogen(arg1):

Doug> def foogen1(arg2):
Doug> # Some code here

Doug> # Some code here
Doug> yield_all foogen1(arg3)
Doug> # Some code here
Doug> yield_all foogen1(arg4)
Doug> # Some code here
Doug> yield_all foogen1(arg5)
Doug> # Some code here
Doug> yield_all foogen1(arg6)

If this idea advances I'd rather see extra syntactic sugar introduced to
complement the current yield statement instead of adding a new keyword.
It's a bit clumsy to come up with something that will work syntactically
since the next token following the yield keyword can be any identifier.
You'd thus need another keyword there. Something like:

def foogen(arg1):

def foogen1(arg2):
# Some code here

# Some code here
yield from foogen1(arg3)
# Some code here
yield from foogen1(arg4)
# Some code here
yield from foogen1(arg5)
# Some code here
yield from foogen1(arg6)

It would be nicer if that was

yield all from <something>

but since "all" is a valid identifier that might break existing code, though
maybe the presence of the following "from" can be used to distinguish
these two cases:

yield <expr>

yield all from <expr>

Skip

Jeremy Bowers · Mar 1, 2005

yield <expr>

yield *<expr>

(Mu-hu-ha-ha-ha!)

Francis Girard · Mar 1, 2005

Hi,

You absolutely and definitively have my vote.

When I first learned the generators , I was even wondering if there was
something wrong in what I do when faced with the sub-generators problem you
describe. I was wondering "why am I doing this extra for-loop ? Is there
something wrong ? Can I return the sub-iterator itself and let the final
upper loop do the job ? But no, I absolutely have to 'yield'. What then ?"

Therefore, the suggestion you make, or something similar, would have actually
ease my learning, at least for me.

Regards,

Francis Girard

Le mardi 1 Mars 2005 19:22, Douglas Alan a écrit :

Mike C. Fletcher · Mar 1, 2005

Skip Montanaro wrote:
....

If this idea advances I'd rather see extra syntactic sugar introduced to
complement the current yield statement instead of adding a new keyword.
It's a bit clumsy to come up with something that will work syntactically
since the next token following the yield keyword can be any identifier.
You'd thus need another keyword there. Something like:

I'd agree on *not* introducing a new keyword. I run into this issue
every once in a while, but new keywords for minor syntactic sugar seems
a bit much.

# Some code here
yield from foogen1(arg3)

....

It would be nicer if that was

yield all from <something>

I don't really like the need to look past that (potentially long)
expression to see the effect of the operation. I don't mind the yield
from syntax, it nicely encapsulates the learning of "generators" so that
when you see yield up front you know something generatish is going on.

I'd be fine with:

for yield on foogen1(arg3)

or

for yield from foogen1(arg3)

which goes more toward the idea of being syntactic sugar for a for loop
that yields each value that is produced. Of course, what happens with:

[ for yield from foogen1(arg3) ]

would then have to be defined... that might make it too complex an
change. Oh well.

Have fun all,
Mike

________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com
PyCon is coming...

Steven Bethard · Mar 1, 2005

Mike said:
... it nicely encapsulates the learning of "generators" so that
when you see yield up front you know something generatish is going on.

+1 for "generatish" as VOTW (Vocabulation of the Week). =)

STeVe

David Eppstein · Mar 1, 2005

Cetainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything
is gained except for a few keystrokes.

What's gained is making one's code more readable and maintainable,
which is the one of the primary reasons that I use Python.[/QUOTE]

I don't see a lot of difference in readability and maintainability
between the two versions. And if yield_all is going to expand into the
loop, anyway, I'd prefer to make that obvious by using the for-loop
version, rather than using a keyword and pretending that passing the
iterators on has no overhead.

If we're talking about machinery behind the scenes to shortcut chains of
yield_all's, so that the time to pass items up through the chain is
smaller than it would be in the for-loop case, I'd think that would be a
better reason for a keyword, because it's not something that can be done
very easily without one in the current language. I don't know how to
make such shortcutting machinery faster than logarithmic in the worst
case (taking into account the possibility that multiple generators could
have yield_all's to the same iterator) but I think it could be made
nearly constant time in most situations. On the other hand, I'm not
convinced that this would be needed frequently enough to warrant the
complexity of trying to optimize it.

Adam Przybyla · Mar 1, 2005

Douglas Alan said:
While writing a generator, I was just thinking how Python needs a
"yield_all" statement. With the help of Google, I found a
pre-existing discussion on this from a while back in the Lightweight
Languages mailing list. I'll repost it here in order to improve the
chances of this enhancement actually happening someday. The original
poster from the LL mailing list seems mostly concerned with
algorithmic efficiency, while I'm concerned more about making my
programs shorter and easier to read. The ensuing discussion on the LL
list talks about how yield_all would be somewhat difficult to
implement if you want to get the efficiency gain desired, but I don't
think it would be very difficult to implement if that goal weren't
required, and the goal were limited to just the expressive elegance:

A Problem with Python's 'yield'

... mayby that way:
ython 2.2.3 (#1, Oct 15 2003, 23:33:35)
[GCC 3.3.1 20030930 (Red Hat Linux 3.3.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information..... for i in range(10): yield i
........
0 1 2 3 4 5 6 7 8 9....
0 1 2 3 4 5 6 7 8 9....
0 1 2 3 4 5 6 7 8 9

yield_all=[k for k in x()]
yield_all [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Click to expand...

Click to expand...

Regards
Adam Przybyla

Francis Girard · Mar 1, 2005

Hi,

No, this won't do. What is needed is a way to yield the results of a generator
from inside another generator with having to do a for-yield-loop inside the
outter generator.

Regards,

Francis Girard

Le mardi 1 Mars 2005 22:35, Adam Przybyla a Ã©critÂ :

Â Â Â Â Â Â Â Â ... mayby that way:
ython 2.2.3 (#1, Oct 15 2003, 23:33:35)
[GCC 3.3.1 20030930 (Red Hat Linux 3.3.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

... Â for i in range(10): yield i
...

...
0 1 2 3 4 5 6 7 8 9

...
0 1 2 3 4 5 6 7 8 9

...
0 1 2 3 4 5 6 7 8 9

yield_all=[k for k in x()]
yield_all

Click to expand...

Click to expand...

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

Francis Girard · Mar 1, 2005

Oops. I meant "without having" instead of "with having" which is a syntax
error.

Regards

Le mardi 1 Mars 2005 22:53, Francis Girard a Ã©critÂ :

Douglas Alan · Mar 1, 2005

Francis Girard said:
Therefore, the suggestion you make, or something similar, would have
actually ease my learning, at least for me.

Yes, I agree 100%. Not having something like "yield_all" hurt my
ability to learn to use Python's generators quickly because I figured
that Python had to have something like yield_all. But no matter how
hard I looked, I couldn't find anything about it in the manual.

So the argument that adding a feature makes the language harder to
learn is a specious one. Sometimes an extra feature makes the
language easier to learn.

|>oug

Douglas Alan · Mar 1, 2005

David Eppstein said:
I don't see a lot of difference in readability and maintainability
between the two versions.

Click to expand...

In that case, your brain works nothing like mine.

|>oug

Explaining Implementing a Binary Search Tree.	6	Jun 15, 2008
Python point location of intersect between two lines	0	Feb 28, 2018
Help needed with Windows Service in Python	6	Sep 2, 2010
Changing self: if self is a tree how to set to a different self	6	Jul 10, 2008
Binary tree problem (searching)	2	Apr 4, 2006
HOWTO: Parsing email using Python part2	1	Jul 15, 2011
constructin trees in python	3	Nov 20, 2010
KeyedList	0	Feb 28, 2009

yield_all needed in Python

Douglas Alan

Andrew Dalke

Terry Reedy

Steve Holden

Antoon Pardon

Douglas Alan

Douglas Alan

Duncan Booth

Douglas Alan

Skip Montanaro

Jeremy Bowers

Francis Girard

Mike C. Fletcher

Steven Bethard

David Eppstein

Adam Przybyla

Francis Girard

Francis Girard

Douglas Alan

Douglas Alan

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads