yield_all needed in Python

D

Douglas Alan

While writing a generator, I was just thinking how Python needs a
"yield_all" statement. With the help of Google, I found a
pre-existing discussion on this from a while back in the Lightweight
Languages mailing list. I'll repost it here in order to improve the
chances of this enhancement actually happening someday. The original
poster from the LL mailing list seems mostly concerned with
algorithmic efficiency, while I'm concerned more about making my
programs shorter and easier to read. The ensuing discussion on the LL
list talks about how yield_all would be somewhat difficult to
implement if you want to get the efficiency gain desired, but I don't
think it would be very difficult to implement if that goal weren't
required, and the goal were limited to just the expressive elegance:

A Problem with Python's 'yield'

* To: LL1 Mailing List <address@hidden>
* Subject: A Problem with Python's 'yield'
* From: Eric Kidd <address@hidden>
* Date: 27 May 2003 11:15:20 -0400
* Organization:
* Sender: address@hidden

I'm going to pick on Python here, but only because the example code will
be short and sweet. :) I believe several other implementations of
generators have the same problem.

Python's generator system, used naively, turns an O(N) tree traversal
into an O(N log N) tree traversal:

class Tree:
def __init__(self, value, left=None, right=None):
self.value = value
self.left = left
self.right = right

def in_order(self):
if self.left is not None:
for v in self.left.in_order():
yield v
yield self.value
if self.right is not None:
for v in self.right.in_order():
yield v

t=Tree(2, Tree(1), Tree(3))
for v in yield_bug.t.in_order():
print v

This prints:
1
2
3

Unfortunately, this snippet calls 'yield' 5 times, because the leaf
values must be yielded twice on their way back up the tree.

We can shorten the code--and make it run in O(N) time--by adding a new
keyword to replace the "for v in ...: yield v" pattern:

def in_order(self):
if self.left is not None:
yield_all self.left.in_order():
yield self.value
if self.right is not None:
yield_all self.right.in_order():

Interestingly enough, this allows you define notions such as
"tail-recursive generation", and apply the usual bag of
recursion-optimization techniques.

Cheers,
Eric

|>oug
 
A

Andrew Dalke

While writing a generator, I was just thinking how Python needs a
"yield_all" statement. With the help of Google, I found a pre-existing
discussion on this from a while back in the Lightweight Languages
mailing list. I'll repost it here in order to improve the chances of
this enhancement actually happening someday.

You should also have looked for the responses to that. Tim Peter's
response is available from
http://aspn.activestate.com/ASPN/Mail/Message/624273
as linked from
http://aspn.activestate.com/ASPN/Mail/Message/python-dev/758572
Here is the most relevant parts.

I'm not bothered -- this comes with the territory. If/when
full-fledged coroutines make it in too, people worried about that can
use them instead. Curious fact: I *was* worried about the worst-case
time aspects of "simple generators" in Icon years ago, but in practice
never ever got burned by it. And rewriting stuff to use Icon
co-expressions instead invariably resulted in messier code that ran
significantly slower in virtually all cases, except for the ones I
*contrived* to prove the O() difference.

BTW, Python almost never worries about worst-case behavior, and people
using Python dicts instead of, e.g., balanced trees, get to carry their
shame home with them hours earlier each day <wink> .

Andrew
(e-mail address removed)
 
T

Terry Reedy

Douglas Alan said:
We can shorten the code--and make it run in O(N) time--by adding a
new
keyword to replace the "for v in ...: yield v" pattern:

Maybe. Until you define the semantics of yield_all and at least outline an
implementation, I am not convinced of 'run in o(n) time'. There was once a
several-post discussion of a related idea of having yield somehow,
magically, skip intermediate generators that only yielded value on up,
without tranformation. But it was never clear how to do this practically
without negatively impacting all generators. Cetainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything is
def in_order(self):
if self.left is not None:
yield_all self.left.in_order():
yield self.value
if self.right is not None:
yield_all self.right.in_order():

If and when I write a text-based double-recursion to iteration transformer,
a pseudokeyword might be be an idea for indicating that stacked yields are
identify functions and therefore bypassable.

Terry J. Reedy
 
S

Steve Holden

Terry said:
Maybe. Until you define the semantics of yield_all and at least outline an
implementation, I am not convinced of 'run in o(n) time'. There was once a
several-post discussion of a related idea of having yield somehow,
magically, skip intermediate generators that only yielded value on up,
without tranformation. But it was never clear how to do this practically
without negatively impacting all generators. Cetainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything is
gained except for a few keystrokes. If <yield_all iterator> == <yield
list(i for i in iterator)> then the replacement is a semantic change.
La plus ca change, la plus c'est la meme chose (I trust native French
speakers will excuse the laziness that led to the absence of accent).

This is very reminiscent of discussions several years ago about tail
recursion and how it would be a great thing to optimise the edge cases.
Of course we didn't have generators then, so we couldn't complain about
*their* inefficiencies then.
If and when I write a text-based double-recursion to iteration transformer,
a pseudokeyword might be be an idea for indicating that stacked yields are
identify functions and therefore bypassable.
The key words in the above being "use" and "case", I suspect.

python:-always-something-new-to-bitch-about-ly y'rs - steve
 
A

Antoon Pardon

Op 2005-03-01 said:
La plus ca change, la plus c'est la meme chose (I trust native French
speakers will excuse the laziness that led to the absence of accent).


Small diversion:

You weren't lazy enough because you added words. The idiom AFAIK is:

Plus ça change, plus ça reste la même chose.

You shouldn't add the "la", I think that came from translating
too literally, adding an article to a comparative in french
turns it into a superlative. So instead of writing:

The more it changes, the more is stays the same

You wrote something like:

Most it changes, most it is the same thing.
 
D

Douglas Alan

You should also have looked for the responses to that. Tim Peter's
response is available from

[...]

Here is the most relevant parts.
[...]

BTW, Python almost never worries about worst-case behavior, and people
using Python dicts instead of, e.g., balanced trees, get to carry their
shame home with them hours earlier each day <wink> .

If you'll reread what I wrote, you'll see that I'm not concerned with
performance, but rather my concern is that I want the syntactic sugar.
I'm tired of writing code that looks like

def foogen(arg1):

def foogen1(arg2):
# Some code here

# Some code here
for e in foogen1(arg3): yield e
# Some code here
for e in foogen1(arg4): yield e
# Some code here
for e in foogen1(arg5): yield e
# Some code here
for e in foogen1(arg6): yield e

when it would be much prettier and easier to read if it looked like:

def foogen(arg1):

def foogen1(arg2):
# Some code here

# Some code here
yield_all foogen1(arg3)
# Some code here
yield_all foogen1(arg4)
# Some code here
yield_all foogen1(arg5)
# Some code here
yield_all foogen1(arg6)

|>oug
 
D

Douglas Alan

Terry Reedy said:
Cetainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything
is gained except for a few keystrokes.

What's gained is making one's code more readable and maintainable,
which is the one of the primary reasons that I use Python.

|>oug
 
D

Duncan Booth

Douglas said:
What's gained is making one's code more readable and maintainable,
which is the one of the primary reasons that I use Python.

On of the reasons why Python is readable is that the core language is
comparatively small. Adding a new reserved word simply to save a few
characters is a difficult choice, and each case has to be judged on its
merits, but it seems to me that in this case the extra syntax is a burden
that would have to be learned by all Python programmers with very little
benefit.

Remember that many generators will want to do slightly more than just yield
from another iterator, and the for loop allows you to put in additional
processing easily whereas 'yield_all' has very limited application e.g.

for tok in tokenstream():
if tok.type != COMMENT:
yield tok

I just scanned a random collection of my Python files: out of 50 yield
statements I found only 3 which could be rewritten using yield_all.
 
D

Douglas Alan

Duncan Booth said:
Douglas Alan wrote:
On of the reasons why Python is readable is that the core language is
comparatively small.

It's not that small anymore. What it *is* is relatively conceptually
simple and readily comprehensible (i.e. "lightweight"), unlike
languages like C++ and Perl.
Adding a new reserved word simply to save a few
characters

It's not to "save a few characters". It's to make it immediately
clear what is happening.
is a difficult choice, and each case has to be judged on its merits,
but it seems to me that in this case the extra syntax is a burden
that would have to be learned by all Python programmers with very
little benefit.

The amount of effort to learn what "yield_all" does compared to the
amount of effort to understand generators in general is so miniscule,
as to be negligible. Besides, by this argument, the standard library
should be kept as small as possible too, since people have to learn
all that stuff in order to understand someone else's code.
Remember that many generators will want to do slightly more than just yield
from another iterator, and the for loop allows you to put in additional
processing easily whereas 'yield_all' has very limited application e.g.
for tok in tokenstream():
if tok.type != COMMENT:
yield tok
I just scanned a random collection of my Python files: out of 50 yield
statements I found only 3 which could be rewritten using yield_all.

For me, it's a matter of providing the ability to implement
subroutines elegantly within generators. Without yield_all, it is not
elegent at all to use subroutines to do some of the yielding, since
the calls to the subroutines are complex, verbose statements, rather
than simple ones.

I vote for the ability to have elegant, readable subroutining,
regardless of how much you in particular would use it.

|>oug
 
S

Skip Montanaro

Doug> def foogen(arg1):

Doug> def foogen1(arg2):
Doug> # Some code here

Doug> # Some code here
Doug> yield_all foogen1(arg3)
Doug> # Some code here
Doug> yield_all foogen1(arg4)
Doug> # Some code here
Doug> yield_all foogen1(arg5)
Doug> # Some code here
Doug> yield_all foogen1(arg6)

If this idea advances I'd rather see extra syntactic sugar introduced to
complement the current yield statement instead of adding a new keyword.
It's a bit clumsy to come up with something that will work syntactically
since the next token following the yield keyword can be any identifier.
You'd thus need another keyword there. Something like:

def foogen(arg1):

def foogen1(arg2):
# Some code here

# Some code here
yield from foogen1(arg3)
# Some code here
yield from foogen1(arg4)
# Some code here
yield from foogen1(arg5)
# Some code here
yield from foogen1(arg6)

It would be nicer if that was

yield all from <something>

but since "all" is a valid identifier that might break existing code, though
maybe the presence of the following "from" can be used to distinguish
these two cases:

yield <expr>

yield all from <expr>

Skip
 
F

Francis Girard

Hi,

You absolutely and definitively have my vote.

When I first learned the generators , I was even wondering if there was
something wrong in what I do when faced with the sub-generators problem you
describe. I was wondering "why am I doing this extra for-loop ? Is there
something wrong ? Can I return the sub-iterator itself and let the final
upper loop do the job ? But no, I absolutely have to 'yield'. What then ?"

Therefore, the suggestion you make, or something similar, would have actually
ease my learning, at least for me.

Regards,

Francis Girard

Le mardi 1 Mars 2005 19:22, Douglas Alan a écrit :
 
M

Mike C. Fletcher

Skip Montanaro wrote:
....
If this idea advances I'd rather see extra syntactic sugar introduced to
complement the current yield statement instead of adding a new keyword.
It's a bit clumsy to come up with something that will work syntactically
since the next token following the yield keyword can be any identifier.
You'd thus need another keyword there. Something like:
I'd agree on *not* introducing a new keyword. I run into this issue
every once in a while, but new keywords for minor syntactic sugar seems
a bit much.
# Some code here
yield from foogen1(arg3)

....

It would be nicer if that was

yield all from <something>
I don't really like the need to look past that (potentially long)
expression to see the effect of the operation. I don't mind the yield
from syntax, it nicely encapsulates the learning of "generators" so that
when you see yield up front you know something generatish is going on.

I'd be fine with:

for yield on foogen1(arg3)

or

for yield from foogen1(arg3)

which goes more toward the idea of being syntactic sugar for a for loop
that yields each value that is produced. Of course, what happens with:

[ for yield from foogen1(arg3) ]

would then have to be defined... that might make it too complex an
change. Oh well.

Have fun all,
Mike

________________________________________________
Mike C. Fletcher
Designer, VR Plumber, Coder
http://www.vrplumber.com
http://blog.vrplumber.com
PyCon is coming...
 
S

Steven Bethard

Mike said:
... it nicely encapsulates the learning of "generators" so that
when you see yield up front you know something generatish is going on.

+1 for "generatish" as VOTW (Vocabulation of the Week). =)

STeVe
 
D

David Eppstein

Cetainly, if <yield_all
iterator> == <for i in iterator: yield i>, I don't see how anything
is gained except for a few keystrokes.

What's gained is making one's code more readable and maintainable,
which is the one of the primary reasons that I use Python.[/QUOTE]

I don't see a lot of difference in readability and maintainability
between the two versions. And if yield_all is going to expand into the
loop, anyway, I'd prefer to make that obvious by using the for-loop
version, rather than using a keyword and pretending that passing the
iterators on has no overhead.

If we're talking about machinery behind the scenes to shortcut chains of
yield_all's, so that the time to pass items up through the chain is
smaller than it would be in the for-loop case, I'd think that would be a
better reason for a keyword, because it's not something that can be done
very easily without one in the current language. I don't know how to
make such shortcutting machinery faster than logarithmic in the worst
case (taking into account the possibility that multiple generators could
have yield_all's to the same iterator) but I think it could be made
nearly constant time in most situations. On the other hand, I'm not
convinced that this would be needed frequently enough to warrant the
complexity of trying to optimize it.
 
A

Adam Przybyla

Douglas Alan said:
While writing a generator, I was just thinking how Python needs a
"yield_all" statement. With the help of Google, I found a
pre-existing discussion on this from a while back in the Lightweight
Languages mailing list. I'll repost it here in order to improve the
chances of this enhancement actually happening someday. The original
poster from the LL mailing list seems mostly concerned with
algorithmic efficiency, while I'm concerned more about making my
programs shorter and easier to read. The ensuing discussion on the LL
list talks about how yield_all would be somewhat difficult to
implement if you want to get the efficiency gain desired, but I don't
think it would be very difficult to implement if that goal weren't
required, and the goal were limited to just the expressive elegance:

A Problem with Python's 'yield'
... mayby that way:
ython 2.2.3 (#1, Oct 15 2003, 23:33:35)
[GCC 3.3.1 20030930 (Red Hat Linux 3.3.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information..... for i in range(10): yield i
........
0 1 2 3 4 5 6 7 8 9....
0 1 2 3 4 5 6 7 8 9....
0 1 2 3 4 5 6 7 8 9
yield_all=[k for k in x()]
yield_all [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
Regards
Adam Przybyla
 
F

Francis Girard

Hi,

No, this won't do. What is needed is a way to yield the results of a generator
from inside another generator with having to do a for-yield-loop inside the
outter generator.

Regards,

Francis Girard

Le mardi 1 Mars 2005 22:35, Adam Przybyla a écrit :
        ... mayby that way:
ython 2.2.3 (#1, Oct 15 2003, 23:33:35)
[GCC 3.3.1 20030930 (Red Hat Linux 3.3.1-6)] on linux2
Type "help", "copyright", "credits" or "license" for more information.

...  for i in range(10): yield i
...

...
0 1 2 3 4 5 6 7 8 9

...
0 1 2 3 4 5 6 7 8 9

...
0 1 2 3 4 5 6 7 8 9
yield_all=[k for k in x()]
yield_all

[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
 
F

Francis Girard

Oops. I meant "without having" instead of "with having" which is a syntax
error.

Regards

Le mardi 1 Mars 2005 22:53, Francis Girard a écrit :
 
D

Douglas Alan

Francis Girard said:
Therefore, the suggestion you make, or something similar, would have
actually ease my learning, at least for me.

Yes, I agree 100%. Not having something like "yield_all" hurt my
ability to learn to use Python's generators quickly because I figured
that Python had to have something like yield_all. But no matter how
hard I looked, I couldn't find anything about it in the manual.

So the argument that adding a feature makes the language harder to
learn is a specious one. Sometimes an extra feature makes the
language easier to learn.

|>oug
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,770
Messages
2,569,584
Members
45,075
Latest member
MakersCBDBloodSupport

Latest Threads

Top