Yielding a chain of values

Talin

I'm finding that in a lot of places within my code, I want to return the
output of a generator from another generator. Currently the only method
I know of to do this is to explicitly loop over the results from the
inner generator, and yield each one:

for x in inner():
    yield x
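Spelled out as a self-contained sketch (inner and outer are just illustrative names):

```python
def inner():
    # Some generator whose values we want to pass through unchanged.
    yield 1
    yield 2
    yield 3

def outer():
    # Explicit re-yield loop: the only way to delegate at this point.
    for x in inner():
        yield x

print(list(outer()))  # [1, 2, 3]
```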

I was wondering if there was a more efficient and concise way to do
this. And if there isn't, then what about extending the * syntax used
for lists, i.e.:

yield *inner()

The * would indicate that you want to iterate through the given
expression, and yield each value in turn. You could also use it on
ordinary lists:

yield *[1, 2, 3]

Anyway, just an idle thought on a Sunday morning :)
 
Peter Hansen

Talin said:
I'm finding that in a lot of places within my code, I want to return the
output of a generator from another generator. Currently the only method
I know of to do this is to explicitly loop over the results from the
inner generator, and yield each one:

for x in inner():
    yield x

I was wondering if there was a more efficient and concise way to do
this. And if there isn't, then what about extending the * syntax used
for lists, i.e.:

yield *inner()

It's not the first time it's been suggested. You can check the archives
perhaps for past discussions.

I think syntax like that could be a good idea too, for readability
reasons perhaps, but I don't think it's really important for efficiency
reasons. If you think about it, this involves one stack frame being set
up for the call to the generator, and then a series of quick context
switches (or whatever they're called in this situation) between the
current stack frame and the inner() one, as each yielded value is
produced, yielded, then re-yielded up to the calling frame (with another
quick context switch). No additional stack frames are generated, and
very few byte codes are involved:
.... for x in inner():
....     yield x
....
  2           0 SETUP_LOOP             21 (to 24)
              3 LOAD_GLOBAL             0 (inner)
              6 CALL_FUNCTION           0
              9 GET_ITER
        >>   10 FOR_ITER               10 (to 23)
             13 STORE_FAST              0 (x)

  3          16 LOAD_FAST               0 (x)
             19 YIELD_VALUE
             20 JUMP_ABSOLUTE          10
....

Yes, that is some overhead, but unless you are going many levels deep
(and that's usually a design smell of some kind) this isn't likely to be
noticed amongst the rest of the code which is presumably doing something
non-trivial to produce the values in the first place, and to consume
them ultimately.

The basic loop could be handled entirely from C with an appropriate
syntax addition as you suggest, but executing those four byte code
instructions is very quick, and there is no repeated (Python) function
call overhead as you would expect if inner() were not a generator.
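For reference, the loop's bytecode can be inspected with the `dis` module on any interpreter (opcode names and offsets vary across versions; the listing above is from the Python 2.4 era, while this sketch assumes Python 3):

```python
import dis

def inner():
    yield 1

def outer():
    # The delegation loop under discussion.
    for x in inner():
        yield x

# Prints the bytecode; note the single YIELD_VALUE inside the
# FOR_ITER loop and no per-value function-call setup.
dis.dis(outer)

opnames = [i.opname for i in dis.get_instructions(outer)]
print('YIELD_VALUE' in opnames)  # True
```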

-Peter
 
Michael Hudson

Talin said:
I'm finding that in a lot of places within my code, I want to return the
output of a generator from another generator. Currently the only
method I know of to do this is to explicitly loop over the results
from the inner generator, and yield each one:

for x in inner():
    yield x

I was wondering if there was a more efficient and concise way to do
this. And if there isn't,

Greenlets, perhaps? (for which, see google).

Cheers,
mwh
 
Bengt Richter

Michael said:
Greenlets, perhaps? (for which, see google).

Maybe

yield in inner()

could be sugar for the above and become something optimized?

Regards,
Bengt Richter
 
Reinhold Birkenfeld

Bengt said:
Maybe

yield in inner()

could be sugar for the above and become something optimized?

The problem here is that yield isn't a statement any more. It's now an
expression, so it is not easy to find new syntax around it.

Reinhold
 
Bengt Richter

Reinhold said:
The problem here is that yield isn't a statement any more. It's now an
expression, so it is not easy to find new syntax around it.

No, the idea was that it's still a statement, but what it
yields is "in inner()" which UIAM is illegal now, and would
signify "whatever sequence of elements is in inner()" --
really yield in seq -- I don't know what inner() was, but I assumed
an iterable.

Regards,
Bengt Richter
 
Peter Hansen

Bengt said:
No, the idea was that it's still a statement, but what it
yields is "in inner()" which UIAM is illegal now, and would
signify "whatever sequence of elements is in inner()" --
really yield in seq -- I don't know what inner() was, but I assumed
an iterable.

I believe he was referring indirectly to
http://www.python.org/peps/pep-0342.html (see item #1 in the
"Specification Summary" section), where yield will become an expression.
This PEP has been accepted, thus his use of present tense, confusing
though it is when it's not in the released version of Python yet.

Or I might be wrong. ;-)

-Peter
 
Bengt Richter

Peter said:
I believe he was referring indirectly to
http://www.python.org/peps/pep-0342.html (see item #1 in the
"Specification Summary" section), where yield will become an expression.
This PEP has been accepted, thus his use of present tense, confusing
though it is when it's not in the released version of Python yet.

Or I might be wrong. ;-)

Well, maybe it's right both ways ;-) I.e., even though yield "is" now
an expression, it is valid to use it as an expression-statement which
evaluates the expression and discards the value. So I think you could
still use the currently illegal "yield in" token sequence to mean that
what follows is to be taken as an iterable whose full sequence is
to be yielded sequentially as if

yield in iterable

were sugar for

for _ in iterable: yield _

Regards,
Bengt Richter
 
Matt Hammond

Bengt said:
Well, maybe it's right both ways ;-) I.e., even though yield "is" now
an expression, it is valid to use it as an expression-statement which
evaluates the expression and discards the value. So I think you could
still use the currently illegal "yield in" token sequence to mean that
what follows is to be taken as an iterable whose full sequence is
to be yielded sequentially as if

yield in iterable

were sugar for

for _ in iterable: yield _

"yield in" could make sense when thought of as an expression too.

x = yield in iterable

would behave like a list comprehension: x would be assigned a list
containing the results of the successive yields. Equivalent to:

x = [ yield r for r in iterable ]

regards


Matt
 
Reinhold Birkenfeld

Matt said:
Well, maybe it's right both ways ;-) I.e., even though yield "is" now
an expression, it is valid to use it as an expression-statement which
evaluates the expression and discards the value. So I think you could
still use the currently illegal "yield in" token sequence to mean that
what follows is to be taken as an iterable whose full sequence is
to be yielded sequentially as if

yield in iterable

were sugar for

for _ in iterable: yield _

"yield in" could make sense when thought of as an expression too.

x = yield in iterable

would behave like a list comprehension: x would be assigned a list
containing the results of the successive yields. Equivalent to:

x = [ yield r for r in iterable ]

Which is quite different from

x = (yield) in iterable

which is currently (PEP 342) equivalent to

_ = (yield)
x = _ in iterable

So, no further tinkering with yield, I'm afraid.
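Under PEP 342 (now accepted), that desugaring can be checked directly with send(); a minimal sketch, with g a made-up name:

```python
def g(iterable):
    _ = (yield)          # value arrives via send()
    x = _ in iterable    # the membership test from the desugaring above
    yield x              # hand the result back so we can observe it

gen = g([1, 2, 3])
next(gen)            # advance to the bare (yield)
print(gen.send(2))   # True: 2 is in [1, 2, 3]
```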

Reinhold
 
Kay Schluehr

Reinhold said:
x = [ yield r for r in iterable ]

Which is quite different from

x = (yield) in iterable

which is currently (PEP 342) equivalent to

_ = (yield)
x = _ in iterable

So, no further tinkering with yield, I'm afraid.

Reinhold

Is the statement

yield from iterable

also in danger of being ambiguous?

The resolution of "(yield) from iterable" into

_ = (yield)
x = _ from iterable

would not result in valid Python syntax.

Kay
 
Reinhold Birkenfeld

Kay said:
Reinhold said:
x = [ yield r for r in iterable ]

Which is quite different from

x = (yield) in iterable

which is currently (PEP 342) equivalent to

_ = (yield)
x = _ in iterable

So, no further tinkering with yield, I'm afraid.

Reinhold

Is the statement

yield from iterable

also in danger of being ambiguous?

The resolution of "(yield) from iterable" into

_ = (yield)
x = _ from iterable

would not result in valid Python syntax.

Right.

Problem is, how would you define the "from" syntax: Is its use as
an expression allowed? What value does it have, then?

Reinhold
 
Kay Schluehr

Reinhold said:
Kay said:
Reinhold said:
x = [ yield r for r in iterable ]

Which is quite different from

x = (yield) in iterable

which is currently (PEP 342) equivalent to

_ = (yield)
x = _ in iterable

So, no further tinkering with yield, I'm afraid.

Reinhold

Is the statement

yield from iterable

also in danger of being ambiguous?

The resolution of "(yield) from iterable" into

_ = (yield)
x = _ from iterable

would not result in valid Python syntax.

Right.

Problem is, how would you define the "from" syntax: Is its use as
an expression allowed? What value does it have, then?

Reinhold

Do you mean statements like this?

x = (yield from [1,2,3])

I do think that such "yield-comprehensions" may be valid and raise a
StopIteration exception after being called three times by means of
next() or send().
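As it turned out, Python later grew exactly this construct: PEP 380 (Python 3.3) makes `yield from iterable` an expression whose value is the subgenerator's return value. A sketch under that later syntax:

```python
def inner():
    yield 1
    yield 2
    yield 3
    return "done"    # becomes the value of the yield from expression

def outer():
    x = yield from inner()   # delegates all three yields, then binds x
    yield x

gen = outer()
print(next(gen), next(gen), next(gen))  # 1 2 3
print(next(gen))                        # done
```

A fifth next() raises StopIteration, much as the "yield-comprehension" sketched above would.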

Kay
 
viridia

This is why my original proposal used the '*' operator rather than a
keyword. The reasoning behind this is as follows: When calling a
function, a parameter of the form "*expression" expands to a list of
arguments. From the Python reference manual:

"If the syntax '*expression' appears in the function call, 'expression'
must evaluate to a sequence. Elements from this sequence are treated as
if they were additional positional arguments."

By (somewhat stretched) analogy, "yield *expression" would "expand" the
sequence into a series of multiple yields.

Of course, now that yield is an expression (as stated above), you would
indeed have to account for the fact that the yield statement used in
this way would yield a sequence of values:

for x in yield *[1, 2, 3]:
    print x

So the '*' would not only signal that multiple values were being
yielded, but that multiple values are also being returned.

-- Talin
 
yairchu

dude - this business is so confusing that you actually have to *think*
about it!
but python is all about simplicity.
with python, when I program - I don't think *about* it - I think it. or
something - don't make me think about it.

so how about a "reyield" or some other new keyword (cause reyield is
too quirky) instead of joining stuff which once meant something (one
thing)?

huh?
Yair.
yairchu a@T gmail
 
Kay Schluehr

dude - this business is so confusing that you actually have to *think*
about it!
but python is all about simplicity.
with python, when I program - I don't think *about* it - I think it. or
something - don't make me think about it.

so how about a "reyield" or some other new keyword (cause reyield is
too quirky) instead of joining stuff which once meant something (one
thing)?

What about dleiy? I guess it thinks me.

Kay
 
aurora00

unless you are going many levels deep
(and that's usually a design smell of some kind)

No, it's not a bug, it's a feature! See the discussion in the recursive
generator thread below:

http://groups.google.com/group/comp...q=recursive+generator&rnum=1#36f2b915eba66eac

In my opinion, traversing recursive data structures is where generators
shine best. An alternative implementation using iterators is a lot more
difficult and a lot less elegant. Unfortunately, right now a recursive
generator carries a price tag of O(n^2) relative to the level of depth.
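For instance, a depth-first walk of a nested-list tree is a short recursive generator (a sketch with illustrative names):

```python
def walk(node):
    # Depth-first traversal: recurse into sublists, re-yield each value.
    for child in node:
        if isinstance(child, list):
            for leaf in walk(child):
                yield leaf
        else:
            yield child

tree = [1, [2, [3, 4]], 5]
print(list(walk(tree)))  # [1, 2, 3, 4, 5]
```

Each value at depth d is re-yielded through d enclosing loops on its way out, which is where the per-depth cost under discussion comes from.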
 
Terry Reedy

(and that's usually a design smell of some kind)

No, it's not a bug, it's a feature! See the discussion in the recursive
generator thread below:
http://groups.google.com/group/comp...q=recursive+generator&rnum=1#36f2b915eba66eac

In my opinion, traversing recursive data structures is where generators
shine best. An alternative implementation using iterators is a lot more
difficult and a lot less elegant. Unfortunately, right now a recursive
generator carries a price tag of O(n^2) relative to the level of depth.

The problem with asymptotic 'big O' notation is that it leaves out both the
constant multiplier and lesser terms, and promotes the misperception that
'lower order' asymptotic behavior is always preferable. But much real
computation is done with small and effectively non-asymptotic values where
the omitted components *are* significant.

In this case, the O(depth^2) cost applies, I believe, to resumptions (and
yields) of suspended generators, which are *much* faster than function
calls, so that the omitted multiplier is relatively small. Given that
there is also at least one function call cost for each tree node, I expect
one would need a somewhat deep (intentionally vague without specific timing
data) and unbalanced tree for the resume price to be worrisome.

In any case, having an easily written and understood version can help in
testing a faster and more complicated version, especially on random,
non-corner case examples.

Terry J. Reedy
 
aurora00

I'm not really worried that much about O(n^2) performance (especially
having optimized some O(n^3) SQL operations :-o !)

The thing is, this really should be an O(n) operation. Having a yield
all statement or expression is useful in its own right and also
potentially a way to optimize away the O(n^2) issue.
 
