proposed PEP: iterator splicing


Paul Rubin

The boilerplate

def some_gen():
    ...
    for x in some_other_gen():
        yield x
    ...

is so common (including the case where some_other_gen is the same as
some_gen, i.e. it's a recursive call) that I find myself wanting
a more direct way to express it:

def some_gen():
    ...
    yield *some_other_gen()

comes to mind. Less clutter, and avoids yet another temp variable
polluting the namespace.
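For the recursive case, a toy tree flattener shows the pattern (a sketch only, not part of the proposal):

def flatten(tree):
    # today's spelling: splice the recursive call in by hand
    for node in tree:
        if isinstance(node, list):
            for x in flatten(node):
                yield x
        else:
            yield node

Under the proposal, the inner two lines would collapse to just
`yield *flatten(node)`.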

Thoughts?
 

Carsten Haese

On 14 Apr 2007 22:17:08 -0700, Paul Rubin wrote:
The boilerplate

def some_gen():
    ...
    for x in some_other_gen():
        yield x
    ...

is so common (including the case where some_other_gen is the same as
some_gen, i.e. it's a recursive call) that I find myself wanting
a more direct way to express it:

def some_gen():
    ...
    yield *some_other_gen()

comes to mind. Less clutter, and avoids yet another temp variable
polluting the namespace.

Thoughts?

Explicit is better than implicit. The explicit boilerplate code is more
readable and tells the reader exactly what's going on. Since the namespace is
the local namespace of your generator function, the namespace pollution is
minimal.

-Carsten
 

Paul Rubin

John Nagle said:
Are we in danger of running out of temp variables?

There is unfortunately no way to contain the scope of a loop index to the
inside of the loop. Therefore, introducing more useless loop indexes creates
more bookkeeping work and more bug attractants. Better to get rid of them.
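A quick illustration of the leak:

for i in range(3):
    pass
print i    # i is still bound after the loop: prints 2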
 

Anton Vredegoor

Paul said:
def some_gen():
    ...
    yield *some_other_gen()

comes to mind. Less clutter, and avoids yet another temp variable
polluting the namespace.

Thoughts?

Well, not directly related to your question, but maybe these are some
ideas that would help determine what we think generators are and what we
would like them to become.

I'm currently also fascinated by the new generator possibilities, for
example sending back a value to the generator by making yield return a
value. What I would like to use it for is when I have a very long
generator and I need just a slice of the values. That would mean running
through a loop, discarding all the values until the generator is in the
desired state, and only then starting to do something with the output.
Instead I would like to directly set or 'wind' a generator, like a file,
into some specific state. That would mean having some convention for
generators signaling their length (like xrange):

>>> it = xrange(100)
>>> len(it)
100

Note: xrange doesn't create a list, but it has a length!

Also we would need some convention for a generator to signal that it can
jump to a certain state without computing all previous values. That
means the computations are independent and could for example be
distributed across different processors or threads.
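As a sketch of what I mean (a toy example of mine, not an existing API): when
the values are independent, a generator can simply accept a starting state,
so nothing has to be discarded:

def squares(start=0):
    # each value depends only on its index, so the generator can be
    # 'wound' to any position without computing earlier values
    n = start
    while True:
        yield n * n
        n += 1

g = squares(10**6)    # jump straight to the millionth square

Lists, by contrast, already support this kind of jumping via slicing: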
>>> it = range(100)
>>> it[50:]
[50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67,
68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85,
86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99]
But currently this doesn't work for xrange:
>>> it = xrange(100)
>>> it[50:]
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: sequence index must be integer

Even though xrange *could* know somehow what its slice would look like.
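Indeed, for the simple step-1 case the slice can be computed from the
endpoints alone (a hypothetical helper of mine, not a real xrange method):

def slice_xrange(n, start, stop):
    # a step-1 slice of xrange(n) is itself an xrange;
    # no looping over the skipped values is needed
    return xrange(start, min(stop, n))

list(slice_xrange(100, 50, 55))   # [50, 51, 52, 53, 54]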

Another problem I have is with the itertools module: passing a long integer
index to islice raises an error.

I want islice and related functions to accept long integers for indexing.
But of course this only makes sense once generators can jump to a slice
directly; otherwise there is no practical way to reach such big numbers,
short of silently looping through the sequence until one arrives at the
point one is interested in.

I have also thought about the thing you are proposing. There is
itertools.chain, of course, but that only works when one can call it from
'outside' the generator.
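To illustrate the difference (my example):

from itertools import chain

def inner():
    yield 1
    yield 2

# from outside, chain splices fine:
combined = chain([0], inner())          # 0, 1, 2

# but inside a generator body we are back to the boilerplate:
def outer():
    yield 0
    for x in inner():
        yield x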

Suppose one wants to generate all unique permutations of something. One
idea would be to sort the sequence first and then generate successors
until the sequence is completely reversed. But what if one wants to start
from the actual state the sequence is in? One could generate successors
until one reaches the 'end' and then continue generating successors from
the 'beginning' until one reaches the original state again. Note that by
changing the cmp function this generator could also iterate in reverse
from any point; there would only need to be a way to change the cmp
function of a running generator instance.

from operator import ge, le
from itertools import chain

def mutate(R, i, j):
    # Swap R[i] and R[j], then reverse everything after position i.
    a, b, c, d, e = R[:i], R[i:i+1], R[i+1:j], R[j:j+1], R[j+1:]
    return a + d + (c + b + e)[::-1]

def _pgen(L, cmpf=ge):
    # Yield successive unique permutations of L, stepping in the
    # direction determined by cmpf (ge: forward, le: backward).
    R = L[:]
    yield R
    n = len(R)
    if n >= 2:
        while True:
            i, j = n-2, n-1
            while cmpf(R[i], R[i+1]):
                i -= 1
                if i == -1:
                    return
            while cmpf(R[i], R[j]):
                j -= 1
            R = mutate(R, i, j)
            yield R

def perm(L):
    # Chain the forward generator with the backward one, so iteration
    # starts from L's current state and wraps around.
    F = _pgen(L)
    B = _pgen(L, le)
    B.next()  # skip the initial state, already yielded by F
    return chain(F, B)

def test():
    # prints the 30 unique permutations of '12124'
    P = '12124'
    g = perm(P)
    for i, x in enumerate(g):
        print '%4i) %s' % (i, x)

if __name__ == '__main__':
    test()
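And here is a sketch of the last idea, using the new send() protocol to
swap the comparison function of a running instance (my own variant,
reusing mutate() and the operators defined above; not tested against all
edge cases):

def _pgen2(L, cmpf=ge):
    R = L[:]
    n = len(R)
    while True:
        # the caller may send() a new comparison function here,
        # changing the direction of iteration on the fly
        newcmp = yield R
        if newcmp is not None:
            cmpf = newcmp
        if n < 2:
            return
        i, j = n-2, n-1
        while cmpf(R[i], R[i+1]):
            i -= 1
            if i == -1:
                return
        while cmpf(R[i], R[j]):
            j -= 1
        R = mutate(R, i, j)

g = _pgen2(list('1234'))
g.next()         # ['1', '2', '3', '4']
g.next()         # ['1', '2', '4', '3']
g.send(le)       # ['1', '2', '3', '4'] again, now stepping backward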

A.
 

Steven Bethard

Paul said:
The boilerplate

def some_gen():
    ...
    for x in some_other_gen():
        yield x
    ...

is so common (including the case where some_other_gen is the same as
some_gen, i.e. it's a recursive call) that I find myself wanting
a more direct way to express it:

def some_gen():
    ...
    yield *some_other_gen()

comes to mind. Less clutter, and avoids yet another temp variable
polluting the namespace.

Thoughts?

This has been brought up before and there was mixed support:

http://www.python.org/dev/summary/2006-01-16_2006-01-31/#yielding-from-a-sub-generator

My guess is that the only way it has any chance is if someone takes the
time to implement it and posts a full-fledged PEP to python-dev.

STeVe
 

Kay Schluehr

Anton Vredegoor said:
I'm currently also fascinated by the new generator possibilities, for
example sending back a value to the generator by making yield return a
value. What I would like to use it for is when I have a very long
generator and I need just a slice of the values. That would mean running
through a loop, discarding all the values until the generator is in the
desired state, and only then starting to do something with the output.
Instead I would like to directly set or 'wind' a generator, like a file,
into some specific state.

Maybe you should start by developing a design pattern first and
publish it in the Cookbook. I have the fuzzy impression that the idea
you are after requires more powerful control structures, such as
delimited continuations, which are beyond the scope of Python's simple
coroutines.

Kay
 

Anton Vredegoor

Kay said:
Maybe you should start by developing a design pattern first and
publish it in the Cookbook. I have the fuzzy impression that the idea
you are after requires more powerful control structures, such as
delimited continuations, which are beyond the scope of Python's simple
coroutines.

I was reacting to a request for thoughts. I firmly believe that there
should be a relatively unrestricted brainstorming phase before one tries
to implement things. I have often noticed a lot of counterproductive
bickering about use cases and implementations on the developers list,
but luckily not so much here. There is such a thing as premature
optimization with respect to creative processes too!

A.
 

Georg Brandl

Steven said:
This has been brought up before and there was mixed support:

http://www.python.org/dev/summary/2006-01-16_2006-01-31/#yielding-from-a-sub-generator

My guess is that the only way it has any chance is if someone takes the
time to implement it and posts a full-fledged PEP to python-dev.

BTW, I've implemented a different feature, namely extended unpacking, such as

a, *b, c = range(10)

where I already had to add the concept of a "starred" expression.
If someone wants to do it, we can probably share some code.
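For the record, the intended semantics give the starred name a list of
the leftover items:

a, *b, c = range(10)
# a == 0
# b == [1, 2, 3, 4, 5, 6, 7, 8]
# c == 9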


Georg
 
