Generators.

Discussion in 'Python' started by Jorge Cardona, Dec 6, 2009.

  1. Hi,

    I was trying to create a function that receives a generator and
    returns a list, but where each element is computed on a different core
    of my machine. I started using the islice function to split the job so
    that, if there are "n" cores, core "i" computes the elements
    i, i+n, i+2n, ..., but islice has a weird (to me) behavior; look:

    from itertools import islice

    def f(x):
        print("eval: %d" % x)
        return x

    X = range(10)
    g = (f(x) for x in X)

    print(list(x for x in islice(g, 0, None, 2)))

    $ python2.5 test.py
    eval: 0
    eval: 1
    eval: 2
    eval: 3
    eval: 4
    eval: 5
    eval: 6
    eval: 7
    eval: 8
    eval: 9
    [0, 2, 4, 6, 8]
    $ python2.7 test.py
    eval: 0
    eval: 1
    eval: 2
    eval: 3
    eval: 4
    eval: 5
    eval: 6
    eval: 7
    eval: 8
    eval: 9
    [0, 2, 4, 6, 8]
    $ python3.0 test.py
    eval: 0
    eval: 1
    eval: 2
    eval: 3
    eval: 4
    eval: 5
    eval: 6
    eval: 7
    eval: 8
    eval: 9
    [0, 2, 4, 6, 8]

    islice executes the function of the generator and drops the elements
    that aren't in the slice. I find that pretty weird. The way I see
    generators is as an association between an indexing set (an iterator
    or another generator) and a computation indexed by that set, and
    islice is like a "transformation" on the indexing set: the result of
    the function doesn't matter, so the slice should act only on the
    indexing set. Some other "transformations", like takewhile, act on the
    result, so there the execution has to happen; but for islice, or any
    other "transformation" that acts only on the indexing set, the
    function shouldn't be executed for each element, only for the elements
    of the new set that results from applying the "transformation" to the
    original set.
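
    To spell out that model on the example above (a minimal sketch using
    the same f and X; the replies below suggest exactly this reversal):

    from itertools import islice

    # map first, then slice: f runs on all 10 elements (current behavior)
    list(islice((f(x) for x in X), 0, None, 2))

    # slice the indexing set first, then map: f runs on only 5 elements
    list(f(x) for x in islice(X, 0, None, 2))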

    I searched a little and found that gi_frame.f_locals['.0'] holds the
    inner indexing iterator of a generator, so I decided to define my own
    islice like this:

    import sys


    if sys.version_info[0] != 3:
        def next(it):
            return it.next()


    def f(x):
        print("eval: %d" % x)
        return x


    def islice(iterable, *args):
        s = slice(*args)

        # search for the deepest iterator (the base indexing set)
        it = iterable
        while hasattr(it, 'gi_frame'):
            it = it.gi_frame.f_locals['.0']

        # consume the base indexing set up to the first element
        for i in range(s.start or 0):
            next(it)

        for e in iterable:
            yield e

            # consume the base indexing set up to the next element
            for i in range((s.step or 1) - 1):
                next(it)


    X = range(10)
    g = (f(x) for x in X)

    print(list(x for x in islice(g,0,None,2)))


    jcardona@terminus:/tmp$ python2.5 test.py
    eval: 0
    eval: 2
    eval: 4
    eval: 6
    eval: 8
    [0, 2, 4, 6, 8]
    jcardona@terminus:/tmp$ python2.7 test.py
    eval: 0
    eval: 2
    eval: 4
    eval: 6
    eval: 8
    [0, 2, 4, 6, 8]
    jcardona@terminus:/tmp$ python3.0 test.py
    eval: 0
    eval: 2
    eval: 4
    eval: 6
    eval: 8
    [0, 2, 4, 6, 8]

    Well, it works for what I need, but it is not very neat, and I think
    there should be a formal way to act on the base indexing iterator.
    Does such a way exist? Is there a better approach to get what I need?

    Thanks.


    --
    Jorge Eduardo Cardona

    jorgeecardona.blogspot.com
    ------------------------------------------------
    Linux registered user #391186
    Registered machine #291871
    ------------------------------------------------
     
    Jorge Cardona, Dec 6, 2009
    #1

  2. Lie Ryan Guest

    On 12/7/2009 7:22 AM, Jorge Cardona wrote:
    > Hi,
    >
    > I was trying to create a function that receive a generator and return
    > a list but that each elements were computed in a diferent core of my
    > machine. I start using islice function in order to split the job in a
    > way that if there is "n" cores each "i" core will compute the elements
    > i,i+n,i+2n,..., but islice has a weird (to me) behavior, look:


    it's nothing weird; Python just does what you're telling it to:

    transform all x in X with f(x)
    > g = (f(x) for x in X)

    then slice the result of that
    > print(list(x for x in islice(g,0,None,2)))


    what you want to do is to slice before you transform:
    >>> g = (x for x in islice(X, 0, None, 2))
    >>> print(list(f(x) for x in g))

    eval: 0
    eval: 2
    eval: 4
    eval: 6
    eval: 8
    [0, 2, 4, 6, 8]

    > islice executes the function of the generator and drops the elements
    > that aren't in the slice. [...] the slice should act only on the
    > indexing set; [...] for islice, or any other "transformation" that
    > acts only on the indexing set, the function shouldn't be executed
    > for each element, only for the elements of the new set that results
    > from applying the "transformation" to the original set.


    that seems like extremely lazy evaluation; I don't know if even a true
    lazy language does that. Python is a strict language with a bit of
    laziness provided by generators; in the end it's still a strict
    language.

    > Well, it works for what I need, but it is not very neat, and I think
    > there should be a formal way to act on the base indexing iterator.
    > Does such a way exist? Is there a better approach to get what I need?


    Reverse your operation.
     
    Lie Ryan, Dec 7, 2009
    #2

  3. 2009/12/7 Lie Ryan <>:
    > On 12/7/2009 7:22 AM, Jorge Cardona wrote:
    >> [snip]
    >
    > it's nothing weird; Python just does what you're telling it to:
    >
    > transform all x in X with f(x)
    >> g = (f(x) for x in X)
    >
    > then slice the result of that
    >> print(list(x for x in islice(g,0,None,2)))

    When I wrote that first line of code I thought I was creating a
    generator that would later compute the elements of X with the function
    f; if I wanted to transform them immediately I would use [] instead of
    (). So the result is not where I want to slice. Even so, slicing after
    or before yields exactly the same elements; the only difference is
    that by slicing after the computation I'm losing 5 unnecessary
    executions of the function f.

    > what you want to do is to slice before you transform:
    >>>> g = (x for x in islice(X, 0, None, 2))
    >>>> print(list(f(x) for x in g))

    > eval: 0
    > eval: 2
    > eval: 4
    > eval: 6
    > eval: 8
    > [0, 2, 4, 6, 8]
    >


    What I want to do is a function that receives any kind of generator,
    executes it on several cores (after a fork), and returns the data; so
    I can't slice the set X before creating the generator.
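
    For concreteness, a rough sketch (the helper name and details are
    hypothetical, not settled code) of the kind of function I mean: fork
    one worker per core and let worker i keep elements i, i+n, i+2n, ...
    of the generator it inherited across the fork. Note that with the
    stock islice every worker still evaluates every element and merely
    discards the ones outside its stride, which is exactly the waste being
    discussed:

    import os
    import pickle
    from itertools import islice

    def smp_list(gen, n):
        # hypothetical sketch; POSIX-only, and assumes small picklable
        # results (a real version would need careful streaming/reaping)
        readers = []
        for i in range(n):
            r, w = os.pipe()
            if os.fork() == 0:                  # child: compute stride i
                os.close(r)
                with os.fdopen(w, 'wb') as pipe:
                    pickle.dump(list(islice(gen, i, None, n)), pipe)
                os._exit(0)
            os.close(w)
            readers.append(r)
        parts = []
        for r in readers:
            with os.fdopen(r, 'rb') as pipe:
                parts.append(pickle.load(pipe))
            os.wait()                           # reap one finished child
        # interleave the strides back into the original order
        out = []
        for j in range(max(map(len, parts))):
            out.extend(p[j] for p in parts if j < len(p))
        return out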

    >> islice executes the function of the generator and drops the
    >> elements that aren't in the slice. [...]
    >
    > that seems like extremely lazy evaluation; I don't know if even a
    > true lazy language does that. Python is a strict language with a bit
    > of laziness provided by generators; in the end it's still a strict
    > language.
    >


    Yes, it looks like lazy evaluation, but I don't see why there isn't
    better control over the iterable associated with a generator, even in
    Python, which is a strict language; it would increase Python's
    functionality, and its performance too. Imagine that you pass a
    function that takes 1 second to run, and for several reasons you can't
    slice beforehand (as in the smp function that I want to create): with
    the current islice the final performance gets really degraded.
    Just by separating the transformations that act on the index (islice,
    or tee) from those that act on the result (dropwhile, takewhile,
    etc.), the control could be fine-grained enough to increase usability
    (that's the way I think right now), and you would be able to combine
    generators without losing performance.
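
    To make the cost concrete (a small timing sketch, with sleep standing
    in for the hypothetical expensive function):

    import time
    from itertools import islice

    def slow(x):
        time.sleep(0.1)                   # stand-in for the slow work
        return x

    g = (slow(x) for x in range(10))
    t0 = time.time()
    print(list(islice(g, 0, None, 2)))    # keeps 5 results...
    print(time.time() - t0)               # ...but pays for all 10 calls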

    >> Well, it works for what I need, but it is not very neat, and I
    >> think there should be a formal way to act on the base indexing
    >> iterator. Does such a way exist? Is there a better approach to get
    >> what I need?
    >
    > Reverse your operation.
    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >




    --
    Jorge Eduardo Cardona

    jorgeecardona.blogspot.com
    ------------------------------------------------
    Linux registered user #391186
    Registered machine #291871
    ------------------------------------------------
     
    Jorge Cardona, Dec 7, 2009
    #3
  4. Taylor Guest

    On Dec 7, 1:29 pm, Jorge Cardona <> wrote:
    > [snip]


    What would you have your islice do for the following generator?

    def fib(n):
        a, b = 0, 1
        for x in range(n):
            a, b = b, a + b
            yield a

    In order for some value it yields to be correct, it needs to execute
    all the previous iterations. If it happens that some of the results
    aren't used, they are thrown out. As is, using your islice (on, say,
    fib(10)) gives a KeyError for the key '.0'.
    The point is, generators don't work that way. You aren't guaranteed to
    be able to jump around, only to find the next value. If you need to
    split a particular generator up into parts that can be computed
    separately, your best bet is probably to rewrite that generator, as
    sketched below.
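
    One way to read that advice (a sketch; "expensive" is a hypothetical
    stand-in for the costly per-element work): keep the cheap recurrence
    as the stream that gets sliced, and apply the expensive function only
    afterwards, which is again the "reverse your operation" suggestion:

    from itertools import islice

    def fib(n):
        # producing the values is cheap; only the per-element work was
        # expensive
        a, b = 0, 1
        for x in range(n):
            a, b = b, a + b
            yield a

    def expensive(x):                     # hypothetical costly work
        print("eval: %d" % x)
        return x

    # slice the cheap stream, then pay only for the survivors
    print([expensive(x) for x in islice(fib(10), 0, None, 2)])
    # eval: 1, 2, 5, 13, 34  ->  [1, 2, 5, 13, 34]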
     
    Taylor, Dec 8, 2009
    #4
  5. Lie Ryan Guest

    First, I apologize for rearranging your message out of order.

    On 12/8/2009 5:29 AM, Jorge Cardona wrote:
    >>> islice executes the function of the generator and drops the
    >>> elements that aren't in the slice. [...] for islice, or any other
    >>> "transformation" that acts only on the indexing set, the function
    >>> shouldn't be executed for each element, only for the elements of
    >>> the new set that results from applying the "transformation" to the
    >>> original set.
    >>
    >> that seems like extremely lazy evaluation; I don't know if even a
    >> true lazy language does that. Python is a strict language with a
    >> bit of laziness provided by generators; in the end it's still a
    >> strict language.
    >>

    >
    > Yes, it looks like lazy evaluation, but I don't see why there isn't
    > better control over the iterable associated with a generator, even
    > in Python, which is a strict language [...] you would be able to
    > combine generators without losing performance.


    Theoretically yes, but the semantics of generators in Python is that
    they work on an Iterable (i.e. objects that have __iter__), not a
    Sequence (i.e. objects that have __getitem__). That means that,
    semantically, a generator calls obj.__iter__() and then calls the
    iterator's __next__(), doing its operation on each item the iterator
    returns.

    The lazy semantics would be hard to fit into the current generator
    model without changing the semantics of generators to require a
    Sequence that supports indexing.
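
    Roughly, what a generator expression does under that semantics (a
    sketch, not the actual implementation):

    # what (f(x) for x in obj) amounts to:
    def genexp(f, obj):
        it = iter(obj)             # Iterable contract: just __iter__
        while True:
            try:
                x = next(it)       # one value at a time; no __getitem__,
            except StopIteration:  # so no way to jump ahead for free
                return
            yield f(x)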

    > Yes, it looks like lazy evaluation, but I don't see why there isn't
    > better control over the iterable associated with a generator, even
    > in Python, which is a strict language


    You can control the laziness by making it explicitly lazy:

    from itertools import islice
    from functools import partial

    def f(x):
        print("eval: %d" % x)
        return x

    X = range(10)
    g = (partial(f, x) for x in X)

    print(list(x() for x in islice(g, 0, None, 2)))
    # # or without partial (binding x early via a default argument):
    # g = ((lambda x=x: f(x)) for x in X)
    # print(list(f() for f in islice(g, 0, None, 2)))

    In a default-strict language, you have to explicitly say if you want
    lazy execution.

    > What I want to do is a function that receives any kind of generator,
    > executes it on several cores (after a fork), and returns the data;
    > so I can't slice the set X before creating the generator.


    beware that a generator's contract is to return a valid iterator
    *once* only. You can use itertools.tee() to create more iterators, but
    tee builds a list of the results internally.
     
    Lie Ryan, Dec 8, 2009
    #5
  6. 2009/12/7 Taylor <>:
    > [snip]
    >
    > What would you have your islice do for the following generator?
    >
    > def fib(n):
    >    a,b=0,1
    >    for x in range(n):
    >        a,b=b,a+b
    >        yield a
    >
    > In order for some value it yields to be correct, it needs to execute
    > all the previous iterations. If it happens that some of the results
    > aren't used, they are thrown out. As is, using your islice (on, say,
    > fib(10)) gives a KeyError for the key '.0'.
    > Point is, generators don't work that way. You aren't guaranteed to
    > be able to jump around, only to find the next value. If you need to
    > split a particular generator up into parts that can be computed
    > separately, your best bet is probably to rewrite that generator.


    Yes, it doesn't work with that example; f_locals has only the n
    argument. Let me rewrite islice like this:

    def islice(iterable, *args):
        s = slice(*args)

        # search for the deepest iterator (the base indexing set)
        it = iterable
        while hasattr(it, 'gi_frame') and ('.0' in it.gi_frame.f_locals):
            it = it.gi_frame.f_locals['.0']

        # consume the base indexing set up to the first element
        for i in range(s.start or 0):
            next(it)

        for e in iterable:
            yield e

            # consume the base indexing set up to the next element
            for i in range((s.step or 1) - 1):
                next(it)

    And then:

    def fib(n):
        a, b = 0, 1
        for x in range(n):
            a, b = b, a + b
            yield a

    g = fib(10)
    print(list(x for x in islice(g,0, None,2)))

    Will result in:
    [1, 2, 5, 13, 34]

    How I see that particular example is that the Fibonacci generator is
    itself a base indexing set: note that you didn't define it in terms of
    an association between another set and a particular function, even
    though there is a range(n) and a sum operation inside. I can't think
    at this moment of a way to represent Fibonacci in generator-expression
    terms.

    The fact that state has to be preserved between the yielded items of
    the generator defined by fib(n) (in order to avoid a recursive
    implementation) makes it impossible to define the underlying function
    as fib(n), because it really depends on that state (fib(state)); so a
    unique indexing set couldn't be used here without maintaining an
    auxiliary set that holds that state at each computation.
    Even with Fibonacci there is only the need to maintain the state
    variables and create a full range of numbers without knowing a priori
    the n that maps to each result; with factorial one can hold a state
    that represents the previous factorial and call the function with the
    present n. Maintaining the state variables is permitted by defining
    the generator as a function with yield, and in your fib that state is
    maintained in a and b.

    (((Is there any "clean" way to define a (fib(x) for x in count())
    without falling into recursion errors and high memory usage? As a
    generator expression, I can't think of one right now; but see the
    sketch below.)))
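
    One possible answer to that parenthetical (a sketch, not from the
    thread): make fib a pure function of its index via fast doubling, so
    that (fib(x) for x in count(1)) really is a function over an indexing
    set, and the indexing set itself can be sliced:

    from itertools import count, islice

    def fib(n):
        # fast doubling: F(2k) = F(k)*(2*F(k+1) - F(k)),
        #                F(2k+1) = F(k)^2 + F(k+1)^2
        def fd(k):                     # returns (F(k), F(k+1))
            if k == 0:
                return (0, 1)
            a, b = fd(k // 2)          # recursion depth is only log2(k)
            c = a * (2 * b - a)
            d = a * a + b * b
            return (c, d) if k % 2 == 0 else (d, c + d)
        return fd(n)[0]

    idx = islice(count(1), 0, 10, 2)   # slice the indexing set itself
    print([fib(x) for x in idx])       # [1, 2, 5, 13, 34]; 5 calls only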

    A new element shows up here (well, or I just noticed it at this
    moment): a generator defined as a function with yield cannot be seen
    as an association between an indexing set and a function, but it can
    serve as the indexing set of new generators, as in:

    g = (f(x) for x in fib(10))
    print(list(x for x in islice(g, 0, None, 2)))

    which results in:

    eval: 1
    eval: 2
    eval: 5
    eval: 13
    eval: 34
    [1, 2, 5, 13, 34]

    I just downloaded the unittests for itertools from the svn of
    python2.7:

    test_islice (__main__.TestGC) ... ok

    and with the islice above substituted in:

    test_islice (__main__.TestGC) ... ok


    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >




    --
    Jorge Eduardo Cardona

    jorgeecardona.blogspot.com
    ------------------------------------------------
    Linux registered user #391186
    Registered machine #291871
    ------------------------------------------------
     
    Jorge Cardona, Dec 8, 2009
    #6
  7. 2009/12/8 Lie Ryan <>:
    > First, I apologize for rearranging your message out of order.
    >
    > On 12/8/2009 5:29 AM, Jorge Cardona wrote:
    >>>> islice executes the function of the generator and drops the
    >>>> elements that aren't in the slice. [...]
    >>>
    >>> that seems like extremely lazy evaluation; I don't know if even a
    >>> true lazy language does that. [...]
    >>
    >> Yes, it looks like lazy evaluation, but I don't see why there isn't
    >> better control over the iterable associated with a generator [...]
    >> you would be able to combine generators without losing performance.
    >
    > Theoretically yes, but the semantics of generators in Python is that
    > they work on an Iterable (i.e. objects that have __iter__), not a
    > Sequence (i.e. objects that have __getitem__). That means that,
    > semantically, a generator calls obj.__iter__() and then calls the
    > iterator's __next__(), doing its operation on each item the iterator
    > returns.
    >
    > The lazy semantics would be hard to fit into the current generator
    > model without changing the semantics of generators to require a
    > Sequence that supports indexing.
    >


    Why?

    The goal is to add a formal way to separate the transformations of a
    generator into those that act on the indexing set and those that act
    on the result set.

    Well, a little (and not very elaborate) example could be:

    from itertools import islice


    class MyGenerator:
        def __init__(self, function, indexing_set):
            self._indexing_set = (x for x in indexing_set)
            self._function = function

        def indexing_set(self):
            return (x for x in self._indexing_set)

        def result_set(self):
            return (self._function(x) for x in self.indexing_set())

        def function(self):
            return self._function

    def f(x):
        print("eval: %d" % x)
        return x

    def myslice(iterable, *args):
        return MyGenerator(iterable.function(),
                           islice(iterable.indexing_set(), *args))


    g = MyGenerator(f, xrange(10))
    print(list(g.result_set()))

    g = MyGenerator(f, xrange(10))
    new_g = myslice(g, 0, None, 2)
    print(list(new_g.result_set()))

    that returns:

    eval: 0
    eval: 1
    eval: 2
    eval: 3
    eval: 4
    eval: 5
    eval: 6
    eval: 7
    eval: 8
    eval: 9
    [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
    eval: 0
    eval: 2
    eval: 4
    eval: 6
    eval: 8
    [0, 2, 4, 6, 8]

    I don't see why a Sequence is needed to support the indexing; what is
    needed is some way to separate the base components of the generator
    (function, indexing_set).

    >> Yes, it looks like lazy evaluation, but I don't see why there isn't
    >> better control over the iterable associated with a generator, even
    >> in Python, which is a strict language
    >
    > You can control the laziness by making it explicitly lazy:
    >
    > from functools import partial
    > def f(x):
    >     print("eval: %d" % x)
    >     return x
    >
    > X = range(10)
    > g = (partial(f, x) for x in X)
    >
    > print(list(x() for x in islice(g, 0, None, 2)))
    > # # or without partial (binding x early via a default argument):
    > # g = ((lambda x=x: f(x)) for x in X)
    > # print(list(f() for f in islice(g, 0, None, 2)))
    >


    I still have the problem that I'm not the one defining the original
    generator: my function receives an already defined generator.

    > In a default-strict language, you have to explicitly say if you want lazy
    > execution.
    >
    >> What I want to do is a function that receives any kind of
    >> generator, executes it on several cores (after a fork), and returns
    >> the data; so I can't slice the set X before creating the generator.

    >
    > beware that a generator's contract is to return a valid iterator
    > *once* only. You can use itertools.tee() to create more iterators,
    > but tee builds a list of the results internally.


    Oh, yes, I used tee first, but then I noticed that I wasn't using the
    same iterator in the same process: once the fork is made I can use the
    initial generator in the different processes without that problem, so
    tee is not necessary in this case.
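
    A tiny demonstration of why the fork makes tee unnecessary (a sketch;
    POSIX-only, since it relies on fork copying the suspended frame):

    import os

    g = (x * x for x in range(6))
    next(g)                        # advance in the parent first
    if os.fork() == 0:
        print('child: ', list(g))  # child's own copy: [1, 4, 9, 16, 25]
        os._exit(0)
    os.wait()
    print('parent:', list(g))      # parent's copy too: [1, 4, 9, 16, 25]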

    > --
    > http://mail.python.org/mailman/listinfo/python-list
    >




    --
    Jorge Eduardo Cardona

    jorgeecardona.blogspot.com
    ------------------------------------------------
    Linux registered user #391186
    Registered machine #291871
    ------------------------------------------------
     
    Jorge Cardona, Dec 8, 2009
    #7
  8. Lie Ryan Guest

    On 12/9/2009 3:52 AM, Jorge Cardona wrote:
    > 2009/12/8 Lie Ryan<>:
    >> First, I apologize for rearranging your message out of order.
    >>
    >> Theoretically yes, but the semantics of generators in Python is
    >> that they work on an Iterable (i.e. objects that have __iter__),
    >> not a Sequence (i.e. objects that have __getitem__). [...]
    >>
    >> The lazy semantics would be hard to fit into the current generator
    >> model without changing the semantics of generators to require a
    >> Sequence that supports indexing.
    >>
    >
    > Why?
    >
    > The goal is to add a formal way to separate the transformations of a
    > generator into those that act on the indexing set and those that act
    > on the result set.


    <snip code>

    Well, that's just a pretty convoluted way to reverse the order of the
    operations (hey, why not reverse it altogether?). Nevertheless, it
    still requires a semantic change in that all generators must be able
    to produce the underlying stream, which is not always the case. Take
    this small example:

    def foo(x):
        i = 0
        while True:
            yield i
            i += 1

    What would you propose the .indexing_set to be? Surely not the same as
    .result_set? How would you propose to skip over and islice such a
    generator without executing the in-betweens?
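
    A concrete instance of the trouble (an illustrative example): for a
    stateful generator, skipping items of the underlying iterable changes
    the answers, not just the cost, because each yielded value depends on
    every input consumed so far:

    from itertools import islice

    def running_sum(xs):
        total = 0
        for x in xs:          # xs is the ".0" the frame hack would skip
            total += x
            yield total

    print(list(islice(running_sum(range(10)), 0, None, 2)))
    # stock islice: [0, 3, 10, 21, 36]; with skipped inputs the sums
    # themselves would come out different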

    >> [snip: the explicit-laziness example with partial]

    >
    > I still have the problem that I'm not the one defining the original
    > generator: my function receives an already defined generator.
    >
    >> In a default-strict language, you have to explicitly say if you
    >> want lazy execution.
    >>
    >>> What I want to do is a function that receives any kind of
    >>> generator, executes it on several cores (after a fork), and
    >>> returns the data; so I can't slice the set X before creating the
    >>> generator.
    >>
    >> beware that a generator's contract is to return a valid iterator
    >> *once* only. You can use itertools.tee() to create more iterators,
    >> but tee builds a list of the results internally.
    >
    > Oh, yes, I used tee first, but then I noticed that I wasn't using
    > the same iterator in the same process: once the fork is made I can
    > use the initial generator in the different processes without that
    > problem, so tee is not necessary in this case.


    You would have to change tee as well:

    >>> import itertools
    >>> def foo(x):
    ...     print('eval: %s' % x)
    ...     return x + 1
    ...
    >>> l = [1, 2, 3, 4, 5, 6, 7]
    >>> it = iter(l)
    >>> it = (foo(x) for x in it)
    >>> a, b = itertools.tee(it)
    >>> # now on to you
    >>> it_a = itertools.islice(a, 0, None, 2)
    >>> it_b = itertools.islice(b, 1, None, 2)
    >>> next(it_b)
    eval: 1
    eval: 2
    3
    >>> next(it_b)
    eval: 3
    eval: 4
    5
    >>> next(it_b)
    eval: 5
    eval: 6
    7
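
    The buffering that forces those evaluations is visible in a rough
    pure-Python equivalent of tee (along the lines of the sketch in the
    itertools documentation):

    import collections

    def tee(iterable, n=2):
        it = iter(iterable)
        deques = [collections.deque() for _ in range(n)]
        def gen(mydeque):
            while True:
                if not mydeque:              # local buffer is empty:
                    try:
                        newval = next(it)    # evaluate the shared source
                    except StopIteration:
                        return
                    for d in deques:         # and feed every buffer
                        d.append(newval)
                yield mydeque.popleft()
        return tuple(gen(d) for d in deques)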
     
    Lie Ryan, Dec 9, 2009
    #8
