Python 3000 idea -- + on iterables -> itertools.chain

Discussion in 'Python' started by John Reese, Nov 13, 2006.

  1. John Reese

    John Reese Guest

    It seems like it would be clear and mostly backwards compatible if the
    + operator on any iterables created a new iterable that iterated
    through first its left operand and then its right, in the style of
    itertools.chain. This would allow summation of generator expressions,
    among other things, to have the obvious meaning.

    Any thoughts? Has this been discussed before? I didn't see it
    mentioned in PEP 3100.

    The exception to the compatibility argument is of course those
    iterables for which + is already defined, like tuples and lists, for
    that set of code that assumes that the result is of that same type,
    explicitly or implicitly by calling len or indexing or whathaveyou.
    In those cases, you could call tuple or list on the result. There are
    any number of other things in Python 3000 switching from lists to
    one-at-a-time iterators, like dict.items(), so presumably this form of
    incompatibility isn't a showstopper.
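
    A minimal sketch of the behaviour being proposed, written for Python 3
    as it exists today: generator expressions do not support +, and
    itertools.chain is the existing spelling of the same idea. The names
    evens, odds and combined are illustrative only.

```python
from itertools import chain

# Two generator expressions; neither supports the + operator today.
evens = (n for n in range(0, 6, 2))
odds = (n for n in range(1, 6, 2))

try:
    evens + odds
    plus_supported = True
except TypeError:
    plus_supported = False

# chain() yields everything from the left operand, then the right one,
# which is the meaning the post suggests giving to +.
combined = list(chain(evens, odds))

print(plus_supported)  # False
print(combined)        # [0, 2, 4, 1, 3, 5]
```

    Note that the result of chain is a one-shot iterator, so code that
    relies on len or indexing would need list(...) or tuple(...) around it,
    as the post says.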
     
    John Reese, Nov 13, 2006
    #1

  2. John Reese wrote:

    > It seems like it would be clear and mostly backwards compatible if the
    > + operator on any iterables created a new iterable that iterated
    > through first its left operand and then its right, in the style of
    > itertools.chain.


    you do know that "iterable" is an informal interface, right? to what
    class would you add this operation?

    </F>
     
    Fredrik Lundh, Nov 13, 2006
    #2

  3. Fredrik Lundh wrote:

    > John Reese wrote:
    >
    > > It seems like it would be clear and mostly backwards compatible if the
    > > + operator on any iterables created a new iterable that iterated
    > > through first its left operand and then its right, in the style of
    > > itertools.chain.

    >
    > you do know that "iterable" is an informal interface, right? to what
    > class would you add this operation?
    >
    > </F>


    The base object class would be one candidate, similarly to the way
    __nonzero__ is defined to use __len__, or __contains__ to use __iter__.

    Alternatively, iter() could be a wrapper type (or perhaps mixin)
    instead of a function, something like:

    from itertools import chain, tee, islice

    import __builtin__
    _builtin_iter = __builtin__.iter

    class iter(object):

        def __init__(self, iterable):
            self._it = _builtin_iter(iterable)

        def __iter__(self):
            return self

        def next(self):
            return self._it.next()

        def __getitem__(self, index):
            if isinstance(index, int):
                try:
                    return islice(self._it, index, index+1).next()
                except StopIteration:
                    raise IndexError('Index %d out of range' % index)
            else:
                start, stop, step = index.start, index.stop, index.step
                if start is None: start = 0
                if step is None: step = 1
                return islice(self._it, start, stop, step)

        def __add__(self, other):
            return chain(self._it, other)

        def __radd__(self, other):
            return chain(other, self._it)

        def __mul__(self, num):
            return chain(*tee(self._it, num))

        __rmul__ = __mul__

    __builtin__.iter = iter


    if __name__ == '__main__':
        def irange(*args):
            return iter(xrange(*args))

        assert list(irange(5)[:3]) == range(5)[:3]
        assert list(irange(5)[3:]) == range(5)[3:]
        assert list(irange(5)[1:3]) == range(5)[1:3]
        assert list(irange(5)[3:1]) == range(5)[3:1]
        assert list(irange(5)[:]) == range(5)[:]
        assert irange(5)[3] == range(5)[3]

        s = range(5) + range(7,9)
        assert list(irange(5) + irange(7,9)) == s
        assert list(irange(5) + range(7,9)) == s
        assert list(range(5) + irange(7,9)) == s

        s = range(5) * 3
        assert list(irange(5) * 3) == s
        assert list(3 * irange(5)) == s
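
    The code above targets Python 2 (__builtin__, xrange, the next
    method). A rough sketch of the same wrapper idea for Python 3 follows.
    The class name ChainableIter is hypothetical, and unlike the original
    it does not replace the built-in iter; it is an illustration under
    those assumptions, not a drop-in port.

```python
from itertools import chain, islice, tee


class ChainableIter:
    """Hypothetical Python 3 wrapper adding +, * and indexing to an iterator."""

    def __init__(self, iterable):
        self._it = iter(iterable)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._it)

    def __getitem__(self, index):
        if isinstance(index, int):
            # Consumes the underlying iterator up to and including `index`.
            try:
                return next(islice(self._it, index, index + 1))
            except StopIteration:
                raise IndexError("index %d out of range" % index) from None
        # islice accepts None for start, stop and step.
        return islice(self._it, index.start, index.stop, index.step)

    def __add__(self, other):
        return chain(self._it, other)

    def __radd__(self, other):
        return chain(other, self._it)

    def __mul__(self, num):
        # tee() buffers items so the sequence can be replayed num times.
        return chain(*tee(self._it, num))

    __rmul__ = __mul__


def irange(*args):
    return ChainableIter(range(*args))


print(list(irange(5) + irange(7, 9)))  # [0, 1, 2, 3, 4, 7, 8]
```

    As in the original, the results of + and * are plain itertools
    objects rather than wrapper instances, so operators cannot be applied
    to them again without re-wrapping.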


    George
     
    George Sakkis, Nov 13, 2006
    #3
  4. George Sakkis wrote:

    > The base object class would be one candidate, similarly to the way
    > __nonzero__ is defined to use __len__, or __contains__ to use __iter__.
    >
    > Alternatively, iter() could be a wrapper type (or perhaps mixin)
    > instead of a function, something like:


    so you're proposing to either make *all* objects respond to "+", or
    introduce limited *iterator* algebra.

    not sure how that matches the OP's wish for "mostly backwards
    compatible" support for *iterable* algebra, really...

    (iirc, GvR has shot down a few earlier "let's provide sugar for iter-
    tools" proposals. no time to dig up the links right now, but it's in
    the python-dev archives, somewhere...)

    </F>
     
    Fredrik Lundh, Nov 13, 2006
    #4
  5. Georg Brandl

    Georg Brandl Guest

    George Sakkis wrote:
    > Fredrik Lundh wrote:
    >
    >> John Reese wrote:
    >>
    >> > It seems like it would be clear and mostly backwards compatible if the
    >> > + operator on any iterables created a new iterable that iterated
    >> > through first its left operand and then its right, in the style of
    >> > itertools.chain.

    >>
    >> you do know that "iterable" is an informal interface, right? to what
    >> class would you add this operation?
    >>
    >> </F>

    >
    > The base object class would be one candidate, similarly to the way
    > __nonzero__ is defined to use __len__, or __contains__ to use __iter__.


    What has a better chance of success in my eyes is an extension to yield
    all items from an iterable without using an explicit for loop: instead of

    for item in iterable:
        yield item

    you could write

    yield from iterable

    or

    yield *iterable

    etc.
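
    For reference, the first spelling was later adopted: Python 3.3 added
    "yield from" (PEP 380). A small sketch comparing the two forms; the
    function names explicit_loop, delegating and chained are made up for
    illustration.

```python
def explicit_loop(iterable):
    # The for-loop form quoted in the post.
    for item in iterable:
        yield item


def delegating(iterable):
    # The "yield from" form, valid in Python 3.3 and later.
    yield from iterable


def chained(*iterables):
    # Chaining several iterables with yield from, similar in effect
    # to itertools.chain.
    for it in iterables:
        yield from it


print(list(delegating("ab")))             # ['a', 'b']
print(list(chained([1, 2], (3,), "x")))   # [1, 2, 3, 'x']
```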

    Georg
     
    Georg Brandl, Nov 13, 2006
    #5
  6. Fredrik Lundh wrote:

    > George Sakkis wrote:
    >
    > > The base object class would be one candidate, similarly to the way
    > > __nonzero__ is defined to use __len__, or __contains__ to use __iter__.
    > >
    > > Alternatively, iter() could be a wrapper type (or perhaps mixin)
    > > instead of a function, something like:

    >
    > so you're proposing to either make *all* objects respond to "+", or
    > introduce limited *iterator* algebra.


    If by 'respond to "+"' is implied that you can get a "TypeError:
    iterable argument required", as you get now for attempting "x in y" for
    non-iterable y, why not ? Although I like the iterator algebra idea
    better.

    > not sure how that matches the OP's wish for "mostly backwards
    > compatible" support for *iterable* algebra, really...


    Given the subject of the thread, backwards compatibility is not the
    main prerequisite. Besides, it's an *extension* idea; allow operations
    that were not allowed before, not the other way around or modifying
    existing semantics. Of course, programs that attempt forbidden
    expressions on purpose so that they can catch and handle the exception
    would break when suddenly no exception is raised, but I doubt there are
    many of those...

    George
     
    George Sakkis, Nov 13, 2006
    #6
  7. Carl Banks

    Carl Banks Guest

    Georg Brandl wrote:
    > What has a better chance of success in my eyes is an extension to yield
    > all items from an iterable without using an explicit for loop: instead of
    >
    > for item in iterable:
    >     yield item
    >
    > you could write
    >
    > yield from iterable
    >
    > or
    >
    > yield *iterable


    Since this is nothing but an alternate way to spell a very specific
    (and not-too-common) for loop, I expect this has zero chance of
    success.


    Carl Banks
     
    Carl Banks, Nov 13, 2006
    #7
  8. Carl Banks

    Carl Banks Guest

    George Sakkis wrote:
    > Fredrik Lundh wrote:
    >
    > > George Sakkis wrote:
    > >
    > > > The base object class would be one candidate, similarly to the way
    > > > __nonzero__ is defined to use __len__, or __contains__ to use __iter__.
    > > >
    > > > Alternatively, iter() could be a wrapper type (or perhaps mixin)
    > > > instead of a function, something like:

    > >
    > > so you're proposing to either make *all* objects respond to "+", or
    > > introduce limited *iterator* algebra.

    >
    > If by 'respond to "+"' is implied that you can get a "TypeError:
    > iterable argument required", as you get now for attempting "x in y" for
    > non-iterable y, why not ?


    Bad idea on many, many levels. Don't go there.


    > Although I like the iterator algebra idea
    > better.
    >
    > > not sure how that matches the OP's wish for "mostly backwards
    > > compatible" support for *iterable* algebra, really...

    >
    > Given the subject of the thread, backwards compatibility is not the
    > main prerequisite. Besides, it's an *extension* idea; allow operations
    > that were not allowed before, not the other way around or modifying
    > existing semantics.


    You missed the important word (in spite of Fredrik's emphasis):
    iterable. Your iter class solution only works for *iterators* (and not
    even all iterators); the OP wanted it to work for any *iterable*.

    "Iterator" and "iterable" are protocols. The only way to implement
    what the OP wanted is to change iterable protocol, which means changing
    the documentation to say that iterable objects must implement __add__
    and that it must chain the iterables, and updating all iterable types
    to do this. Besides the large amount of work that this will need,
    there are other problems.

    1. It increases the burden on third party iterable developers.
    Protocols should be kept as simple as possible for this reason.
    2. Many iterable types already implement __add__ (list, tuple, string),
    so this new requirement would complicate these guys a lot.

    > Of course, programs that attempt forbidden
    > expressions on purpose so that they can catch and handle the exception
    > would break when suddenly no exception is raised, but I doubt there are
    > many of those...


    3. While not breaking backwards compatibility in the strictest sense,
    the adverse effect on incorrect code shouldn't be brushed aside. It
    would be a bad thing if this incorrect code:

    a = ["hello"]
    b = "world"
    a+b

    suddenly started failing silently instead of raising an exception.
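
    To make the concern concrete, here is a small sketch of what happens
    today versus what a chaining + would produce. itertools.chain stands
    in for the proposed operator; that substitution is an assumption made
    for illustration.

```python
from itertools import chain

a = ["hello"]
b = "world"

# Today: list + str is a TypeError, so the mistake is caught early.
try:
    a + b
    raised = False
except TypeError:
    raised = True

# With chaining semantics the string would be iterated character by
# character and no error would be reported.
silent = list(chain(a, b))

print(raised)  # True
print(silent)  # ['hello', 'w', 'o', 'r', 'l', 'd']
```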


    Carl Banks
     
    Carl Banks, Nov 13, 2006
    #8
  9. Carl Banks wrote:

    > George Sakkis wrote:
    > > Fredrik Lundh wrote:
    > >
    > > > George Sakkis wrote:
    > > >
    > > > > The base object class would be one candidate, similarly to the way
    > > > > __nonzero__ is defined to use __len__, or __contains__ to use __iter__.
    > > > >
    > > > > Alternatively, iter() could be a wrapper type (or perhaps mixin)
    > > > > instead of a function, something like:
    > > >
    > > > so you're proposing to either make *all* objects respond to "+", or
    > > > introduce limited *iterator* algebra.

    > >
    > > If by 'respond to "+"' is implied that you can get a "TypeError:
    > > iterable argument required", as you get now for attempting "x in y" for
    > > non-iterable y, why not ?

    >
    > Bad idea on many, many levels. Don't go there.


    Do you also find the way "in" works today a bad idea ?

    > > Although I like the iterator algebra idea
    > > better.
    > >
    > > > not sure how that matches the OP's wish for "mostly backwards
    > > > compatible" support for *iterable* algebra, really...

    > >
    > > Given the subject of the thread, backwards compatibility is not the
    > > main prerequisite. Besides, it's an *extension* idea; allow operations
    > > that were not allowed before, not the other way around or modifying
    > > existing semantics.

    >
    > You missed the important word (in spite of Fredrik's emphasis):
    > iterable. Your iter class solution only works for *iterators* (and not
    > even all iterators); the OP wanted it to work for any *iterable*.


    I didn't miss the important word; I know the distinction between
    iterables and iterators. That's why I said I like the iterator algebra
    idea better (compared to extending the object class, which would
    effectively create an iterable algebra).

    > "Iterator" and "iterable" are protocols. The only way to implement
    > what the OP wanted is to change iterable protocol, which means changing
    > the documentation to say that iterable objects must implement __add__
    > and that it must chain the iterables, and updating all iterable types
    > to do this. Besides the large amount of work that this will need,
    > there are other problems.
    >
    > 1. It increases the burden on third party iterable developers.
    > Protocols should be kept as simple as possible for this reason.
    > 2. Many iterable types already implement __add__ (list, tuple, string),
    > so this new requirement would complicate these guys a lot.


    If __add__ was ever to be part of the *iterable* protocol, it would be
    silly to implement it for every new iterable type; the implementation
    would always be the same (i.e. chain(self,other)), so it should be put
    in a base class all iterables extend from. That would be either a
    mixin class, or object. This is parallel to how __contains__ is part of
    the sequence protocol, but if you (the 3rd party sequence developer)
    don't define one, a default __contains__ that relies on __getitem__ is
    created for you.
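
    A sketch of both points in Python 3: a mixin that supplies a shared
    __add__ built on chain, and the existing fallback for "in" when a
    class defines no __contains__ (the interpreter tries __iter__ first
    and then the old __getitem__ sequence protocol). The class names
    ChainMixin, Countdown and Squares are hypothetical.

```python
from itertools import chain


class ChainMixin:
    """Hypothetical mixin: one shared + implementation for iterables."""

    def __add__(self, other):
        return chain(self, other)

    def __radd__(self, other):
        return chain(other, self)


class Countdown(ChainMixin):
    """An iterable (not an iterator): each __iter__ call starts afresh."""

    def __init__(self, start):
        self.start = start

    def __iter__(self):
        return iter(range(self.start, 0, -1))


class Squares:
    """Defines only __getitem__; membership tests still work."""

    def __getitem__(self, i):
        if i >= 4:
            raise IndexError(i)
        return i * i


print(list(Countdown(3) + Countdown(2)))  # [3, 2, 1, 2, 1]
print(2 in Countdown(3))                  # True, via __iter__
print(9 in Squares())                     # True, via __getitem__
```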

    > > Of course, programs that attempt forbidden
    > > expressions on purpose so that they can catch and handle the exception
    > > would break when suddenly no exception is raised, but I doubt there are
    > > many of those...

    >
    > 3. While not breaking backwards compatibility in the strictest sense,
    > the adverse effect on incorrect code shouldn't be brushed aside. It
    > would be a bad thing if this incorrect code:
    >
    > a = ["hello"]
    > b = "world"
    > a+b
    >
    > suddenly started failing silently instead of raising an exception.


    That's a good example for why I prefer an iterator rather than an
    iterable algebra; the latter is too implicit as "a + b" doesn't call
    only __add__, but __iter__ as well. On the other hand, with a concrete
    iterator type "iter(a) + iter(b)" is not any more error-prone than
    'int(3) + int("2")' or 'str(3) + str("2")'.

    What's the objection to an *iterator* base type and the algebra it
    introduces explicitly ?

    George
     
    George Sakkis, Nov 13, 2006
    #9
  10. Georg Brandl

    Georg Brandl Guest

    Carl Banks wrote:
    > Georg Brandl wrote:
    >> What has a better chance of success in my eyes is an extension to yield
    >> all items from an iterable without using an explicit for loop: instead of
    >>
    >> for item in iterable:
    >>     yield item
    >>
    >> you could write
    >>
    >> yield from iterable
    >>
    >> or
    >>
    >> yield *iterable

    >
    > Since this is nothing but an alternate way to spell a very specific
    > (and not-too-common) for loop, I expect this has zero chance of
    > success.


    well, it could also be optimized internally, i.e. with a new opcode.

    Georg
     
    Georg Brandl, Nov 13, 2006
    #10
  11. Carl Banks

    Carl Banks Guest

    George Sakkis wrote:
    > Carl Banks wrote:
    > > George Sakkis wrote:
    > > > If by 'respond to "+"' is implied that you can get a "TypeError:
    > > > iterable argument required", as you get now for attempting "x in y" for
    > > > non-iterable y, why not ?

    > >
    > > Bad idea on many, many levels. Don't go there.

    >
    > Do you also find the way "in" works today a bad idea ?


    Augh. I don't like it much, but (assuming that there are good use
    cases for testing containment in iterables that don't define
    __contains__) it seems to be the best way to accomplish it for
    iterables in general. However, "in" isn't even comparable to "add"
    here.

    First of all, unlike "add", the nature of "in" more or less requires
    that the second operand is some kind of collection, so surprises are
    kept to a minimum. Second, testing containment is just a bit more
    important, and thus deserving of a special case, than chaining
    iterables.

    The problem is taking a very general, already highly overloaded
    operator +, and adding a special case to the interpreter for one of the
    least common uses. It's just a bad idea.


    > > 3. While not breaking backwards compatibility in the strictest sense,
    > > the adverse effect on incorrect code shouldn't be brushed aside. It
    > > would be a bad thing if this incorrect code:
    > >
    > > a = ["hello"]
    > > b = "world"
    > > a+b
    > >
    > > suddenly started failing silently instead of raising an exception.

    >
    > That's a good example for why I prefer an iterator rather than an
    > iterable algebra; the latter is too implicit as "a + b" doesn't call
    > only __add__, but __iter__ as well. On the other hand, with a concrete
    > iterator type "iter(a) + iter(b)" is not any more error-prone than
    > 'int(3) + int("2")' or 'str(3) + str("2")'.
    >
    > What's the objection to an *iterator* base type and the algebra it
    > introduces explicitly ?


    Well, it still makes it more work to implement iterator protocol, which
    is enough reason to make me -1 on it. Anyways, I don't think it's very
    useful to have it for iterators because most people write functions for
    iterables. You'd have to write "iter(a)+iter(b)" to chain two
    iterables, which pretty much undoes the main convenience of the +
    operator (i.e., brevity). But it isn't dangerous.


    Carl Banks
     
    Carl Banks, Nov 14, 2006
    #11
