Wishlist item: itertools.flatten

Discussion in 'Python' started by Ville Vainio, Mar 11, 2005.

  1. Ville Vainio

    Ville Vainio Guest

    For quick-and-dirty stuff, it's often convenient to flatten a sequence
    (which perl does, surprise surprise, by default):

    [1,2,[3,"hello",[[4]]]] ->

    [1, 2, 3, 'hello', 4]

    One such implementation is at

    http://aspn.activestate.com/ASPN/Mail/Message/python-tutor/2302348

    but something like this would be handy in itertools as well.

    It seems trivial, but I managed to screw up several times when trying
    to produce my own implementation (infinite recursion).
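
    (For illustration, a minimal sketch of the kind of recursive flatten meant
    here -- not the linked recipe -- with the string check that avoids the
    infinite recursion; a one-character string is itself iterable and yields
    itself forever otherwise:)

    def flatten(seq):
        for item in seq:
            if isinstance(item, basestring):
                yield item                  # strings are atoms
            else:
                try:
                    subiter = iter(item)
                except TypeError:
                    yield item              # not iterable: an atom
                else:
                    for sub in flatten(subiter):
                        yield sub

    >>> list(flatten([1, 2, [3, "hello", [[4]]]]))
    [1, 2, 3, 'hello', 4]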

    --
    Ville Vainio http://tinyurl.com/2prnb
    Ville Vainio, Mar 11, 2005
    #1

  2. On 11 Mar 2005 13:32:45 +0200, rumours say that Ville Vainio
    <> might have written:

    >For quick-and-dirty stuff, it's often convenient to flatten a sequence
    >(which perl does, surprise surprise, by default):
    >
    >[1,2,[3,"hello",[[4]]]] ->
    >
    >[1, 2, 3, 'hello', 4]
    >
    >One such implementation is at
    >
    >http://aspn.activestate.com/ASPN/Mail/Message/python-tutor/2302348
    >
    >but something like this would be handy in itertools as well.
    >
    >It seems trivial, but I managed to screw up several times when trying
    >to produce my own implementation (infinite recursion).


    See Python Library Reference, "5.16.3 Recipes". Now that all and any (also
    presented as recipes there) are considered to be included, perhaps flatten gets
    a chance too.

    This is just a personal opinion, but I detest restraints on library (itertools
    module in this case) expansion when talking about such useful *building blocks*.
    What happened to "batteries included"?
    --
    TZOTZIOY, I speak England very best.
    "Be strict when sending and tolerant when receiving." (from RFC1958)
    I really should keep that in mind when talking with people, actually...
    Christos TZOTZIOY Georgiou, Mar 11, 2005
    #2

  3. Ville Vainio

    Ville Vainio Guest

    >>>>> "Christos" == TZOTZIOY <Christos> writes:

    >> For quick-and-dirty stuff, it's often convenient to flatten a sequence
    >> (which perl does, surprise surprise, by default):
    >>
    >> [1,2,[3,"hello",[[4]]]] ->
    >>
    >> [1, 2, 3, 'hello', 4]


    Christos> See Python Library Reference, "5.16.3 Recipes". Now
    Christos> that all and any (also

    The recipe is:

    def flatten(listOfLists):
        return list(chain(*listOfLists))

    That one is trivial, because it only flattens one level. The
    flattening I'm talking about involves flattening to arbitrary depth to
    get a single sequence of "atoms". The itertools implementation might
    also be able to avoid recursion by using a stack.
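
    (For illustration -- assuming "from itertools import chain" and the recipe
    above -- only one level of nesting is removed:)

    >>> flatten([[1, 2], [3, ["hello", [[4]]]]])
    [1, 2, 3, ['hello', [[4]]]]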

    Christos> This is just a personal opinion, but I detest restraints
    Christos> on library (itertools module in this case) expansion
    Christos> when talking about such useful *building blocks*.

    Yeah - esp. in the case of flattening. If it was deemed useful enough
    to be the default behavior in perl (which is admittedly braindamaged),
    it should surely warrant being included as a single function in the
    stdlib.

    --
    Ville Vainio http://tinyurl.com/2prnb
    Ville Vainio, Mar 11, 2005
    #3
  4. On 12 Mar 2005 00:44:39 +0200, rumours say that Ville Vainio
    <> might have written:

    > Christos> This is just a personal opinion, but I detest restraints
    > Christos> on library (itertools module in this case) expansion
    > Christos> when talking about such useful *building blocks*.
    >
    >Yeah - esp. in the case of flattening. If it was deemed useful enough
    >to be the default behavior in perl (which is admittedly braindamaged),
    >it should surely warrant being included as a single function in the
    >stdlib.


    Or a window function, which I have needed enough times on *separate* occasions
    to add it to one of my personal "stdlib" modules (hinting it could be part of
    itertools):

    window('hello', 2) => 'he', 'el', 'll', 'lo'
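
    (A sketch of one way to write it as a plain generator -- the name and the
    tuple-per-window behaviour are assumptions here; the join is only needed
    because the example input is a string:)

    from itertools import islice

    def window(iterable, n):
        it = iter(iterable)
        win = tuple(islice(it, n))
        if len(win) == n:
            yield win
        for item in it:
            win = win[1:] + (item,)
            yield win

    >>> [''.join(w) for w in window('hello', 2)]
    ['he', 'el', 'll', 'lo']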
    --
    TZOTZIOY, I speak England very best.
    "Be strict when sending and tolerant when receiving." (from RFC1958)
    I really should keep that in mind when talking with people, actually...
    Christos TZOTZIOY Georgiou, Mar 11, 2005
    #4
  5. Ville Vainio wrote:
    >>>>>>"Christos" == TZOTZIOY <Christos> writes:

    >
    >
    > >> For quick-and-dirty stuff, it's often convenient to flatten a sequence
    > >> (which perl does, surprise surprise, by default):
    > >>
    > >> [1,2,[3,"hello",[[4]]]] ->
    > >>
    > >> [1, 2, 3, 'hello', 4]

    >
    > Christos> See Python Library Reference, "5.16.3 Recipes". Now
    > Christos> that all and any (also
    >
    > The recipe is:
    >
    > def flatten(listOfLists):
    >     return list(chain(*listOfLists))
    >
    > That one is trivial, because it only flattens one level. The
    > flattening I'm talking about involves flattening to arbitrary depth to
    > get a single sequence of "atoms". The itertools implementation might
    > also be able to avoid recursion by using a stack.


    Here's a non-recursive implementation. There are lots around. One issue is
    specifying iterable types which should be atomic (notably strings). This uses a
    simple hardwired test for that.

    def flatten(iterable):
        stack = []
        iterator = iter(iterable)
        while 1:
            try:
                item = iterator.next()
                if hasattr(item,"__iter__"): # Avoids iterating over strings
                    stack.append(iterator)
                    iterator = iter(item)
                else:
                    yield item
            except StopIteration:
                if stack:
                    iterator = stack.pop()
                else:
                    raise StopIteration

    >>> list(flatten([[[1,2,[3,[4,5]],[6,7,[8,[9],(10,11,12)]]]]]))

    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12]
    >>> list(flatten(["abc","def",["ghi", "jkl"]]))

    ['abc', 'def', 'ghi', 'jkl']

    Michael
    Michael Spencer, Mar 11, 2005
    #5
  6. On Sat, Mar 12, 2005 at 12:55:38AM +0200, Christos TZOTZIOY Georgiou wrote:
    > On 12 Mar 2005 00:44:39 +0200, rumours say that Ville Vainio
    > <> might have written:
    >
    > > Christos> This is just a personal opinion, but I detest restraints
    > > Christos> on library (itertools module in this case) expansion
    > > Christos> when talking about such useful *building blocks*.
    > >
    > >Yeah - esp. in the case of flattening. If it was deemed useful enough
    > >to be the default behavior in perl (which is admittedly braindamaged),
    > >it should surely warrant being included as a single function in the
    > >stdlib.

    >
    > Or a window function, which I have needed enough times on *separate* occasions
    > to add it to one of my personal "stdlib" modules (hinting it could be part of
    > itertools):
    >
    > window('hello', 2) => 'he', 'el', 'll', 'lo'


    This was considered for 2.4, and I put in a patch if you want a C-level
    implementation[1]. The patch wasn't much faster than doing it in python (about
    two times faster, IIRC), the python version is trivial, and almost no one wanted it. So
    Hettinger and I agreed it should be left out.

    -Jack

    [1] http://sourceforge.net/tracker/?group_id=5470&atid=305470&func=detail&aid=756253
    Jack Diederich, Mar 12, 2005
    #6
  7. Ville Vainio

    Ville Vainio Guest

    >>>>> "Michael" == Michael Spencer <> writes:

    Michael> Here's a non-recursive implementation.

    Thanks.

    Michael> There are lots around.

    Yet another fact that suggests the inclusion in the stdlib would make sense
    ;-).

    --
    Ville Vainio http://tinyurl.com/2prnb
    Ville Vainio, Mar 12, 2005
    #7
  8. Ville Vainio

    Guest

    , Mar 12, 2005
    #8
  9. Re: Itertools wishlists

    [Ville Vainio]
    > >For quick-and-dirty stuff, it's often convenient to flatten a sequence
    > >(which perl does, surprise surprise, by default):

    . . .
    > >but something like this would be handy in itertools as well.
    > >
    > >It seems trivial, but I managed to screw up several times when trying
    > >to produce my own implementation (infinite recursion).


    [Christos TZOTZIOY Georgiou]
    > See Python Library Reference, "5.16.3 Recipes". Now that all and any (also
    > presented as recipes there) are considered to be included, perhaps flatten gets
    > a chance too.
    >
    > This is just a personal opinion, but I detest restraints on library (itertools
    > module in this case) expansion when talking about such useful *building blocks*.
    > What happened to "batteries included"?


    FWIW, requests for additions to the itertools module have not fallen on deaf
    ears. There are no arbitrary restraints on building out this module. Each
    request has gotten careful thought and a couple of them were accepted in Py2.4
    (itertools.tee and itertools.groupby). If you would like to know the reasoning
    behind any particular acceptance, rejection, or deferral, then just ask.

    itertools.window() with n=2 got rejected. Almost all proposed uses had better
    solutions (such as an accumulator variable or fibonacci sequence style logic:
    a, b = b, a+b). Writing it in C afforded only small speed advantage over a
    solution using izip() and tee().
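
    (The izip()/tee() solution referred to is essentially the pairwise()
    recipe from the itertools docs; roughly:)

    from itertools import izip, tee

    def pairwise(iterable):
        "s -> (s0, s1), (s1, s2), (s2, s3), ..."
        a, b = tee(iterable)
        try:
            b.next()
        except StopIteration:
            pass
        return izip(a, b)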

    itertools.window() with n>2 was also rejected. While use cases arise in markov
    chains, statistics (moving averages, etc), and cryptanalysis (trigraph
    analysis), there were almost always better solutions. window() spent most of
    its time creating new tuples and shifting each of the common elements by one
    position. A solution using collections.deque is generally superior because the
    popleft() and append() operations do not entail moving all the common elements.
    It was instructive to examine a use case with a large n, sliding window
    compression -- the Right Solution (tm) does *not* entail continuously shifting
    all of the data elements. IOW, providing a window() function in itertools is a
    mistake because it leads people away from better solutions.
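
    (A sketch of the deque-based approach described above -- the name is made
    up, and a real application would typically work with the deque in place
    rather than copying it out on every step:)

    from collections import deque
    from itertools import islice

    def window_by_deque(iterable, n):
        it = iter(iterable)
        win = deque(islice(it, n))
        if len(win) == n:
            yield tuple(win)
        for item in it:
            win.popleft()        # O(1): no shifting of the common elements
            win.append(item)
            yield tuple(win)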

    The jury is still out on flatten(). The principal implementation problem is not
    recursing into iterable objects that the caller wants to treat as atomic. While
    that can be done, there doesn't seem to be a clear winner among the solutions
    that have arisen. Also, the solutions to that problem make the resulting
    function more difficult to learn, remember, review, etc. The nature of
    flattening is such that a C implementation doesn't offer any special advantage
    over the various competing pure python versions. And, there is also the issue
    of use cases. It appears to be much more fun to toy around with developing
    flatten() recipes than it is to work on applications that require it. That is
    not to say that it doesn't come up or that it isn't helpful in Mathematica;
    however, it is somewhat rare and not always the right solution even when it
    could be used.

    itertools.queue() was rejected because it didn't fit naturally into
    applications -- you had to completely twist your logic around just to
    accommodate it. Besides, it is already simple to iterate over a list while
    appending items to it as needed.
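
    (The "iterate while appending" pattern mentioned, sketched on toy data; it
    happens to give a breadth-first flatten:)

    pending = [[1, [2, [3]]]]
    flat = []
    for item in pending:             # the for loop indexes the list,
        if isinstance(item, list):   # so items appended below are still seen
            pending.extend(item)
        else:
            flat.append(item)
    print flat                       # -> [1, 2, 3]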

    itertools.roundrobin() was too specialized (being the engine behind a
    multi-input task server) and that specialty often demanded more complex
    capabilities than offered by roundrobin. For many situations, collections.deque
    offers a better solution.

    itertools.accumulate() returned the successive steps of a reduce() operation.
    It too had precedents in APL and Mathematica. However, it loses its appeal with
    functions more complex than operator.add or operator.mul. The effort to write a
    more complex function is better spent writing a simple generator.
    Alternatively, many potential applications were better served by in-lining the
    function. The other issue was usability. accumulate() suffered from the same
    complexities as reduce() -- it took too much thought to read, write, and review
    (quick: which variable comes first, the cumulative value or the new data element;
    does it fold left or fold right; is there a default initial value; yada yada).
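
    (The "simple generator" alternative would be something along these lines;
    the name and defaults are guesses:)

    import operator

    def accumulate(iterable, func=operator.add):
        it = iter(iterable)
        try:
            total = it.next()
        except StopIteration:
            return
        yield total
        for item in it:
            total = func(total, item)
            yield total

    >>> list(accumulate([1, 2, 3, 4]))
    [1, 3, 6, 10]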

    itertools.multigen() transforms single shot generators into something
    re-iterable. The offered use cases were not compelling. The potential for
    misuse is high and the logic behind the idea doesn't appear to be
    self-consistent.

    itertools.remove_value() is like a lisp/scheme multirember. Yawn. Just use a
    genexp: (elem for elem in iterable if elem != value).

    The jury is still out on itertools.eq() which compares any two iterables for
    equality. Sometimes you want to compare [1,2,3] to (1,2,3) and consider only
    the contents of the container rather than the type of the container. It is
    somewhat handy and obvious; however, someone is bound to misuse it and apply it
    to arbitrarily ordered containers such as sets and dictionaries; apply it to
    infinite iterators; or just inadvertently exhaust an iterator prior to actually
    needing its contents.

    itertools.consume() would work like map() with no arguments and no return
    value (equivalent to list(it) where you throw away the result list). This is
    more clearly coded in pure python and it runs only slightly faster. Also, it is
    only useful with functions that have side-effects and that is at odds with the
    ideas of functional programming where itertools have their roots.
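
    (The pure-python coding alluded to is presumably no more than:)

    def consume(iterator):
        # exhaust an iterator purely for its side effects,
        # without building a throw-away result list
        for _ in iterator:
            pass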

    itertools.ilines() would iterate over a buffer and yield upon hitting a
    universal newline -- essentially this is an in-memory version of what the file
    iterator does with text files. This kind of operation is more appropriately
    added to StringIO.

    In addition to the above, people routinely request that all sorts of random
    ideas be put in itertools. "That shouldn't be a builtin; stick it in
    itertools." We likely need some other module for reduction functions like
    any(), all(), no(), quantify(), take(), etc. In general, the itertools module
    would be ill served by becoming a dumping ground.

    Another thought is that it takes some time and skill to learn to use the tools
    and how to combine them. The time and skill seem to rise exponentially with
    the number of tools in the module. So, it would be a mistake to add a few
    obscure, rarely used tools because that would impact the usability of the
    existing toolset.

    'nuff said, mister whatever happened to batteries included ;-)



    Raymond Hettinger


    P.S. It's not an accident that the recipes in the itertools docs are in a form
    that is directly cut and pastable into a working application.
    Raymond Hettinger, Mar 13, 2005
    #9
  10. Ville Vainio

    Guest

    Re: Itertools wishlists

    Thank you for your very good and interesting answer, Raymond. In the
    itertools library there are functions that are not really simple to
    use/remember, but a flatten() and a partition() can probably be easy to
    remember (your window() function is probably a sliding window, so it's
    not a partition(), I presume).


    >Also, the solutions to that problem make the resulting function more
    difficult to learn, remember, review, etc.<

    I agree, the basic flatten function can be easy (in Mathematica there
    is no difference between tuples and lists, and you cannot flatten a
    string), but it can become less simple as a useful implementation for
    Python. This was my suggestion for a possible flatten():

    flatten(sequence, level=-1, tuples=True, strings=False, safe=False)
    - tuples=True then it flattens tuples too.
    - strings=True then it flattens strings with len(s)>1 too.
    - safe=True then it checks (with something like an iterative isrecursive)
      for recursive references inside the sequence.
    - level allows to specify the mapping level:
      level=0 no flattening.
      level=1 the flattening is applied to the first level only.
      level=2 the flattening is applied to the first and second level only.
      level=m where m>=actual depth. This is as level=-1.
      Etc.
      And like in the indexing of lists:
      level=-1 or None (default) means the flattening is applied up to the leaves.
      level=-2 flattens up to the pre-leaves.
      Etc.


    >The nature of flattening is such that a C implementation doesn't offer
    any special advantage over the various competing pure python versions.<

    Even a well-tuned (non-recursive) standard Python version can be okay;
    it keeps people from designing lots of slower/wrong functions by
    themselves.


    >And, there is also the issue of use cases. It appears to be much more
    fun to toy around with developing flatten() recipes than it is to work
    on applications that require it.<

    It's not easy to define "require" because usually there are many ways
    to solve every problem.

    There are two situations that I've found can make use of the
    flatten/partition, but you can probably find better ways to do the same
    thing (and I can appreciate suggestions):

    1)
    The function f returns two values, but you need a flat list as result:
    def f(x): return x, x**2
    r = flatten( f(i) for i in range(10) )
    print r

    Alternative:
    def f(x): return x, x**2
    r = []
    for i in range(10): r.extend( f(i) )
    print r
    (You can also use two append() calls.)


    2)
    A file with columns of numbers separated by a space/tab:
    n n n n
    n n n n
    n n n n
    ....

    ll = open("namefile").read().splitlines()
    r = [map(float, l.split()) for l in ll]

    Alternative:
    ll = open("namefile").read().split()
    r = partition(map(float, ll), 4)

    This second version can be a little faster, but it has the disadvantage
    that the column number 4 is hard coded.
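
    (partition() is not defined in the post; presumably it groups a flat
    sequence into fixed-size rows, roughly the inverse of a one-level flatten.
    A sketch, with the name and signature taken from the example above:)

    def partition(seq, n):
        return [seq[i:i + n] for i in xrange(0, len(seq), n)]

    >>> partition([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], 3)
    [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]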


    >We likely need some other module for reduction functions like any(),
    all(), no(), quantify(), take(), etc.<

    Okay.


    Your itertools.consume() looks like the Mathematica Scan[] function.

    Bye,
    Bearophile

    [Remove HUGS if you want to mail me directly]
    , Mar 13, 2005
    #10
  11. Ville Vainio

    Terry Reedy Guest

    Re: Itertools wishlists

    "Raymond Hettinger" <> wrote in message
    news:YqRYd.1381$qN3.1209@trndny01...
    > FWIW, requests for additions to the itertools module have not fallen on
    > deaf ears. There are no arbitrary restraints on building out this module.
    > Each request has gotten careful thought and a couple of them were accepted
    > in Py2.4 (itertools.tee and itertools.groupby). If you would like to know
    > the reasoning behind any particular acceptance, rejection, or deferral,
    > then just ask.


    [nice rundown snipped]

    Is this list of reasonings permanently posted somewhere, like on the wiki?
    It's too good to just disappear.

    TJR
    Terry Reedy, Mar 13, 2005
    #11
  12. Re: Itertools wishlists

    [bearophile]
    > This was my suggestion for a possible flatten():
    >
    > flatten(sequence, level=-1, tuples=True, strings=False, safe=False)
    > - tuples=True then it flattens tuples too.
    > - strings=True then it flattens strings with len(s)>1 too.
    > - safe=True then it checks (with something like an iterative isrecursive)
    >   for recursive references inside the sequence.
    > - level allows to specify the mapping level:
    >   level=0 no flattening.
    >   level=1 the flattening is applied to the first level only.
    >   level=2 the flattening is applied to the first and second level only.
    >   level=m where m>=actual depth. This is as level=-1.
    >   Etc.
    >   And like in the indexing of lists:
    >   level=-1 or None (default) means the flattening is applied up to the leaves.
    >   level=-2 flattens up to the pre-leaves.
    >   Etc.


    My suggestion is to stop smoking Crack and check into rehab ;-)

    Each one of the options listed is a reason that flatten() shouldn't be an
    itertool. It fails tests of obviousness, learnability, complexity of
    implementation, and simplicity of API. The options also suggest that the
    abstraction is not as basic or universal as we would hope.



    > >And, there is also the issue of use cases. It appears to be much more
    > fun to toy around with developing flatten() recipes than it is to work
    > on applications that require it.<
    >
    > It's not easy to define "require" because usually there are many ways
    > to solve every problem.


    Perhaps "require" was the wrong word. The issue is that appear to be very few
    real situations where flatten() would be the tool of choice.




    > There are two situations that I've found can make use of the
    > flatten/partition, but you can probably find better ways to do the same
    > thing (and I can appreciate suggestions):
    >
    > 1)
    > The function f returns two values, but you need a flat list as result:
    > def f(x): return x, x**2
    > r = flatten( f(i) for i in range(10) )
    > print r


    This is not a good way to store the function's results. It unnecessarily throws
    away structure. Unless you are creating some type of serialization function, it
    is likely the wrong thing to do. Also, the example appears to be contrived and
    not representative of real code. If there is actually a need to treat multiple
    return values as being undifferentiated, then your alternative with extend() is
    the most appropriate solution.


    > 2)
    > A file with columns of numbers separated by a space/tab:

    .. . .
    > ll = open("namefile").read().splitlines()
    > r = [map(float, l.split()) for l in ll]


    If you need that to be flattened one level, it would have been better to do all
    the splits at once:

    # split on tabs, spaces, and newlines
    r = map(float, open('namefile').read().split())

    Generalizing the two results, it may be fair to say that the desire to flatten
    is a code smell indicating that structure is being unnecessarily destroyed or
    that earlier processing introduced unwanted structure. Let the data guide the
    programming.



    Raymond Hettinger
    Raymond Hettinger, Mar 14, 2005
    #12
  13. Ville Vainio

    Ville Vainio Guest

    Re: Itertools wishlists

    >>>>> "Raymond" == Raymond Hettinger <> writes:

    Raymond> Each one of the options listed is a reason that flatten()
    Raymond> shouldn't be an itertool. It fails tests of obviousness,
    Raymond> learnability, complexity of implementation, and
    Raymond> simplicity of API. The options also suggest that the
    Raymond> abstraction is not as basic or universal as we would
    Raymond> hope.

    A simpler API:

    def flatten(sequence, atomic_test = lambda o: isinstance(o,basestring)):
    """ don't recurse into iterables if atomic_test -> True """

    I believe speaking of the "levels" of flattening is contorted here.
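
    (Filled out, that API might look something like the following sketch,
    reusing the stack-based approach posted earlier in the thread:)

    def flatten(sequence, atomic_test=lambda o: isinstance(o, basestring)):
        """ don't recurse into iterables if atomic_test -> True """
        stack = []
        iterator = iter(sequence)
        while 1:
            try:
                item = iterator.next()
            except StopIteration:
                if not stack:
                    return
                iterator = stack.pop()
                continue
            if atomic_test(item):
                yield item
                continue
            try:
                subiter = iter(item)
            except TypeError:
                yield item           # not iterable at all: an atom
            else:
                stack.append(iterator)
                iterator = subiter

    >>> list(flatten([1, 2, [3, "hello", [[4]]]]))
    [1, 2, 3, 'hello', 4]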

    Raymond> Perhaps "require" was the wrong word. The issue is that
    Raymond> there appear to be very few real situations where flatten()
    Raymond> would be the tool of choice.

    Suppose that I get a very complex data structure involving lists of
    tuples of tuples [....] of strings. I just want to quickly search the
    sequence for valid file names, without going through elaborate
    unpacking. Then I just do

    files = (f for f in flatten(monster_data_struct) if os.path.isfile(str(f)))

    Yep, this is a real use case (ipython + some of my own data munging
    tools).

    Raymond> Generalizing the two results, it may be fair to say that
    Raymond> the desire to flatten is a code smell indicating that
    Raymond> structure is being unnecessarily destroyed or that
    Raymond> earlier processing introduced unwanted structure. Let
    Raymond> the data guide the programming.

    You are looking at the problem from a specific mindset, that of writing
    good clean pythonic code. flatten is for situations when you need an
    implementation 20 seconds ago (where someone might have recommended
    perl in the past, and which is a perfectly valid niche for Python as
    well).

    It's not a matter of life & death for me, obviously (it's in my own
    stdlib). I still can't see how its existence would make the rest of
    itertools magically harder to learn. When I come up with a problem
    where I imagine itertools might come in handy, I check the docs to see
    whether there is anything appropriate for the problem. I don't
    memorize all the functions, just the fact that such functions
    exist.

    Also, the following itertool functions are not very useful anymore,
    with the advent of genexps:

    ifilter(pred, seq) --> elements of seq where pred(elem) is True
    ifilterfalse(pred, seq) --> elements of seq where pred(elem) is False
    imap(fun, p, q, ...) --> fun(p0, q0), fun(p1, q1), ...
    starmap(fun, seq) --> fun(*seq[0]), fun(*seq[1]), ...

    I don't believe a genuinely useful 'flatten' would increase the
    cognitive load any more than these.

    --
    Ville Vainio http://tinyurl.com/2prnb
    Ville Vainio, Mar 14, 2005
    #13
  14. Michael Spencer wrote:
    > if hasattr(item,"__iter__"): # Avoids iterating over strings


    That's probably the cleanest way to avoid strings, but it's
    unfortunately not a good idea IMHO. Many objects (IndexedCatalog's
    Result objects are what I'm concerned about, but there are undoubtedly
    others) still rely entirely on the old-style sequence iteration protocol
    even though they're proper containers, so checking for __iter__ wouldn't
    work on them. Explicitly doing isinstance(item, basestring) is probably
    the best option until older objects move to the new iteration protocol.
    Leif K-Brooks, Mar 14, 2005
    #14
  15. Re: Itertools wishlists

    Ville Vainio wrote:
    > A simpler API:
    >
    > def flatten(sequence, atomic_test = lambda o: isinstance(o,basestring)):
    > """ don't recurse into iterables if atomic_test -> True """


    Yes, this is also the API I would have suggested. Simple, but flexible
    enough to handle the odd cases with the occasional user-defined iterable
    non-containers.

    STeVe
    Steven Bethard, Mar 14, 2005
    #15
  16. Leif K-Brooks wrote:
    > Michael Spencer wrote:
    >
    >> if hasattr(item,"__iter__"): # Avoids iterating over strings

    >
    >
    > That's probably the cleanest way to avoid strings, but it's
    > unfortunately not a good idea IMHO. Many objects (IndexedCatalog's
    > Result objects are what I'm concerned about, but there are undoubtedly
    > others) still rely entirely on the old-style sequence iteration protocol
    > even though they're proper containers, so checking for __iter__ wouldn't
    > work on them. Explicitly doing isinstance(item, basestring) is probably
    > the best option until older objects move to the new iteration protocol.

    Sure, the preceding part of my post:

    > One issue is specifying iterable types which should be atomic (notably strings). This uses a simple hardwired test for that.


    ....makes it clear that 'hasattr(item,"__iter__")' is not claiming to be a
    general atomicity test.

    Using something like:
        if not isinstance(item, basestring):
            try:
                iterator = iter(item)
                etc...

    ....defines a rule that works for your case, but it also means that any
    non-basestring class that implements __getitem__ gets flattened. That may be
    what you want...or not.

    I think flatten is a special-case function, and it is therefore most practical
    to define its atomicity rules on a case-by-case basis.

    Michael
    Michael Spencer, Mar 14, 2005
    #16
  17. Ville Vainio

    Guest

    Re: Itertools wishlists

    Thank you for your answers, Raymond Hettinger.

    >The options also suggest that the abstraction is not as basic or
    universal as we would hope.<

    I don't understand, but this is normal.


    > ll = open("namefile").read().split()
    > r = partition(map(float, ll), 4)


    >If you need that to be flattened one level, it would have been better
    to do all the splits at once:<

    Uhm, my purpose was the opposite, as you can see from the alternative
    version. It was an example of using partition(), which is kind of the
    opposite of flatten().


    >Generalizing the two results, it may be fair to say that the desire to
    flatten is a code smell indicating that structure is being
    unnecessarily destroyed or that earlier processing introduced unwanted
    structure.<

    Probably for my programming style flatten() is useful, but this is a
    very subjective thing (I've already coded my own flatten, which I use
    not very frequently, without the level parameter inspired by a similar
    Mathematica one). I've also implemented a few other little things from
    the Delphi and Logo languages, for my own use in Python.
    For me it's not easy to improve/extend Python; maybe I have to stop
    trying, and just use this language as you people give it to us...

    Thank you, a bearish hug,
    Bearophile
    , Mar 14, 2005
    #17
  18. Re: Itertools wishlists

    On Sun, 13 Mar 2005 06:52:40 GMT, rumours say that "Raymond Hettinger"
    <> might have written:

    [snip of lots of stuff]

    >itertools.window() with n=2 got rejected. Almost all proposed uses had better
    >solutions (such as an accumulator variable or fibonacci sequence style logic:
    >a, b = b, a+b). Writing it in C afforded only small speed advantage over a
    >solution using izip() and tee().


    Speed or C implementation was not my objection. I don't care if C itertools
    gets renamed to _itertools and some Python itertools imports _itertools first
    thing a la heapq, and then the most probably useful doc recipes get defined in
    Python; in fact, I would like that. Do you think that window is rarely used?
    You didn't say that in your post, you just argued against implementing it in C.
    I agree. Don't implement it in C. Implement it as in the recipe. Just make it
    readily available. I volunteer for a patch if we agree on it[1].

    For example, if speed is our main concern, why is there PEP 309 for partial
    function application as a class? The tools are already there, and
    new.instancemethod is our friend. In the case of binding only the first
    argument, which is the most common, calling the resulting object is 2% faster
    than calling the function itself (at least on my laptop). Of course, PEP 309 is
    not about speed; it's about availability of /correct/ and /useful/
    functionality. Kind of like recipes in the itertools documentation.
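
    (For reference, the new.instancemethod trick alluded to -- binding the
    first argument by creating a bound method -- is a sketch like this:)

    import new

    def add(a, b):
        return a + b

    add3 = new.instancemethod(add, 3, object)   # binds a=3
    print add3(4)                               # -> 7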

    >'nuff said, mister whatever happened to batteries included ;-)


    Cool extensive summarising reply --however, how relevant is it? None of the
    itertools.(queue|roundrobin|accumulate|multi_gen|remove_value|eq|consume|ilines)
    that you reference exists as a recipe in the docs.

    Did I make you believe I cared about the fate of any function judged unworthy
    even for the documentation? Because, if it is my fault, let me correct that: I
    am talking about the recipes in the documentation (those found worthy enough to
    be included in the docs, but not found worthy enough to be included in the lib),
    and specifically for window and flatten: why aren't they considered as
    batteries? About flatten, you say things are still warm. About window, you say
    it's not really efficient, use izip and tee, or use a deque for n>2. Ok. I
    never spoke about speed. I spoke about tool availability.

    IIRC you were cooled down in Python-Dev from adding more functions into
    itertools; it was kind of a surprise for me now to see /you/ of all people
    defending this choice. FWIW, I like speed (who doesn't if nothing else is
    affected?); I have an unpolished CPython pcode decompiler / compiler that offers
    a decorator for function-level optimising and a function that modifies *.py[co]
    files replacing patterns like the one you yourself lately worried about [2],
    which module I leave as-is partly waiting the AST branch to be finished and
    partly because now and then you implement some part of it in the peephole
    optimiser. So no objections about speed, just note that I didn't mention it in
    my previous post.

    AFAIAC, in my own stdlib-supporting modules what I need is readily available;
    possibly I'm just whining that putting recipes in the docs as an 'extended
    toolset' instead of in a module is a joking compromise...

    >P.S. It's not an accident that the recipes in the itertools docs are in a form
    >that is directly cut and pastable into a working application.


    [1] Like I said, I volunteer to copy-paste from the docs into a Python module
    for the stdlib. Thanks for the eye-opening hint!-)

    [2] STORE_<sth> x/LOAD_<sth> x into DUP/STORE_<sth> x, local binding of multiple
    LOAD_sth x/LOAD_ATTR/LOAD_ATTR where x is an imported module into a LOAD_FAST or
    a LOAD_CONST based on whether it's the modifier or the decorator,
    LOAD_CONST*/BUILD_LIST into LOAD_CONST a_tuple/UNPACK_SEQUENCE/BUILD_TUPLE,
    "a,b,c=d,e,f" reordering and dropping of the tuple building/unpacking (which is
    a case already improved in C) etc.
    --
    TZOTZIOY, I speak England very best.
    "Be strict when sending and tolerant when receiving." (from RFC1958)
    I really should keep that in mind when talking with people, actually...
    Christos TZOTZIOY Georgiou, Mar 16, 2005
    #18
  19. Re: Itertools wishlists

    > >itertools.window() with n=2 got rejected. Almost all proposed uses had better
    > >solutions (such as an accumulator variable or fibonacci sequence style logic:
    > >a, b = b, a+b). Writing it in C afforded only small speed advantage over a
    > >solution using izip() and tee().


    [Christos TZOTZIOY Georgiou]
    > Speed or C implementation was not my objection.

    . . .
    > Just make it
    > readily available. I volunteer for a patch if we agree on it[1].


    No thanks. A patch was developed, discussed, and rejected long ago. The main
    reason was that existing approaches were often better (at least in the context
    of the use cases that we reviewed). Performance was mentioned because that is
    sometimes used to justify a new tool. In this case, that justification was not
    strong.


    > None of the
    > itertools.(queue|roundrobin|accumulate|multi_gen|remove_value|eq|consume|ilines)
    > that you reference exists as a recipe in the docs.


    FWIW, the recipe for roundrobin is at:
    http://docs.python.org/lib/deque-recipes.html .

    The only other one worth further discussion is iterequals(). I held off
    publishing that one because the opportunities for error were too great (see the
    previous post for details).



    > Did I make you believe I cared about the fate of any function judged unworthy
    > even for the documentation?


    No. My note was mainly for the benefit of those who had an interest in what
    type of ideas had been discussed and the reasoning behind their
    inclusion/exclusion. It needed to be documented somewhere and the newsgroup
    discussion on a couple of proposals provided an opportunity to put those notes
    on record.



    > I'm just whining that putting recipes in the docs as an 'extended
    > toolset' instead of in a module is a joking compromise...


    Not really. The recipes have several uses and none of them are compromises.

    Firstly, they serve as a proving ground which helps inform further development.
    As a recipe, they can be readily altered, improved, or removed without breaking
    anything. However, once I include them as part of the module, the API gets
    frozen, as do any accidents of implementation. Once released, the process of
    making repairs or alterations becomes slow and painful. Over time, some of the
    recipes will disappear, some will evolve, and some will work their way into a
    module (witness any() and all() as recent examples).

    By way of comparison, consider the evolution of set()/frozenset(), which went
    through stages as recipes, as a PEP, then as a Python module, and finally as
    C-coded built-ins. That multi-stage process was deliberate and resulted in the
    final version being excellent. Similarly, many new decorators are going to
    start their lives as wiki entries or recipes. Ultimately, some will make it
    into the standard library. It would be a mistake to make that transition too
    quickly.

    The other purpose of the itertool recipes is to serve as a teaching tool showing
    how to combine the tools and how to integrate them with other Python code. IMO,
    most of the recipes are more useful in this capacity than as immediate solutions
    to particular problems.



    Raymond Hettinger
    Raymond Hettinger, Mar 17, 2005
    #19
  20. Re: Itertools wishlists

    On Thu, 17 Mar 2005 06:09:44 GMT, rumours say that "Raymond Hettinger"
    <> might have written:

    [snip]

    >> Did I make you believe I cared about the fate of any function judged unworthy
    >> even for the documentation?

    >
    >No. My note was mainly for the benefit of those who had an interest in what
    >type of ideas had been discussed and the reasoning behind their
    >inclusion/exclusion. It needed to be documented somewhere and the newsgroup
    >discussion on a couple of proposals provided an opportunity to put those notes
    >on record.


    In that case, I thank you too (like Terry) for taking the trouble to write down
    those notes.

    >> I'm just whining that putting recipes in the docs as an 'extended
    >> toolset' instead of in a module is a joking compromise...

    >
    >Not really. The recipes have several uses and none of them are compromises.


    Of course they aren't compromises. Who said they were? The subsentence "is a
    joking compromise", to which your "Not really" applies, has "putting" as
    subject, not "uses of recipes". It's possible that my lack of fluency in the
    English language confused you.

    [snip]

    >By way of comparision, consider the evolution of set()/frozenset() which went
    >through stages as recipes, as a PEP, then as Python module, and finally as C
    >coded built-ins. That multi-stage process was deliberate and resulted in the
    >final version being excellent. Similarly, many new decorators are going to
    >start their lives as wiki entries or recipes. Ultimately, some will make it
    >into the standard library. It would be a mistake to make that transition too
    >quickly.


    That (long cycle/excellent results) is surely true. And sets are *very* useful
    as the majority would agree. Thanks about sets, too.

    >The other purpose of the itertool recipes is to serve as a teaching tool showing
    >how to combine the tools and how to integrate them with other Python code. IMO,
    >most of the recipes are more useful in this capacity than as immediate solutions
    >to particular problems.


    Well, I have to respect your opinion and so I drop the subject... but with my
    dying breath, re:

    >to serve as a teaching tool showing
    >how to combine the tools and how to integrate them with other Python code.


    , I cry "that's why we hint at people to /read/ the /source/ of the standard
    library..." :)

    Cheers.
    --
    TZOTZIOY, I speak England very best.
    "Be strict when sending and tolerant when receiving." (from RFC1958)
    I really should keep that in mind when talking with people, actually...
    Christos TZOTZIOY Georgiou, Mar 17, 2005
    #20