Flattening lists

Discussion in 'Python' started by mk, Feb 5, 2009.

  1. mk

    mk Guest

    Hello everybody,

    Any better solution than this?

    def flatten(x):
    res = []
    for el in x:
    if isinstance(el,list):
    res.extend(flatten(el))
    else:
    res.append(el)
    return res

    a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
    print flatten(a)


    [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

    Regards,
    mk
    mk, Feb 5, 2009
    #1
    1. Advertising

  2. On Feb 5, 1:17 pm, mk <> wrote:
    > Hello everybody,
    >
    > Any better solution than this?
    >
    > def flatten(x):


    Just out of interest, how often do people really need
    such a recursive flatten, as opposed to a single-level
    version?

    I often find myself needing a 'concat' method that
    turns a list of lists (or iterable of iterables) into
    a single list; itertools.chain does this quite nicely.
    But I don't think I've ever encountered a need for the
    full recursive version.

    Mark
    Mark Dickinson, Feb 5, 2009
    #2
    1. Advertising

  3. On Feb 5, 2:17 pm, mk <> wrote:
    > Hello everybody,
    >
    > Any better solution than this?
    >
    > def flatten(x):
    >      res = []
    >      for el in x:
    >          if isinstance(el,list):
    >              res.extend(flatten(el))
    >          else:
    >              res.append(el)
    >      return res
    >
    > a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
    > print flatten(a)
    >
    > [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    >
    > Regards,
    > mk


    Looks fine to me. In some situations you may also use hasattr(el,
    '__iter__') instead of isinstance(el, list) (it depends if you want to
    flatten generic iterables or only lists).
    Michele Simionato, Feb 5, 2009
    #3
  4. mk

    mk Guest

    Mark Dickinson wrote:
    > I often find myself needing a 'concat' method that
    > turns a list of lists (or iterable of iterables) into
    > a single list; itertools.chain does this quite nicely.
    > But I don't think I've ever encountered a need for the
    > full recursive version.



    You're most probably right in this; however, my main goal here was
    finding 'more Pythonic' way of doing this and learning this way rather
    than the practical purpose of flattening deeply nested lists.

    Regards,
    mk
    mk, Feb 5, 2009
    #4
  5. mk

    mk Guest

    Michele Simionato wrote:

    > Looks fine to me. In some situations you may also use hasattr(el,
    > '__iter__') instead of isinstance(el, list) (it depends if you want to
    > flatten generic iterables or only lists).


    Thanks! Such stuff is what I'm looking for.

    Regards,
    mk
    mk, Feb 5, 2009
    #5
  6. mk <> writes:

    > Hello everybody,
    >
    > Any better solution than this?
    >
    > def flatten(x):
    > res = []
    > for el in x:
    > if isinstance(el,list):
    > res.extend(flatten(el))
    > else:
    > res.append(el)
    > return res
    >
    > a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
    > print flatten(a)
    >
    >
    > [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    >
    > Regards,
    > mk


    http://mail.python.org/pipermail/python-list/2005-July/330367.html
    J Kenneth King, Feb 5, 2009
    #6
  7. mk

    Aahz Guest

    In article <>,
    Michele Simionato <> wrote:
    >
    >Looks fine to me. In some situations you may also use hasattr(el,
    >'__iter__') instead of isinstance(el, list) (it depends if you want to
    >flatten generic iterables or only lists).


    Of course, once you do that, you need to special-case strings...
    --
    Aahz () <*> http://www.pythoncraft.com/

    Weinberg's Second Law: If builders built buildings the way programmers wrote
    programs, then the first woodpecker that came along would destroy civilization.
    Aahz, Feb 5, 2009
    #7
  8. mk

    Tobiah Guest


    > Hello everybody,
    >
    > Any better solution than this?


    a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
    print str(a).replace('[', '').replace(']', '').split(', ')

    ;)
    Tobiah, Feb 5, 2009
    #8
  9. mk

    Guest

    Quoth J Kenneth King <>:
    > mk <> writes:
    >
    > > Hello everybody,
    > >
    > > Any better solution than this?
    > >
    > > def flatten(x):
    > > res = []
    > > for el in x:
    > > if isinstance(el,list):
    > > res.extend(flatten(el))
    > > else:
    > > res.append(el)
    > > return res
    > >
    > > a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
    > > print flatten(a)
    > >
    > >
    > > [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
    > >
    > > Regards,
    > > mk

    >
    > http://mail.python.org/pipermail/python-list/2005-July/330367.html


    That's worth reading. I'm not sure why I'm finding this fun, but who
    cares. I tried a couple of other functions after reading that article,
    and it looks like a generator that scans the nested lists is actually
    the winner, and also is in many ways the most elegant implementation.
    Of course, as noted in the emails following above article, the test data
    is really inadequate for proper optimization testing ;)

    -----------------------------------------------------------------
    from __future__ import print_function
    from timeit import Timer
    from itertools import chain

    # This is the one from the article quoted above.
    def flatten6(seq):
    i = 0
    while (i != len(seq)):
    while hasattr(seq, '__iter__'):
    seq[i:i+1] = seq
    i = i + 1
    return seq


    #This is my favorite from a readability standpoint out of
    #all the things I tried. It also performs the best.
    def flatten8(seq):
    for x in seq:
    if not hasattr(x, '__iter__'): yield x
    else:
    for y in flatten8(x):
    yield y



    l = [[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]



    if __name__=="__main__":
    print(l)
    print('flatten6', flatten6(l))
    print('flatten8', list(flatten8(l)))
    print('flatten6', Timer("flatten6(l)", "from temp3 import flatten6, l").timeit())
    print('flatten8', Timer("list(flatten8(l))", "from temp3 import flatten8, l").timeit())


    -----------------------------------------------------------------

    >src/python/Python-3.0/python temp3.py

    [[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]
    flatten6 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
    flatten8 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
    flatten6 32.8386368752
    flatten8 30.7509689331

    >python temp3.py

    [[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]
    flatten6 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
    flatten8 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
    flatten6 34.730714798
    flatten8 32.3252940178

    --RDM
    , Feb 5, 2009
    #9
  10. mk

    Tobiah Guest

    On Thu, 05 Feb 2009 11:06:39 -0800, Tobiah wrote:

    >
    >> Hello everybody,
    >>
    >> Any better solution than this?

    >
    > a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]] print str(a).replace('[',
    > '').replace(']', '').split(', ')
    >
    > ;)


    Or:

    a = ['text', 'string', 3, [4, 5, 6], [[7, 8], [9, 10]]]
    print eval("[" + str(a).replace('[', '').replace(']', '') + "]")

    Just tongue in cheek...
    Tobiah, Feb 5, 2009
    #10
  11. On Feb 5, 7:24 pm, (Aahz) wrote:
    > In article <..com>,
    > Michele Simionato  <> wrote:
    >
    >
    >
    > >Looks fine to me. In some situations you may also use hasattr(el,
    > >'__iter__') instead of isinstance(el, list) (it depends if you want to
    > >flatten generic iterables or only lists).

    >
    > Of course, once you do that, you need to special-case strings...


    Strings are iterable but have no __iter__ method, which is fine in
    this context, since I would say 99.9% of times one wants to treat them
    as atomic objects, so no need to special case.
    Michele Simionato, Feb 5, 2009
    #11
  12. On Feb 5, 1:16 pm, Michele Simionato <>
    wrote:
    > On Feb 5, 7:24 pm, (Aahz) wrote:
    >
    > > In article <>,
    > > Michele Simionato  <> wrote:

    >
    > > >Looks fine to me. In some situations you may also use hasattr(el,
    > > >'__iter__') instead of isinstance(el, list) (it depends if you want to
    > > >flatten generic iterables or only lists).

    >
    > > Of course, once you do that, you need to special-case strings...

    >
    > Strings are iterable but have no __iter__ method, which is fine in
    > this context, since I would say 99.9% of times one wants to treat them
    > as atomic objects, so no need to special case.


    Don't worry, that little oddity was fixed for you:

    Python 3.0+ (unknown, Dec 8 2008, 14:26:15)
    [GCC 4.3.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> str.__iter__

    <slot wrapper '__iter__' of 'str' objects>
    >>> bytes.__iter__

    <slot wrapper '__iter__' of 'bytes' objects>
    >>> bytearray.__iter__

    <slot wrapper '__iter__' of 'bytearray' objects>


    I'm in the "why do you need more than 1 depth?" camp. Dispatching
    based on your own type should be given an extra look. Dispatching
    based passed in types should be given three extra looks.

    I didn't realize itertools.chain(*iterable) worked. I guess that
    needs to be pushed as the canonical form.
    Rhamphoryncus, Feb 6, 2009
    #12
  13. mk

    Mensanator Guest

    On Feb 6, 3:23 pm, Rhamphoryncus <> wrote:
    > On Feb 5, 1:16 pm, Michele Simionato <>
    > wrote:
    >
    > > On Feb 5, 7:24 pm, (Aahz) wrote:

    >
    > > > In article <>,
    > > > Michele Simionato  <> wrote:

    >
    > > > >Looks fine to me. In some situations you may also use hasattr(el,
    > > > >'__iter__') instead of isinstance(el, list) (it depends if you want to
    > > > >flatten generic iterables or only lists).

    >
    > > > Of course, once you do that, you need to special-case strings...

    >
    > > Strings are iterable but have no __iter__ method, which is fine in
    > > this context, since I would say 99.9% of times one wants to treat them
    > > as atomic objects, so no need to special case.

    >
    > Don't worry, that little oddity was fixed for you:
    >
    > Python 3.0+ (unknown, Dec  8 2008, 14:26:15)
    > [GCC 4.3.2] on linux2
    > Type "help", "copyright", "credits" or "license" for more information.>>> str.__iter__
    >
    > <slot wrapper '__iter__' of 'str' objects>>>> bytes.__iter__
    >
    > <slot wrapper '__iter__' of 'bytes' objects>>>> bytearray.__iter__
    >
    > <slot wrapper '__iter__' of 'bytearray' objects>
    >
    > I'm in the "why do you need more than 1 depth?" camp.  Dispatching
    > based on your own type should be given an extra look.  Dispatching
    > based passed in types should be given three extra looks.
    >
    > I didn't realize itertools.chain(*iterable) worked.  I guess that
    > needs to be pushed as the canonical form.


    What about this (from the Recipes section of the itertools manual)?

    def flatten(listOfLists):
    return list(chain.from_iterable(listOfLists))
    Mensanator, Feb 6, 2009
    #13
  14. On Feb 6, 10:23 pm, Rhamphoryncus <> wrote:
    > On Feb 5, 1:16 pm, Michele Simionato <>
    > wrote:
    >
    > > On Feb 5, 7:24 pm, (Aahz) wrote:

    >
    > > > In article <>,
    > > > Michele Simionato  <> wrote:

    >
    > > > >Looks fine to me. In some situations you may also use hasattr(el,
    > > > >'__iter__') instead of isinstance(el, list) (it depends if you want to
    > > > >flatten generic iterables or only lists).

    >
    > > > Of course, once you do that, you need to special-case strings...

    >
    > > Strings are iterable but have no __iter__ method, which is fine in
    > > this context, since I would say 99.9% of times one wants to treat them
    > > as atomic objects, so no need to special case.

    >
    > Don't worry, that little oddity was fixed for you:


    Acc! I have a few places in my code with checks of the
    kind ``hasattr(x, '__iter__')`` and I guess those spots
    will be tricky when converting to Python 3. I guess
    2to3 cannot help either :-(
    Michele Simionato, Feb 7, 2009
    #14
  15. mk

    Guest

    Quoth Mensanator <>:
    > On Feb 6, 3:23=A0pm, Rhamphoryncus <> wrote:
    > > On Feb 5, 1:16=A0pm, Michele Simionato <>
    > > wrote:
    > >
    > > > On Feb 5, 7:24=A0pm, (Aahz) wrote:
    > > > > In article <>,
    > > > > Michele Simionato =A0<> wrote:
    > > > > >Looks fine to me. In some situations you may also use hasattr(el,
    > > > > >'__iter__') instead of isinstance(el, list) (it depends if you want to
    > > > > >flatten generic iterables or only lists).
    > > > > Of course, once you do that, you need to special-case strings...

    > >
    > > > Strings are iterable but have no __iter__ method, which is fine in
    > > > this context, since I would say 99.9% of times one wants to treat them
    > > > as atomic objects, so no need to special case.

    > >
    > > Don't worry, that little oddity was fixed for you:
    > >
    > > Python 3.0+ (unknown, Dec =A08 2008, 14:26:15)
    > > [GCC 4.3.2] on linux2
    > > Type "help", "copyright", "credits" or "license" for more information.
    > > >>> str.__iter__

    > > <slot wrapper '__iter__' of 'str' objects
    > > >>> bytes.__iter__

    > > <slot wrapper '__iter__' of 'bytes' objects
    > > >>> bytearray.__iter__

    > > <slot wrapper '__iter__' of 'bytearray' objects>
    > >
    > > I'm in the "why do you need more than 1 depth?" camp. Dispatching
    > > based on your own type should be given an extra look. Dispatching
    > > based passed in types should be given three extra looks.
    > >
    > > I didn't realize itertools.chain(*iterable) worked. I guess that
    > > needs to be pushed as the canonical form.

    >
    > What about this (from the Recipes section of the itertools manual)?
    >
    > def flatten(listOfLists):
    > return list(chain.from_iterable(listOfLists))


    Python 2.6.1 (r261:67515, Jan 7 2009, 17:09:13)
    [GCC 4.3.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from itertools import chain
    >>> list(chain.from_iterable([1, 2, [3, 4]]))

    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
    >>> list(chain(*[1, 2, [3, 4]]))

    Traceback (most recent call last):
    File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
    >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))

    ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

    --RDM
    , Feb 7, 2009
    #15
  16. On Feb 6, 10:21 pm, wrote:
    > Quoth Mensanator <>:
    > > def flatten(listOfLists):
    > >     return list(chain.from_iterable(listOfLists))

    >
    >     Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13)
    >     [GCC 4.3.2] on linux2
    >     Type "help", "copyright", "credits" or "license" for more information.
    >     >>> from itertools import chain
    >     >>> list(chain.from_iterable([1, 2, [3, 4]]))
    >     Traceback (most recent call last):
    >       File "<stdin>", line 1, in <module>
    >     TypeError: 'int' object is not iterable
    >     >>> list(chain(*[1, 2, [3, 4]]))
    >     Traceback (most recent call last):
    >       File "<stdin>", line 1, in <module>
    >     TypeError: 'int' object is not iterable
    >     >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
    >     ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]


    What usecase do you have for such inconsistently structured data?

    If I'm building a tree I use my own type for the nodes, keeping them
    purely internal, so I can always use isinstance without worrying about
    getting something inconvenient passed in.
    Rhamphoryncus, Feb 7, 2009
    #16
  17. mk

    Guest

    Rhamphoryncus <> wrote:
    > On Feb 6, 10:21=A0pm, wrote:
    > > Quoth Mensanator <>:
    > > > def flatten(listOfLists):
    > > > =A0 =A0 return list(chain.from_iterable(listOfLists))

    > >
    > > =A0 =A0 Python 2.6.1 (r261:67515, Jan =A07 2009, 17:09:13)
    > > =A0 =A0 [GCC 4.3.2] on linux2
    > > =A0 =A0 Type "help", "copyright", "credits" or "license" for more informa=

    > tion.
    > > =A0 =A0 >>> from itertools import chain
    > > =A0 =A0 >>> list(chain.from_iterable([1, 2, [3, 4]]))
    > > =A0 =A0 Traceback (most recent call last):
    > > =A0 =A0 =A0 File "<stdin>", line 1, in <module>
    > > =A0 =A0 TypeError: 'int' object is not iterable
    > > =A0 =A0 >>> list(chain(*[1, 2, [3, 4]]))
    > > =A0 =A0 Traceback (most recent call last):
    > > =A0 =A0 =A0 File "<stdin>", line 1, in <module>
    > > =A0 =A0 TypeError: 'int' object is not iterable
    > > =A0 =A0 >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
    > > =A0 =A0 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

    >
    > What usecase do you have for such inconsistently structured data?
    >
    > If I'm building a tree I use my own type for the nodes, keeping them
    > purely internal, so I can always use isinstance without worrying about
    > getting something inconvenient passed in.


    I don't have any use cases myself, I'm just pointing out that this
    doesn't answer the concerns of the OP, who presumably does.

    --RDM
    , Feb 7, 2009
    #17
  18. mk

    Guest Guest

    On Sat, 7 Feb 2009 01:06:06 -0800 (PST)
    Rhamphoryncus <> wrote:

    > On Feb 6, 10:21 pm, wrote:
    > > Quoth Mensanator <>:
    > > > def flatten(listOfLists):
    > > >     return list(chain.from_iterable(listOfLists))

    > >
    > >     Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13)
    > >     [GCC 4.3.2] on linux2
    > >     Type "help", "copyright", "credits" or "license" for more
    > > information. >>> from itertools import chain
    > >     >>> list(chain.from_iterable([1, 2, [3, 4]]))
    > >     Traceback (most recent call last):
    > >       File "<stdin>", line 1, in <module>
    > >     TypeError: 'int' object is not iterable
    > >     >>> list(chain(*[1, 2, [3, 4]]))
    > >     Traceback (most recent call last):
    > >       File "<stdin>", line 1, in <module>
    > >     TypeError: 'int' object is not iterable
    > >     >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
    > >     ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

    >
    > What usecase do you have for such inconsistently structured data?


    I have a similar use case in pyspread, which is a Python spreadsheet
    that employs numpy object arrays. Since the Python objects in the numpy
    arrays are derived from user input, they can be anything, including
    nested lists as well as strings, etc.

    Since I consider my work-around that treats strings as a special case a
    rather ugly hack, I would welcome a robust, generic approach to the
    OP's problem.

    Martin
    Guest, Feb 7, 2009
    #18
  19. On Feb 7, 1:39 pm, <> wrote:
    > On Sat, 7 Feb 2009 01:06:06 -0800 (PST)
    > Rhamphoryncus <> wrote:
    >
    > > What usecase do you have for such inconsistently structured data?

    >
    > I have a similar use case in pyspread, which is a Python spreadsheet
    > that employs numpy object arrays. Since the Python objects in the numpy
    > arrays are derived from user input, they can be anything, including
    > nested lists as well as strings, etc.
    >
    > Since I consider my work-around that treats strings as a special case a
    > rather ugly hack, I would welcome a robust, generic approach to the
    > OP's problem.


    Can you explain this in a little more detail?
    Rhamphoryncus, Feb 7, 2009
    #19
  20. mk

    Guest Guest

    On Sat, 7 Feb 2009 12:50:22 -0800 (PST)
    Rhamphoryncus <> wrote:

    > On Feb 7, 1:39 pm, <> wrote:
    > > On Sat, 7 Feb 2009 01:06:06 -0800 (PST)
    > > Rhamphoryncus <> wrote:
    > >
    > > > What usecase do you have for such inconsistently structured data?

    > >
    > > I have a similar use case in pyspread, which is a Python spreadsheet
    > > that employs numpy object arrays. Since the Python objects in the
    > > numpy arrays are derived from user input, they can be anything,
    > > including nested lists as well as strings, etc.
    > >
    > > Since I consider my work-around that treats strings as a special
    > > case a rather ugly hack, I would welcome a robust, generic approach
    > > to the OP's problem.

    >
    > Can you explain this in a little more detail?


    In the application, there is one main numpy array of type "O".
    Each element of the array corresponds to one cell in a grid. The user
    may enter a Python expression into the grid cell. The input is
    evaled and the result is stored in the numpy array (the actual process
    is a bit more complicated). Therefore, the object inside a numpy array
    element may be an inconsistent, nested, iterable type.

    The user now may access the result grid via __getitem__. When doing
    this, a numpy array that is as flat as possible while comprising the
    maximum possible data depth is returned, i.e.:

    1. Non-string and non-unicode iterables of similar length for each of
    the cells form extra dimensions.

    2. In order to remove different container types, the result is
    flattened, cast into a numpy.array and re-shaped.

    3. Dimensions of length 1 are eliminated.

    Therefore, the user can conveniently use numpy ufuncs on the results.

    I am referring to the flatten operation in step 2
    Guest, Feb 7, 2009
    #20
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. delgados129
    Replies:
    2
    Views:
    784
    delgados129
    Apr 25, 2005
  2. David Gersic

    Flattening out an XML document

    David Gersic, May 24, 2005, in forum: XML
    Replies:
    0
    Views:
    388
    David Gersic
    May 24, 2005
  3. gangesmaster

    a flattening operator?

    gangesmaster, Apr 18, 2006, in forum: Python
    Replies:
    2
    Views:
    284
    Michael Tobis
    Apr 22, 2006
  4. =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==

    List of lists of lists of lists...

    =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==, May 8, 2006, in forum: Python
    Replies:
    5
    Views:
    389
    =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==
    May 15, 2006
  5. Esmail
    Replies:
    2
    Views:
    282
    Neil Cerutti
    Nov 30, 2009
Loading...

Share This Page