# Flattening lists

Discussion in 'Python' started by mk, Feb 5, 2009.

1. ### mkGuest

Hello everybody,

Any better solution than this?

def flatten(x):
res = []
for el in x:
if isinstance(el,list):
res.extend(flatten(el))
else:
res.append(el)
return res

a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
print flatten(a)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Regards,
mk

mk, Feb 5, 2009

2. ### Mark DickinsonGuest

On Feb 5, 1:17 pm, mk <> wrote:
> Hello everybody,
>
> Any better solution than this?
>
> def flatten(x):

Just out of interest, how often do people really need
such a recursive flatten, as opposed to a single-level
version?

I often find myself needing a 'concat' method that
turns a list of lists (or iterable of iterables) into
a single list; itertools.chain does this quite nicely.
But I don't think I've ever encountered a need for the
full recursive version.

Mark

Mark Dickinson, Feb 5, 2009

3. ### Michele SimionatoGuest

On Feb 5, 2:17 pm, mk <> wrote:
> Hello everybody,
>
> Any better solution than this?
>
> def flatten(x):
>      res = []
>      for el in x:
>          if isinstance(el,list):
>              res.extend(flatten(el))
>          else:
>              res.append(el)
>      return res
>
> a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
> print flatten(a)
>
> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>
> Regards,
> mk

Looks fine to me. In some situations you may also use hasattr(el,
'__iter__') instead of isinstance(el, list) (it depends if you want to
flatten generic iterables or only lists).

Michele Simionato, Feb 5, 2009
4. ### mkGuest

Mark Dickinson wrote:
> I often find myself needing a 'concat' method that
> turns a list of lists (or iterable of iterables) into
> a single list; itertools.chain does this quite nicely.
> But I don't think I've ever encountered a need for the
> full recursive version.

You're most probably right in this; however, my main goal here was
finding 'more Pythonic' way of doing this and learning this way rather
than the practical purpose of flattening deeply nested lists.

Regards,
mk

mk, Feb 5, 2009
5. ### mkGuest

Michele Simionato wrote:

> Looks fine to me. In some situations you may also use hasattr(el,
> '__iter__') instead of isinstance(el, list) (it depends if you want to
> flatten generic iterables or only lists).

Thanks! Such stuff is what I'm looking for.

Regards,
mk

mk, Feb 5, 2009
6. ### J Kenneth KingGuest

mk <> writes:

> Hello everybody,
>
> Any better solution than this?
>
> def flatten(x):
> res = []
> for el in x:
> if isinstance(el,list):
> res.extend(flatten(el))
> else:
> res.append(el)
> return res
>
> a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
> print flatten(a)
>
>
> [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
>
> Regards,
> mk

http://mail.python.org/pipermail/python-list/2005-July/330367.html

J Kenneth King, Feb 5, 2009
7. ### AahzGuest

In article <>,
Michele Simionato <> wrote:
>
>Looks fine to me. In some situations you may also use hasattr(el,
>'__iter__') instead of isinstance(el, list) (it depends if you want to
>flatten generic iterables or only lists).

Of course, once you do that, you need to special-case strings...
--
Aahz () <*> http://www.pythoncraft.com/

Weinberg's Second Law: If builders built buildings the way programmers wrote
programs, then the first woodpecker that came along would destroy civilization.

Aahz, Feb 5, 2009
8. ### TobiahGuest

> Hello everybody,
>
> Any better solution than this?

a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
print str(a).replace('[', '').replace(']', '').split(', ')

Tobiah, Feb 5, 2009
9. ### Guest

Quoth J Kenneth King <>:
> mk <> writes:
>
> > Hello everybody,
> >
> > Any better solution than this?
> >
> > def flatten(x):
> > res = []
> > for el in x:
> > if isinstance(el,list):
> > res.extend(flatten(el))
> > else:
> > res.append(el)
> > return res
> >
> > a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
> > print flatten(a)
> >
> >
> > [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
> >
> > Regards,
> > mk

>
> http://mail.python.org/pipermail/python-list/2005-July/330367.html

That's worth reading. I'm not sure why I'm finding this fun, but who
cares. I tried a couple of other functions after reading that article,
and it looks like a generator that scans the nested lists is actually
the winner, and also is in many ways the most elegant implementation.
Of course, as noted in the emails following above article, the test data
is really inadequate for proper optimization testing

-----------------------------------------------------------------
from __future__ import print_function
from timeit import Timer
from itertools import chain

# This is the one from the article quoted above.
def flatten6(seq):
i = 0
while (i != len(seq)):
while hasattr(seq, '__iter__'):
seq[i:i+1] = seq
i = i + 1
return seq

#This is my favorite from a readability standpoint out of
#all the things I tried. It also performs the best.
def flatten8(seq):
for x in seq:
if not hasattr(x, '__iter__'): yield x
else:
for y in flatten8(x):
yield y

l = [[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]

if __name__=="__main__":
print(l)
print('flatten6', flatten6(l))
print('flatten8', list(flatten8(l)))
print('flatten6', Timer("flatten6(l)", "from temp3 import flatten6, l").timeit())
print('flatten8', Timer("list(flatten8(l))", "from temp3 import flatten8, l").timeit())

-----------------------------------------------------------------

>src/python/Python-3.0/python temp3.py

[[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]
flatten6 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten8 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten6 32.8386368752
flatten8 30.7509689331

>python temp3.py

[[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]
flatten6 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten8 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten6 34.730714798
flatten8 32.3252940178

--RDM

, Feb 5, 2009
10. ### TobiahGuest

On Thu, 05 Feb 2009 11:06:39 -0800, Tobiah wrote:

>
>> Hello everybody,
>>
>> Any better solution than this?

>
> a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]] print str(a).replace('[',
> '').replace(']', '').split(', ')
>
>

Or:

a = ['text', 'string', 3, [4, 5, 6], [[7, 8], [9, 10]]]
print eval("[" + str(a).replace('[', '').replace(']', '') + "]")

Just tongue in cheek...

Tobiah, Feb 5, 2009
11. ### Michele SimionatoGuest

On Feb 5, 7:24 pm, (Aahz) wrote:
> In article <..com>,
> Michele Simionato  <> wrote:
>
>
>
> >Looks fine to me. In some situations you may also use hasattr(el,
> >'__iter__') instead of isinstance(el, list) (it depends if you want to
> >flatten generic iterables or only lists).

>
> Of course, once you do that, you need to special-case strings...

Strings are iterable but have no __iter__ method, which is fine in
this context, since I would say 99.9% of times one wants to treat them
as atomic objects, so no need to special case.

Michele Simionato, Feb 5, 2009
12. ### RhamphoryncusGuest

On Feb 5, 1:16 pm, Michele Simionato <>
wrote:
> On Feb 5, 7:24 pm, (Aahz) wrote:
>
> > In article <>,
> > Michele Simionato  <> wrote:

>
> > >Looks fine to me. In some situations you may also use hasattr(el,
> > >'__iter__') instead of isinstance(el, list) (it depends if you want to
> > >flatten generic iterables or only lists).

>
> > Of course, once you do that, you need to special-case strings...

>
> Strings are iterable but have no __iter__ method, which is fine in
> this context, since I would say 99.9% of times one wants to treat them
> as atomic objects, so no need to special case.

Don't worry, that little oddity was fixed for you:

Python 3.0+ (unknown, Dec 8 2008, 14:26:15)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> str.__iter__

<slot wrapper '__iter__' of 'str' objects>
>>> bytes.__iter__

<slot wrapper '__iter__' of 'bytes' objects>
>>> bytearray.__iter__

<slot wrapper '__iter__' of 'bytearray' objects>

I'm in the "why do you need more than 1 depth?" camp. Dispatching
based on your own type should be given an extra look. Dispatching
based passed in types should be given three extra looks.

I didn't realize itertools.chain(*iterable) worked. I guess that
needs to be pushed as the canonical form.

Rhamphoryncus, Feb 6, 2009
13. ### MensanatorGuest

On Feb 6, 3:23 pm, Rhamphoryncus <> wrote:
> On Feb 5, 1:16 pm, Michele Simionato <>
> wrote:
>
> > On Feb 5, 7:24 pm, (Aahz) wrote:

>
> > > In article <>,
> > > Michele Simionato  <> wrote:

>
> > > >Looks fine to me. In some situations you may also use hasattr(el,
> > > >'__iter__') instead of isinstance(el, list) (it depends if you want to
> > > >flatten generic iterables or only lists).

>
> > > Of course, once you do that, you need to special-case strings...

>
> > Strings are iterable but have no __iter__ method, which is fine in
> > this context, since I would say 99.9% of times one wants to treat them
> > as atomic objects, so no need to special case.

>
> Don't worry, that little oddity was fixed for you:
>
> Python 3.0+ (unknown, Dec  8 2008, 14:26:15)
> [GCC 4.3.2] on linux2
> Type "help", "copyright", "credits" or "license" for more information.>>> str.__iter__
>
> <slot wrapper '__iter__' of 'str' objects>>>> bytes.__iter__
>
> <slot wrapper '__iter__' of 'bytes' objects>>>> bytearray.__iter__
>
> <slot wrapper '__iter__' of 'bytearray' objects>
>
> I'm in the "why do you need more than 1 depth?" camp.  Dispatching
> based on your own type should be given an extra look.  Dispatching
> based passed in types should be given three extra looks.
>
> I didn't realize itertools.chain(*iterable) worked.  I guess that
> needs to be pushed as the canonical form.

What about this (from the Recipes section of the itertools manual)?

def flatten(listOfLists):
return list(chain.from_iterable(listOfLists))

Mensanator, Feb 6, 2009
14. ### Michele SimionatoGuest

On Feb 6, 10:23 pm, Rhamphoryncus <> wrote:
> On Feb 5, 1:16 pm, Michele Simionato <>
> wrote:
>
> > On Feb 5, 7:24 pm, (Aahz) wrote:

>
> > > In article <>,
> > > Michele Simionato  <> wrote:

>
> > > >Looks fine to me. In some situations you may also use hasattr(el,
> > > >'__iter__') instead of isinstance(el, list) (it depends if you want to
> > > >flatten generic iterables or only lists).

>
> > > Of course, once you do that, you need to special-case strings...

>
> > Strings are iterable but have no __iter__ method, which is fine in
> > this context, since I would say 99.9% of times one wants to treat them
> > as atomic objects, so no need to special case.

>
> Don't worry, that little oddity was fixed for you:

Acc! I have a few places in my code with checks of the
kind ``hasattr(x, '__iter__')`` and I guess those spots
will be tricky when converting to Python 3. I guess
2to3 cannot help either :-(

Michele Simionato, Feb 7, 2009
15. ### Guest

Quoth Mensanator <>:
> On Feb 6, 3:23=A0pm, Rhamphoryncus <> wrote:
> > On Feb 5, 1:16=A0pm, Michele Simionato <>
> > wrote:
> >
> > > On Feb 5, 7:24=A0pm, (Aahz) wrote:
> > > > In article <>,
> > > > Michele Simionato =A0<> wrote:
> > > > >Looks fine to me. In some situations you may also use hasattr(el,
> > > > >'__iter__') instead of isinstance(el, list) (it depends if you want to
> > > > >flatten generic iterables or only lists).
> > > > Of course, once you do that, you need to special-case strings...

> >
> > > Strings are iterable but have no __iter__ method, which is fine in
> > > this context, since I would say 99.9% of times one wants to treat them
> > > as atomic objects, so no need to special case.

> >
> > Don't worry, that little oddity was fixed for you:
> >
> > Python 3.0+ (unknown, Dec =A08 2008, 14:26:15)
> > [GCC 4.3.2] on linux2
> > Type "help", "copyright", "credits" or "license" for more information.
> > >>> str.__iter__

> > <slot wrapper '__iter__' of 'str' objects
> > >>> bytes.__iter__

> > <slot wrapper '__iter__' of 'bytes' objects
> > >>> bytearray.__iter__

> > <slot wrapper '__iter__' of 'bytearray' objects>
> >
> > I'm in the "why do you need more than 1 depth?" camp. Dispatching
> > based on your own type should be given an extra look. Dispatching
> > based passed in types should be given three extra looks.
> >
> > I didn't realize itertools.chain(*iterable) worked. I guess that
> > needs to be pushed as the canonical form.

>
> What about this (from the Recipes section of the itertools manual)?
>
> def flatten(listOfLists):
> return list(chain.from_iterable(listOfLists))

Python 2.6.1 (r261:67515, Jan 7 2009, 17:09:13)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from itertools import chain
>>> list(chain.from_iterable([1, 2, [3, 4]]))

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> list(chain(*[1, 2, [3, 4]]))

Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: 'int' object is not iterable
>>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))

['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

--RDM

, Feb 7, 2009
16. ### RhamphoryncusGuest

On Feb 6, 10:21 pm, wrote:
> Quoth Mensanator <>:
> > def flatten(listOfLists):
> >     return list(chain.from_iterable(listOfLists))

>
>     Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13)
>     [GCC 4.3.2] on linux2
>     Type "help", "copyright", "credits" or "license" for more information.
>     >>> from itertools import chain
>     >>> list(chain.from_iterable([1, 2, [3, 4]]))
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: 'int' object is not iterable
>     >>> list(chain(*[1, 2, [3, 4]]))
>     Traceback (most recent call last):
>       File "<stdin>", line 1, in <module>
>     TypeError: 'int' object is not iterable
>     >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
>     ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

What usecase do you have for such inconsistently structured data?

If I'm building a tree I use my own type for the nodes, keeping them
purely internal, so I can always use isinstance without worrying about
getting something inconvenient passed in.

Rhamphoryncus, Feb 7, 2009
17. ### Guest

Rhamphoryncus <> wrote:
> On Feb 6, 10:21=A0pm, wrote:
> > Quoth Mensanator <>:
> > > def flatten(listOfLists):
> > > =A0 =A0 return list(chain.from_iterable(listOfLists))

> >
> > =A0 =A0 Python 2.6.1 (r261:67515, Jan =A07 2009, 17:09:13)
> > =A0 =A0 [GCC 4.3.2] on linux2
> > =A0 =A0 Type "help", "copyright", "credits" or "license" for more informa=

> tion.
> > =A0 =A0 >>> from itertools import chain
> > =A0 =A0 >>> list(chain.from_iterable([1, 2, [3, 4]]))
> > =A0 =A0 Traceback (most recent call last):
> > =A0 =A0 =A0 File "<stdin>", line 1, in <module>
> > =A0 =A0 TypeError: 'int' object is not iterable
> > =A0 =A0 >>> list(chain(*[1, 2, [3, 4]]))
> > =A0 =A0 Traceback (most recent call last):
> > =A0 =A0 =A0 File "<stdin>", line 1, in <module>
> > =A0 =A0 TypeError: 'int' object is not iterable
> > =A0 =A0 >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
> > =A0 =A0 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

>
> What usecase do you have for such inconsistently structured data?
>
> If I'm building a tree I use my own type for the nodes, keeping them
> purely internal, so I can always use isinstance without worrying about
> getting something inconvenient passed in.

I don't have any use cases myself, I'm just pointing out that this
doesn't answer the concerns of the OP, who presumably does.

--RDM

, Feb 7, 2009
18. ### GuestGuest

On Sat, 7 Feb 2009 01:06:06 -0800 (PST)
Rhamphoryncus <> wrote:

> On Feb 6, 10:21 pm, wrote:
> > Quoth Mensanator <>:
> > > def flatten(listOfLists):
> > >     return list(chain.from_iterable(listOfLists))

> >
> >     Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13)
> >     [GCC 4.3.2] on linux2
> >     Type "help", "copyright", "credits" or "license" for more
> > information. >>> from itertools import chain
> >     >>> list(chain.from_iterable([1, 2, [3, 4]]))
> >     Traceback (most recent call last):
> >       File "<stdin>", line 1, in <module>
> >     TypeError: 'int' object is not iterable
> >     >>> list(chain(*[1, 2, [3, 4]]))
> >     Traceback (most recent call last):
> >       File "<stdin>", line 1, in <module>
> >     TypeError: 'int' object is not iterable
> >     >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
> >     ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

>
> What usecase do you have for such inconsistently structured data?

I have a similar use case in pyspread, which is a Python spreadsheet
that employs numpy object arrays. Since the Python objects in the numpy
arrays are derived from user input, they can be anything, including
nested lists as well as strings, etc.

Since I consider my work-around that treats strings as a special case a
rather ugly hack, I would welcome a robust, generic approach to the
OP's problem.

Martin

Guest, Feb 7, 2009
19. ### RhamphoryncusGuest

On Feb 7, 1:39 pm, <> wrote:
> On Sat, 7 Feb 2009 01:06:06 -0800 (PST)
> Rhamphoryncus <> wrote:
>
> > What usecase do you have for such inconsistently structured data?

>
> I have a similar use case in pyspread, which is a Python spreadsheet
> that employs numpy object arrays. Since the Python objects in the numpy
> arrays are derived from user input, they can be anything, including
> nested lists as well as strings, etc.
>
> Since I consider my work-around that treats strings as a special case a
> rather ugly hack, I would welcome a robust, generic approach to the
> OP's problem.

Can you explain this in a little more detail?

Rhamphoryncus, Feb 7, 2009
20. ### GuestGuest

On Sat, 7 Feb 2009 12:50:22 -0800 (PST)
Rhamphoryncus <> wrote:

> On Feb 7, 1:39 pm, <> wrote:
> > On Sat, 7 Feb 2009 01:06:06 -0800 (PST)
> > Rhamphoryncus <> wrote:
> >
> > > What usecase do you have for such inconsistently structured data?

> >
> > I have a similar use case in pyspread, which is a Python spreadsheet
> > that employs numpy object arrays. Since the Python objects in the
> > numpy arrays are derived from user input, they can be anything,
> > including nested lists as well as strings, etc.
> >
> > Since I consider my work-around that treats strings as a special
> > case a rather ugly hack, I would welcome a robust, generic approach
> > to the OP's problem.

>
> Can you explain this in a little more detail?

In the application, there is one main numpy array of type "O".
Each element of the array corresponds to one cell in a grid. The user
may enter a Python expression into the grid cell. The input is
evaled and the result is stored in the numpy array (the actual process
is a bit more complicated). Therefore, the object inside a numpy array
element may be an inconsistent, nested, iterable type.

The user now may access the result grid via __getitem__. When doing
this, a numpy array that is as flat as possible while comprising the
maximum possible data depth is returned, i.e.:

1. Non-string and non-unicode iterables of similar length for each of
the cells form extra dimensions.

2. In order to remove different container types, the result is
flattened, cast into a numpy.array and re-shaped.

3. Dimensions of length 1 are eliminated.

Therefore, the user can conveniently use numpy ufuncs on the results.

I am referring to the flatten operation in step 2

Guest, Feb 7, 2009

### Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.