Flattening lists

M

mk

Hello everybody,

Any better solution than this?

def flatten(x):
res = []
for el in x:
if isinstance(el,list):
res.extend(flatten(el))
else:
res.append(el)
return res

a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
print flatten(a)


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Regards,
mk
 
M

Mark Dickinson

Hello everybody,

Any better solution than this?

def flatten(x):

Just out of interest, how often do people really need
such a recursive flatten, as opposed to a single-level
version?

I often find myself needing a 'concat' method that
turns a list of lists (or iterable of iterables) into
a single list; itertools.chain does this quite nicely.
But I don't think I've ever encountered a need for the
full recursive version.

Mark
 
M

Michele Simionato

Hello everybody,

Any better solution than this?

def flatten(x):
     res = []
     for el in x:
         if isinstance(el,list):
             res.extend(flatten(el))
         else:
             res.append(el)
     return res

a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
print flatten(a)

[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Regards,
mk

Looks fine to me. In some situations you may also use hasattr(el,
'__iter__') instead of isinstance(el, list) (it depends if you want to
flatten generic iterables or only lists).
 
M

mk

Mark said:
I often find myself needing a 'concat' method that
turns a list of lists (or iterable of iterables) into
a single list; itertools.chain does this quite nicely.
But I don't think I've ever encountered a need for the
full recursive version.


You're most probably right in this; however, my main goal here was
finding 'more Pythonic' way of doing this and learning this way rather
than the practical purpose of flattening deeply nested lists.

Regards,
mk
 
M

mk

Michele said:
Looks fine to me. In some situations you may also use hasattr(el,
'__iter__') instead of isinstance(el, list) (it depends if you want to
flatten generic iterables or only lists).

Thanks! Such stuff is what I'm looking for.

Regards,
mk
 
A

Aahz

Looks fine to me. In some situations you may also use hasattr(el,
'__iter__') instead of isinstance(el, list) (it depends if you want to
flatten generic iterables or only lists).

Of course, once you do that, you need to special-case strings...
 
R

rdmurray

Quoth J Kenneth King said:
mk said:
Hello everybody,

Any better solution than this?

def flatten(x):
res = []
for el in x:
if isinstance(el,list):
res.extend(flatten(el))
else:
res.append(el)
return res

a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
print flatten(a)


[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

Regards,
mk

http://mail.python.org/pipermail/python-list/2005-July/330367.html

That's worth reading. I'm not sure why I'm finding this fun, but who
cares. I tried a couple of other functions after reading that article,
and it looks like a generator that scans the nested lists is actually
the winner, and also is in many ways the most elegant implementation.
Of course, as noted in the emails following above article, the test data
is really inadequate for proper optimization testing ;)

-----------------------------------------------------------------
from __future__ import print_function
from timeit import Timer
from itertools import chain

# This is the one from the article quoted above.
def flatten6(seq):
i = 0
while (i != len(seq)):
while hasattr(seq, '__iter__'):
seq[i:i+1] = seq
i = i + 1
return seq


#This is my favorite from a readability standpoint out of
#all the things I tried. It also performs the best.
def flatten8(seq):
for x in seq:
if not hasattr(x, '__iter__'): yield x
else:
for y in flatten8(x):
yield y



l = [[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]



if __name__=="__main__":
print(l)
print('flatten6', flatten6(l))
print('flatten8', list(flatten8(l)))
print('flatten6', Timer("flatten6(l)", "from temp3 import flatten6, l").timeit())
print('flatten8', Timer("list(flatten8(l))", "from temp3 import flatten8, l").timeit())


-----------------------------------------------------------------
src/python/Python-3.0/python temp3.py
[[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]
flatten6 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten8 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten6 32.8386368752
flatten8 30.7509689331
python temp3.py
[[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]
flatten6 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten8 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten6 34.730714798
flatten8 32.3252940178

--RDM
 
T

Tobiah

Hello everybody,

Any better solution than this?

a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]] print str(a).replace('[',
'').replace(']', '').split(', ')

;)

Or:

a = ['text', 'string', 3, [4, 5, 6], [[7, 8], [9, 10]]]
print eval("[" + str(a).replace('[', '').replace(']', '') + "]")

Just tongue in cheek...
 
M

Michele Simionato

Of course, once you do that, you need to special-case strings...

Strings are iterable but have no __iter__ method, which is fine in
this context, since I would say 99.9% of times one wants to treat them
as atomic objects, so no need to special case.
 
R

Rhamphoryncus

Strings are iterable but have no __iter__ method, which is fine in
this context, since I would say 99.9% of times one wants to treat them
as atomic objects, so no need to special case.

Don't worry, that little oddity was fixed for you:

Python 3.0+ (unknown, Dec 8 2008, 14:26:15)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.<slot wrapper '__iter__' of 'bytearray' objects>


I'm in the "why do you need more than 1 depth?" camp. Dispatching
based on your own type should be given an extra look. Dispatching
based passed in types should be given three extra looks.

I didn't realize itertools.chain(*iterable) worked. I guess that
needs to be pushed as the canonical form.
 
M

Mensanator

On Feb 5, 7:24 pm, (e-mail address removed) (Aahz) wrote:
Strings are iterable but have no __iter__ method, which is fine in
this context, since I would say 99.9% of times one wants to treat them
as atomic objects, so no need to special case.

Don't worry, that little oddity was fixed for you:

Python 3.0+ (unknown, Dec  8 2008, 14:26:15)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.>>> str.__iter__

<slot wrapper '__iter__' of 'str' objects>>>> bytes.__iter__

<slot wrapper '__iter__' of 'bytes' objects>>>> bytearray.__iter__

<slot wrapper '__iter__' of 'bytearray' objects>

I'm in the "why do you need more than 1 depth?" camp.  Dispatching
based on your own type should be given an extra look.  Dispatching
based passed in types should be given three extra looks.

I didn't realize itertools.chain(*iterable) worked.  I guess that
needs to be pushed as the canonical form.

What about this (from the Recipes section of the itertools manual)?

def flatten(listOfLists):
return list(chain.from_iterable(listOfLists))
 
M

Michele Simionato

On Feb 5, 7:24 pm, (e-mail address removed) (Aahz) wrote:
Don't worry, that little oddity was fixed for you:

Acc! I have a few places in my code with checks of the
kind ``hasattr(x, '__iter__')`` and I guess those spots
will be tricky when converting to Python 3. I guess
2to3 cannot help either :-(
 
R

rdmurray

Quoth Mensanator said:
On Feb 5, 7:24=A0pm, (e-mail address removed) (Aahz) wrote:
Looks fine to me. In some situations you may also use hasattr(el,
'__iter__') instead of isinstance(el, list) (it depends if you want to
flatten generic iterables or only lists).
Of course, once you do that, you need to special-case strings...
Strings are iterable but have no __iter__ method, which is fine in
this context, since I would say 99.9% of times one wants to treat them
as atomic objects, so no need to special case.

Don't worry, that little oddity was fixed for you:

Python 3.0+ (unknown, Dec =A08 2008, 14:26:15)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
str.__iter__
<slot wrapper '__iter__' of 'str' objects
bytes.__iter__
<slot wrapper '__iter__' of 'bytes' objects
bytearray.__iter__
<slot wrapper '__iter__' of 'bytearray' objects>

I'm in the "why do you need more than 1 depth?" camp. Dispatching
based on your own type should be given an extra look. Dispatching
based passed in types should be given three extra looks.

I didn't realize itertools.chain(*iterable) worked. I guess that
needs to be pushed as the canonical form.

What about this (from the Recipes section of the itertools manual)?

def flatten(listOfLists):
return list(chain.from_iterable(listOfLists))

Python 2.6.1 (r261:67515, Jan 7 2009, 17:09:13)
[GCC 4.3.2] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> from itertools import chain
>>> list(chain.from_iterable([1, 2, [3, 4]]))
Traceback (most recent call last):
File said:
>>> list(chain(*[1, 2, [3, 4]]))
Traceback (most recent call last):
File said:
>>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

--RDM
 
R

Rhamphoryncus

Quoth Mensanator said:
def flatten(listOfLists):
    return list(chain.from_iterable(listOfLists))

    Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13)
    [GCC 4.3.2] on linux2
    Type "help", "copyright", "credits" or "license" for more information.
    >>> from itertools import chain
    >>> list(chain.from_iterable([1, 2, [3, 4]]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
    >>> list(chain(*[1, 2, [3, 4]]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
    >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
    ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

What usecase do you have for such inconsistently structured data?

If I'm building a tree I use my own type for the nodes, keeping them
purely internal, so I can always use isinstance without worrying about
getting something inconvenient passed in.
 
R

rdmurray

Rhamphoryncus said:
Quoth Mensanator said:
def flatten(listOfLists):
=A0 =A0 return list(chain.from_iterable(listOfLists))

=A0 =A0 Python 2.6.1 (r261:67515, Jan =A07 2009, 17:09:13)
=A0 =A0 [GCC 4.3.2] on linux2
=A0 =A0 Type "help", "copyright", "credits" or "license" for more informa= tion.
=A0 =A0 >>> from itertools import chain
=A0 =A0 >>> list(chain.from_iterable([1, 2, [3, 4]]))
=A0 =A0 Traceback (most recent call last):
=A0 =A0 =A0 File "<stdin>", line 1, in <module>
=A0 =A0 TypeError: 'int' object is not iterable
=A0 =A0 >>> list(chain(*[1, 2, [3, 4]]))
=A0 =A0 Traceback (most recent call last):
=A0 =A0 =A0 File "<stdin>", line 1, in <module>
=A0 =A0 TypeError: 'int' object is not iterable
=A0 =A0 >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
=A0 =A0 ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

What usecase do you have for such inconsistently structured data?

If I'm building a tree I use my own type for the nodes, keeping them
purely internal, so I can always use isinstance without worrying about
getting something inconvenient passed in.

I don't have any use cases myself, I'm just pointing out that this
doesn't answer the concerns of the OP, who presumably does.

--RDM
 
G

Guest

Quoth Mensanator said:
def flatten(listOfLists):
    return list(chain.from_iterable(listOfLists))

    Python 2.6.1 (r261:67515, Jan  7 2009, 17:09:13)
    [GCC 4.3.2] on linux2
    Type "help", "copyright", "credits" or "license" for more
information. >>> from itertools import chain
    >>> list(chain.from_iterable([1, 2, [3, 4]]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
    >>> list(chain(*[1, 2, [3, 4]]))
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: 'int' object is not iterable
    >>> list(chain.from_iterable(['abcd', 'efg', [3, 4]]))
    ['a', 'b', 'c', 'd', 'e', 'f', 'g', 3, 4]

What usecase do you have for such inconsistently structured data?

I have a similar use case in pyspread, which is a Python spreadsheet
that employs numpy object arrays. Since the Python objects in the numpy
arrays are derived from user input, they can be anything, including
nested lists as well as strings, etc.

Since I consider my work-around that treats strings as a special case a
rather ugly hack, I would welcome a robust, generic approach to the
OP's problem.

Martin
 
R

Rhamphoryncus

I have a similar use case in pyspread, which is a Python spreadsheet
that employs numpy object arrays. Since the Python objects in the numpy
arrays are derived from user input, they can be anything, including
nested lists as well as strings, etc.

Since I consider my work-around that treats strings as a special case a
rather ugly hack, I would welcome a robust, generic approach to the
OP's problem.

Can you explain this in a little more detail?
 
G

Guest

Can you explain this in a little more detail?

In the application, there is one main numpy array of type "O".
Each element of the array corresponds to one cell in a grid. The user
may enter a Python expression into the grid cell. The input is
evaled and the result is stored in the numpy array (the actual process
is a bit more complicated). Therefore, the object inside a numpy array
element may be an inconsistent, nested, iterable type.

The user now may access the result grid via __getitem__. When doing
this, a numpy array that is as flat as possible while comprising the
maximum possible data depth is returned, i.e.:

1. Non-string and non-unicode iterables of similar length for each of
the cells form extra dimensions.

2. In order to remove different container types, the result is
flattened, cast into a numpy.array and re-shaped.

3. Dimensions of length 1 are eliminated.

Therefore, the user can conveniently use numpy ufuncs on the results.

I am referring to the flatten operation in step 2
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,769
Messages
2,569,580
Members
45,055
Latest member
SlimSparkKetoACVReview

Latest Threads

Top