flatten a list of list

Terry · Aug 16, 2009

Hi,

Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
new_list.extend(l)

or,

new_list=reduce(lambda x,y:x.extend(y), list_of_list)

br, Terry

Chris Rebert · Aug 16, 2009

Hi,

Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
Â Â new_list.extend(l)

or,

new_list=reduce(lambda x,y:x.extend(y), list_of_list)

#only marginally better:
from operator import add
new_list = reduce(add, list_of_list)

#from the itertools recipes:
from itertools import chain
def flatten(listOfLists):
return list(chain.from_iterable(listOfLists))

Cheers,
Chris

Michael Fötsch · Aug 16, 2009

Terry said:
Is there a simple way (the pythonic way) to flatten a list of list?

This is probably the shortest it can get:

sum(list_of_lists, [])

Kind Regards,
M.F.

Steven D'Aprano · Aug 16, 2009

Hi,

Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
Â Â new_list.extend(l)

or,

new_list=reduce(lambda x,y:x.extend(y), list_of_list)

Click to expand...

#only marginally better:
from operator import add
new_list = reduce(add, list_of_list)

Surely that's going to be O(N**2)?

I'd predict that would perform even worse than O(N**2) string
concatenation, on account that a string of size N has to allocate space
for N bytes, but a list of size N allocates space for N pointers (each 4
bytes, or 8 depending on the system), rounded up to the nearest power of
two. Also CPython can optimize string concatenation to O(N) under some
circumstances, but not lists.

.... L = [ ([None]*5000) for x in xrange(%d) ]
.... from operator import add
.... """

Timer("reduce(add, L)", setup % 4).repeat(number=1000) [0.53808808326721191, 0.51404905319213867, 0.51157188415527344]
Timer("reduce(add, L)", setup % 8).repeat(number=1000) [2.1178171634674072, 3.8830602169036865, 4.72245192527771]
Timer("reduce(add, L)", setup % 16).repeat(number=1000)

Click to expand...

Click to expand...

[17.190337896347046, 21.572744131088257, 21.584265947341919]

Looks like O(N**2) behaviour to me.

Chris Rebert · Aug 16, 2009

Hi,

Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
Â Â new_list.extend(l)

or,

new_list=reduce(lambda x,y:x.extend(y), list_of_list)

Click to expand...

#only marginally better:
from operator import add
new_list = reduce(add, list_of_list)

Click to expand...

Surely that's going to be O(N**2)?

The OP asked for "simple", not "best", "most proper", or "fastest". My
comment was intended to mean that the code was marginally *simpler*,
not faster.

Cheers,
Chris

Steven D'Aprano · Aug 16, 2009

Terry said:
Terry said:

Is there a simple way (the pythonic way) to flatten a list of list?

Click to expand...

This is probably the shortest it can get:

sum(list_of_lists, [])

That's also O(N**2).

from timeit import Timer
setup = "L = [ ([None]*5000) for x in xrange(%d) ]"
Timer("sum(L, [])", setup % 4).repeat(number=1000) [0.6070549488067627, 0.54354715347290039, 0.54686999320983887]
Timer("sum(L, [])", setup % 8).repeat(number=1000) [2.1285719871520996, 3.6722278594970703, 4.0785009860992432]
Timer("sum(L, [])", setup % 16).repeat(number=1000)

Click to expand...

Click to expand...

[18.370341062545776, 20.40509295463562, 21.871708869934082]

Terry · Aug 16, 2009

Hi,
Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:
new_list=[]
for l in list_of_list:
new_list.extend(l)
or,
new_list=reduce(lambda x,y:x.extend(y), list_of_list)
#only marginally better:
from operator import add
new_list = reduce(add, list_of_list)

Click to expand...

Click to expand...

Surely that's going to be O(N**2)?

Click to expand...

The OP asked for "simple", not "best", "most proper", or "fastest". My
comment was intended to mean that the code was marginally *simpler*,
not faster.

Cheers,
Chris
--http://blog.rebertia.com

Well, if possible, I'd like not only to know a simple solution, but
also the 'best', the 'most proper' and the 'fastest'

If they are not the same.

br, Terry

Steven D'Aprano · Aug 16, 2009

Hi,

Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
new_list.extend(l)

I don't think that scales terribly well. In my testing, it performs about
as badly as the O(N**2) behaviours others have suggested (using sum or
reduce with add) -- perhaps a bit worse for small N, but not quite as
badly for large N.

new_list=reduce(lambda x,y:x.extend(y), list_of_list)

That doesn't even work.

list_of_list = [ [1,2,3], [2, 4, 8] ]
new_list=reduce(lambda x,y:x.extend(y), list_of_list)
new_list is None

Click to expand...

Click to expand...

True

Chris' suggestion using itertools seems pretty good:
.... L = [ [None]*5000 for _ in xrange(%d) ]
.... from itertools import chain
.... """

Timer("list(chain.from_iterable(L))", setup % 4).repeat(number=1000) [0.61839914321899414, 0.61799716949462891, 0.62065696716308594]
Timer("list(chain.from_iterable(L))", setup % 8).repeat(number=1000) [1.2618398666381836, 1.3385050296783447, 3.9113419055938721]
Timer("list(chain.from_iterable(L))", setup % 16).repeat(number=1000)

Click to expand...

Click to expand...

[3.1349358558654785, 4.8554730415344238, 5.4319999217987061]

Steven D'Aprano · Aug 16, 2009

The OP asked for "simple", not "best", "most proper", or "fastest". My
comment was intended to mean that the code was marginally *simpler*, not
faster.

Fair enough, but he also asked for Pythonic, and while some people might
argue that "terrible performance" is Pythonic, I hope you wouldn't be one
of them!

If it soothes your ruffled sense of honour *wink*, I think your solution
with itertools.chain is probably the best so far.

Chris Rebert · Aug 16, 2009

Fair enough, but he also asked for Pythonic, and while some people might
argue that "terrible performance" is Pythonic, I hope you wouldn't be one
of them!

Indeed not.

I expected it would be worse performance-wise than the
OP's original due to all the intermediate lists that get produced; it
shouldn't be used in optimized production code.

If it soothes your ruffled sense of honour *wink*, I think your solution
with itertools.chain is probably the best so far.

Except it's not really my solution, it's whoever put it in the
itertools docs's.

But I'm glad to been able to help by pointing the recipe out.

Cheers,
Chris

Bearophile · Aug 16, 2009

Chris Rebert:

The OP asked for "simple", not "best", "most proper", or "fastest". My
comment was intended to mean that the code was marginally *simpler*,
not faster.

Yep, the OP has asked for simple code. But often this is not the right
way to solve this situation. A better way is to create (or copy) a
flatten that's efficient and well tested & debugged, put it into some
library of utilities, and then use import&use it when it's needed.

Bye,
bearophile

Francesco Bochicchio · Aug 16, 2009

On Aug 16, 1:25 pm, Steven D'Aprano <st...@REMOVE-THIS-
cybersource.com.au> wrote:

....

Chris' suggestion using itertools seems pretty good:

... L = [ [None]*5000 for _ in xrange(%d) ]
... from itertools import chain
... """>>> Timer("list(chain.from_iterable(L))", setup % 4).repeat(number=1000)

[0.61839914321899414, 0.61799716949462891, 0.62065696716308594]>>> Timer("list(chain.from_iterable(L))", setup % 8).repeat(number=1000)

[1.2618398666381836, 1.3385050296783447, 3.9113419055938721]>>> Timer("list(chain.from_iterable(L))", setup % 16).repeat(number=1000)

[3.1349358558654785, 4.8554730415344238, 5.4319999217987061]

I had a peek at itertools ( which is a C module btw) and realized that
chain solves the problem by creating a chain object,
which is a sort of generator. Both creating the chain object and
converting the chain object to a list seem to be O(N), so
the whole is O(N) too ...

Then I tried this pure python version:

# ----- CODE
from timeit import Timer
setup = """\\
L = [ [None]*5000 for _ in range(%d) ]
def pychain( list_of_list ):
for l in list_of_list:
for elem in l:
yield elem
"""

print( Timer("list(pychain(L))", setup % 4).repeat(number=1000))
print( Timer("list(pychain(L))", setup % 8).repeat(number=1000))
print( Timer("list(pychain(L))", setup % 16).repeat(number=1000))
# ----- END CODE

and got times that seem to confirm it :

[2.818755865097046, 2.7880589962005615, 2.79232120513916]
[5.588631868362427, 5.588244915008545, 5.587780952453613]
[11.620548009872437, 11.39465618133545, 11.40834903717041]

For reference, here are the times of the itertools.chain solution on
my laptop:

[0.6518809795379639, 0.6491332054138184, 0.6483590602874756]
[1.3188841342926025, 1.3173959255218506, 1.3207998275756836]
[2.7200729846954346, 2.5402050018310547, 2.543621063232422]

All this with Python 3.1 compiled from source on Xubuntu 8.10.

Ciao

Alan G Isaac · Aug 16, 2009

Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
new_list.extend(l)

new_list = list(xi for lst in list_of_list for xi in lst)

hth,
Alan Isaac

Paul Rubin · Aug 16, 2009

Terry said:
Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
new_list.extend(l)

from itertools import chain
new_list = list(chain(list_of_list))

Tim Cook · Aug 17, 2009

Hi,

Is there a simple way (the pythonic way) to flatten a list of list?
rather than my current solution:

new_list=[]
for l in list_of_list:
new_list.extend(l)

or,

new_list=reduce(lambda x,y:x.extend(y), list_of_list)

br, Terry

Well, This is not simple but it is comprhensive in that it has to do
several things. I am using it to decompose deeply nested lists from
Pyparsing output that may have strings in a variety of languages.
Performance wise I do not know how it stacks up against the other
examples but it works for me.

def flatten(x):
"""flatten(sequence) -> list

Returns a single, flat list which contains all elements retrieved
from the sequence and all recursively contained sub-sequences
(iterables). All strings are converted to unicode.

"""
result = []
for el in x:
#if isinstance(el, (list, tuple)):
if hasattr(el, "__iter__") and not isinstance(el, basestring):
result.extend(flatten(el))
else:
result.append(el)

# all strings must be unicode
rtnlist=[]
for x in result:
if isinstance(x,str):
# replace any brackets so Python doesn't think it's a list
and we still have a seperator.
x=x.replace('[','_')
x=x.replace(']','_')
try:
x=unicode(x, "utf8") # need more decode types here
except UnicodeDecodeError:
x=unicode(x, "latin1")
except UnicodeDecodeError:
x=unicode(x,"iso-8859-1")
except UnicodeDecodeError:
x=unicode(x,"eucJP")

rtnlist.append(x)

return rtnlist

(discord.py) 'async_generator' has no attribute 'flatten'	1	Jul 7, 2023
flattening and rebuilding a simple list of lists	2	Nov 30, 2009
Fast Efficient way to transfer an object to another list	10	May 1, 2010
Removing items from a list simultaneously	1	Apr 21, 2009
Flatten a two-level list --> one liner?	3	Mar 8, 2007
Understanding how is a function evaluated using recursion	10	Sep 26, 2013
VHDL design flatten compilation	7	Jan 14, 2014
flatten a level one list	32	Jan 12, 2006

flatten a list of list

Terry

Chris Rebert

Michael Fötsch

Steven D'Aprano

Chris Rebert

Steven D'Aprano

Terry

Steven D'Aprano

Steven D'Aprano

Chris Rebert

Bearophile

Francesco Bochicchio

Alan G Isaac

Paul Rubin

Tim Cook

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads