Flattening lists

Rhamphoryncus

In the application, there is one main numpy array of type "O".
Each element of the array corresponds to one cell in a grid. The user
may enter a Python expression into the grid cell. The input is
evaled and the result is stored in the numpy array (the actual process
is a bit more complicated). Therefore, the object inside a numpy array
element may be an inconsistent, nested, iterable type.

The user may now access the result grid via __getitem__. When doing
this, a numpy array that is as flat as possible while comprising the
maximum possible data depth is returned, i.e.:

1. Non-string and non-unicode iterables of similar length for each of
the cells form extra dimensions.

2. In order to remove different container types, the result is
flattened, cast into a numpy.array and re-shaped.

3. Dimensions of length 1 are eliminated.

Therefore, the user can conveniently use numpy ufuncs on the results.

I am referring to the flatten operation in step 2.

I'm afraid I still don't understand it, but I'm only minimally
familiar with numpy, never mind your spreadsheet app.
 
ptn

Quoth J Kenneth King <[email protected]>:


Hello everybody,
Any better solution than this?
def flatten(x):
    res = []
    for el in x:
        if isinstance(el,list):
            res.extend(flatten(el))
        else:
            res.append(el)
    return res
a = [1, 2, 3, [4, 5, 6], [[7, 8], [9, 10]]]
print flatten(a)
[1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
Regards,
mk

That's worth reading.  I'm not sure why I'm finding this fun, but who
cares.  I tried a couple of other functions after reading that article,
and it looks like a generator that scans the nested lists is actually
the winner, and also is in many ways the most elegant implementation.
Of course, as noted in the emails following the above article, the test
data is really inadequate for proper optimization testing ;)

-----------------------------------------------------------------
from __future__ import print_function
from timeit import Timer
from itertools import chain

# This is the one from the article quoted above.
def flatten6(seq):
    i = 0
    while i != len(seq):
        # Splice the nested sequence at position i into seq itself.
        while hasattr(seq[i], '__iter__'):
            seq[i:i+1] = seq[i]
        i = i + 1
    return seq

#This is my favorite from a readability standpoint out of
#all the things I tried.  It also performs the best.
def flatten8(seq):
    for x in seq:
        if not hasattr(x, '__iter__'): yield x
        else:
            for y in flatten8(x):
                yield y

l = [[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]

if __name__=="__main__":
    print(l)
    print('flatten6', flatten6(l))
    print('flatten8', list(flatten8(l)))
    print('flatten6', Timer("flatten6(l)", "from temp3 import flatten6, l").timeit())
    print('flatten8', Timer("list(flatten8(l))", "from temp3 import flatten8, l").timeit())

-----------------------------------------------------------------
src/python/Python-3.0/python temp3.py

[[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]
flatten6 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten8 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten6 32.8386368752
flatten8 30.7509689331
python temp3.py

[[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4, [9, [10, 5, 7]], 1, [[[[[5, 4], 3], 4, 3], 3, 1, 45], 9], 10]]
flatten6 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten8 [1, 2, 3, 5, 7, 8, 3, 9, 10, 11, 12, 4, 9, 10, 5, 7, 1, 5, 4, 3, 4, 3, 3, 1, 45, 9, 10]
flatten6 34.730714798
flatten8 32.3252940178
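
One caveat on the timings above, offered as a sketch rather than a definitive benchmark: a function that splices sublists into its argument, as flatten6 does, flattens `l` for good on the first timed call. Every later call, and the whole flatten8 run that follows, then sees an already flat list. The snippet below assumes Python 3 and re-declares both ideas under hypothetical names (`flatten_in_place`, `flatten_gen`, `time_fresh`, `NESTED`). It checks for `list` rather than `__iter__`, which is my substitution, and it hands each call a fresh deep copy.

```python
import copy
from timeit import timeit

# Shortened sample data; hypothetical, not the list used in the post.
NESTED = [[1, 2, 3], 5, [7, 8], 3, [9, [10, 11, 12], 4]]


def flatten_in_place(seq):
    """Splice nested lists into seq itself (same idea as flatten6)."""
    i = 0
    while i < len(seq):
        # The extra bounds check guards against an empty sublist at the end.
        while i < len(seq) and isinstance(seq[i], list):
            seq[i:i + 1] = seq[i]
        i += 1
    return seq


def flatten_gen(seq):
    """Recursive generator (same idea as flatten8); leaves seq untouched."""
    for x in seq:
        if isinstance(x, list):
            yield from flatten_gen(x)
        else:
            yield x


def time_fresh(func, number=1000):
    """Time func on a fresh deep copy per call, so each call sees nested data.

    The cost of the copy is included, but it is the same for both functions.
    """
    return timeit(lambda: list(func(copy.deepcopy(NESTED))), number=number)


if __name__ == "__main__":
    data = copy.deepcopy(NESTED)
    flatten_in_place(data)
    print(data)                       # data itself has been modified
    print(list(flatten_gen(NESTED)))  # NESTED is still nested afterwards
    print("in place :", time_fresh(flatten_in_place))
    print("generator:", time_fresh(flatten_gen))
```

Because the deep copy dominates for small inputs, the two figures only show a relative difference. A larger, more deeply nested input would be needed for anything conclusive.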

--RDM


I think the generator is clearer with a try statement, like in Magnus
Lie Hetland's solution from Beginning Python:

def flatten(nested):
    try:
        # Don't iterate over string-like objs.
        try: nested + ''
        except TypeError: pass
        else: raise TypeError
        for sub in nested:
            for elem in flatten(sub):
                yield elem
    except TypeError:
        # The for doesn't work for single elements.
        yield nested

You can't iterate over string-like objs because iterating over a string
yields one-character strings, and each of those is itself an iterable
string whose first element is the same string again, leading to infinite
recursion.
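
In Python 3 the `nested + ''` test could be replaced by an explicit type check. The following is a minimal sketch under that assumption, not Hetland's code. Treating `bytes` as atomic as well is my own choice.

```python
def flatten(nested):
    """Yield the leaves of an arbitrarily nested iterable, depth first."""
    # Strings are iterable and every character is itself a string,
    # so they are treated as atomic values instead of being descended into.
    if isinstance(nested, (str, bytes)):
        yield nested
        return
    try:
        iterator = iter(nested)
    except TypeError:
        # Not iterable at all: a single element.
        yield nested
        return
    for sub in iterator:
        yield from flatten(sub)


if __name__ == "__main__":
    print(list(flatten([1, "ab", [2, ["cd", (3, 4)]]])))
    # prints [1, 'ab', 2, 'cd', 3, 4]
```

Only the `iter()` call sits inside the `try`, so a TypeError raised while consuming an inner element is not silently turned into a yielded value.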
 
