opposite of zip()?

igor.tatarinov · Dec 15, 2007

Given a bunch of arrays, if I want to create tuples, there is
zip(arrays). What if I want to do the opposite: break a tuple up and
append the values to given arrays:
    map(append, arrays, tupl)
except there is no unbound append() (List.append() does not exist,
right?).

Without append(), I am forced to write a (slow) explicit loop:
    for (a, v) in zip(arrays, tupl):
        a.append(v)

I assume using an index variable instead wouldn't be much faster.

Is there a better solution?

Thanks,
igor

Paddy · Dec 15, 2007

I can't quite get what you require from your explanation. Do you have
sample input & output?

Maybe this will help:
http://paddy3118.blogspot.com/2007/02/unzip-un-needed-in-python.html

- Paddy.

Gary Herron · Dec 15, 2007

> except there is no unbound append() (List.append() does not exist,
> right?).

But it *does* exist, and it's named list.append, and it works as you wanted.

>>> list.append
<method 'append' of 'list' objects>
>>> a = [[],[]]
>>> map(list.append, a, (1,2))
[None, None]
>>> a
[[1], [2]]
>>> map(list.append, a, (3,4))
[None, None]
>>> a
[[1, 3], [2, 4]]
>>> map(list.append, a, (30,40))
[None, None]
>>> a
[[1, 3, 30], [2, 4, 40]]

Gary Herron
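
A note for anyone trying this on Python 3 (an assumption on my part; the sessions in this thread are Python 2): map() there returns a lazy iterator, so map(list.append, ...) appends nothing until the iterator is consumed. A minimal sketch, where draining with a zero-length deque is just one possible way to consume it:

```python
from collections import deque

arrays = [[], []]

# In Python 3, map() is lazy: building the map object calls nothing yet.
pending = map(list.append, arrays, (1, 2))
assert arrays == [[], []]

# Consuming the iterator performs the appends; deque(maxlen=0) discards
# the None results without building a list.
deque(pending, maxlen=0)
assert arrays == [[1], [2]]

# The explicit loop from the original question needs no such step.
for a, v in zip(arrays, (3, 4)):
    a.append(v)
```

Because the appends are side effects, the explicit loop is probably the clearer choice on Python 3.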

Steven D'Aprano · Dec 15, 2007

> Given a bunch of arrays, if I want to create tuples, there is
> zip(arrays). What if I want to do the opposite: break a tuple up and
> append the values to given arrays:
>     map(append, arrays, tupl)
> except there is no unbound append() (List.append() does not exist,
> right?).

Don't guess, test.

>>> list.append
<method 'append' of 'list' objects>

Apparently it does. Here's how *not* to use it to do what you want:

>>> arrays = [[1, 2, 3, 4], [101, 102, 103, 104]]
>>> tupl = tuple("ab")
>>> map(lambda alist, x: alist.append(x), arrays, tupl)
[None, None]
>>> arrays
[[1, 2, 3, 4, 'a'], [101, 102, 103, 104, 'b']]

It works, but is confusing and hard to understand, and the lambda
probably makes it slow. Don't do it that way.

> Without append(), I am forced to write a (slow) explicit loop:
>     for (a, v) in zip(arrays, tupl):
>         a.append(v)

Are you sure it's slow? Compared to what?

For the record, here's the explicit loop:

>>> arrays = [[1, 2, 3, 4], [101, 102, 103, 104]]
>>> tupl = tuple("ab")
>>> zip(arrays, tupl)
[([1, 2, 3, 4], 'a'), ([101, 102, 103, 104], 'b')]
>>> for (a, v) in zip(arrays, tupl):
...     a.append(v)
...
>>> arrays
[[1, 2, 3, 4, 'a'], [101, 102, 103, 104, 'b']]

I think you're making it too complicated. Why use zip()?

>>> arrays = [[1, 2, 3, 4], [101, 102, 103, 104]]
>>> tupl = tuple("ab")
>>> for i, alist in enumerate(arrays):
...     alist.append(tupl[i])
...
>>> arrays
[[1, 2, 3, 4, 'a'], [101, 102, 103, 104, 'b']]

Steven D'Aprano · Dec 15, 2007

> Here's how *not* to use it to do what you want:
>
> >>> arrays = [[1, 2, 3, 4], [101, 102, 103, 104]]
> >>> tupl = tuple("ab")
> >>> map(lambda alist, x: alist.append(x), arrays, tupl)
> [None, None]
> >>> arrays
> [[1, 2, 3, 4, 'a'], [101, 102, 103, 104, 'b']]
>
> It works, but is confusing and hard to understand, and the lambda
> probably makes it slow. Don't do it that way.

As Gary Herron points out, you don't need to use lambda:

map(list.append, arrays, tupl)

will work. I still maintain that this is the wrong way to do it: taking
the lambda out makes the map() based solution marginally faster than the
explicit loop, but I don't believe that the gain in speed is worth the
loss in readability.

(e.g. on my PC, for an array of 900000 sub-lists, the map() version takes
0.4 second versus 0.5 second for the explicit loop. For smaller arrays,
the results are similar.)
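
For anyone who wants to repeat this kind of comparison, here is a rough sketch of a timing harness. It is written for Python 3 (an assumption; the figures above come from Python 2), the helper names append_loop, append_map and best_time are made up for illustration, and no particular result is claimed. Note that each measurement also includes the cost of building the fresh sub-lists.

```python
import timeit
from collections import deque


def append_loop(arrays, tupl):
    # The explicit loop from the thread.
    for a, v in zip(arrays, tupl):
        a.append(v)


def append_map(arrays, tupl):
    # map(list.append, ...) from the thread. Python 3's map() is lazy,
    # so it is drained here with a zero-length deque.
    deque(map(list.append, arrays, tupl), maxlen=0)


def best_time(func, n=100_000, repeat=3):
    # Build fresh sub-lists for every run so each call does equal work.
    def run():
        func([[] for _ in range(n)], range(n))
    return min(timeit.repeat(run, number=1, repeat=repeat))


if __name__ == "__main__":
    print("loop:", best_time(append_loop))
    print("map: ", best_time(append_map))
```

Taking the minimum of several repeats is the usual way to reduce noise from other processes.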

igor.tatarinov · Dec 15, 2007

Hi folks,

Thanks for all the help. I tried running the various options, and
here is what I found:

from array import array
from time import time

def f1(recs, cols):
    # index-based append
    for r in recs:
        for i,v in enumerate(r):
            cols[i].append(v)

def f2(recs, cols):
    # explicit loop over zip()
    for r in recs:
        for v,c in zip(r, cols):
            c.append(v)

def f3(recs, cols):
    # unbound list.append via map()
    for r in recs:
        map(list.append, cols, r)

def f4(recs):
    # transpose everything in a single call
    return zip(*recs)

records = [ tuple(range(10)) for i in xrange(1000000) ]

columns = tuple([] for i in xrange(10))
t = time()
f1(records, columns)
print 'f1: ', time()-t

columns = tuple([] for i in xrange(10))
t = time()
f2(records, columns)
print 'f2: ', time()-t

columns = tuple([] for i in xrange(10))
t = time()
f3(records, columns)
print 'f3: ', time()-t

t = time()
columns = f4(records)
print 'f4: ', time()-t

f1: 5.10132408142
f2: 5.06787180901
f3: 4.04700708389
f4: 19.13633203506

So there is some benefit in using map(list.append). f4 is very clever
and cool but it doesn't seem to scale.

Incidentally, it took me a while to figure out why the following
initialization doesn't work:
    columns = ([],)*10
apparently you end up with 10 copies of the same list.

Finally, in my case the output columns are integer arrays (to save
memory). I can still use array.append but it's a little slower so the
difference between f1-f3 gets even smaller. f4 is not an option with
arrays.
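
For the integer-array case, a minimal sketch in Python 3 syntax (the code above is Python 2). The function name append_records and the sample data are made up for illustration; the point is only that the explicit loop works unchanged, because array.array objects have an append() method just as lists do.

```python
from array import array


def append_records(records, columns):
    # Append each field of each record to the matching column.
    for rec in records:
        for col, value in zip(columns, rec):
            col.append(value)


# Hypothetical sample data: three records of four integers each.
records = [(1, 2, 3, 4), (5, 6, 7, 8), (9, 10, 11, 12)]

# One independent typed array per field; 'l' stores signed C longs.
columns = [array("l") for _ in range(4)]
append_records(records, columns)
```

The map() variant would need array.append in place of list.append here, since an unbound method only accepts instances of its own type.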

Gary Herron · Dec 15, 2007

> Incidentally, it took me a while to figure out why the following
> initialization doesn't work:
>     columns = ([],)*10
> apparently you end up with 10 copies of the same list.

Yes. A well known gotcha in Python and a FAQ.
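
To see why, a small sketch (written for Python 3, though tuple repetition behaves the same way in Python 2; the names shared and independent are made up here). Repetition copies references to the one list rather than creating new lists:

```python
# Tuple repetition copies references, not the lists themselves.
shared = ([],) * 3
shared[0].append("x")
assert shared == (["x"], ["x"], ["x"])
assert shared[0] is shared[1] is shared[2]

# A generator expression evaluates [] once per element instead,
# so each slot gets its own list.
independent = tuple([] for _ in range(3))
independent[0].append("x")
assert independent == (["x"], [], [])
```

This is why the benchmark above builds its columns with tuple([] for i in xrange(10)).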

rasmus · Dec 15, 2007

If you want another answer: the opposite of zip(*lists) is
zip(*list_of_tuples).

That is, for a list of equal-length tuples:
    lists == zip(*zip(*lists))

I don't know about its speed though compared to the other suggestions.
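
A short sketch of that round trip, written for Python 3, where zip() returns an iterator and so needs list() around it (the variable names are made up for illustration):

```python
columns = [(1, 2, 3), (4, 5, 6)]

# Zipping the columns together yields the rows.
rows = list(zip(*columns))
assert rows == [(1, 4), (2, 5), (3, 6)]

# Zipping the rows together yields the columns again.
restored = list(zip(*rows))
assert restored == columns

# Caveat: zip() stops at the shortest input, so ragged data
# does not survive the round trip.
ragged = [(1, 2, 3), (4, 5)]
assert list(zip(*zip(*ragged))) == [(1, 2), (4, 5)]
```

The result is always a sequence of tuples, so the equality only holds when the input was made of tuples to begin with.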

Matt

greg · Dec 15, 2007

>     map(append, arrays, tupl)
> except there is no unbound append() (List.append() does not exist,
> right?).

Er, no, but list.append does:

>>> list.append
<method 'append' of 'list' objects>

so you should be able to do

map(list.append, arrays, tupl)

provided you know that all the elements of 'arrays' are
actual lists.

Rich Harkins · Dec 17, 2007

> Given a bunch of arrays, if I want to create tuples, there is
> zip(arrays). What if I want to do the opposite: break a tuple up and
> append the values to given arrays:
>     map(append, arrays, tupl)
> except there is no unbound append() (List.append() does not exist,
> right?).

list.append does exist (try the lower-case flavor).

> Without append(), I am forced to write a (slow) explicit loop:
>     for (a, v) in zip(arrays, tupl):
>         a.append(v)

Except that isn't technically the opposite of zip. The opposite would
be a tuple of single-dimensional tuples:

def unzip(zipped):
    """
    Given a sequence of same-sized sequences, produce a tuple of tuples
    that represent each index within the zipped object.

    Example:

    >>> zipped = zip((1, 2, 3), (4, 5, 6))
    >>> zipped
    [(1, 4), (2, 5), (3, 6)]
    >>> unzip(zipped)
    ((1, 2, 3), (4, 5, 6))
    """
    if len(zipped) < 1:
        raise ValueError('At least one item is required for unzip.')
    indices = range(len(zipped[0]))
    return tuple(tuple(pair[index] for pair in zipped)
                 for index in indices)

This is probably not the most efficient hunk of code for this but this
would seem to be the correct behavior for the opposite of zip and it
should scale well.

Modifying the above with list.extend would produce a variant closer to
what I think you're asking for:

def unzip_extend(dests, zipped):
    """
    Appends the unzipped versions of zipped into dests. This avoids an
    unnecessary allocation.

    Example:

    >>> zipped = zip((1, 2, 3), (4, 5, 6))
    >>> zipped
    [(1, 4), (2, 5), (3, 6)]
    >>> dests = [[], []]
    >>> unzip_extend(dests, zipped)
    >>> dests
    [[1, 2, 3], [4, 5, 6]]
    """
    if len(zipped) < 1:
        raise ValueError('At least one item is required for unzip.')
    for index in range(len(zipped[0])):
        dests[index].extend(pair[index] for pair in zipped)

This should perform pretty well, as extend with a generator expression
is pretty fast. Not that it's truly meaningful, but here's timeit on my
2GHz laptop:

bash-3.1$ python -m timeit -s 'import unzip; zipped=zip(range(1024),
range(1024))' 'unzip.unzip_extend([[], []], zipped)'
1000 loops, best of 3: 510 usec per loop

By comparison, here's the unzip() version above:

bash-3.1$ python -m timeit -s 'import unzip; zipped=zip(range(1024),
range(1024))' 'unzip.unzip(zipped)'
1000 loops, best of 3: 504 usec per loop

Rich

Matt Nordhoff · Dec 17, 2007

As Paddy wrote, zip is its own unzip:

>>> zipped = zip((1, 2, 3), (4, 5, 6))
>>> zipped
[(1, 4), (2, 5), (3, 6)]
>>> unzipped = zip(*zipped)
>>> unzipped
[(1, 2, 3), (4, 5, 6)]

Neat and completely confusing, huh?

<http://paddy3118.blogspot.com/2007/02/unzip-un-needed-in-python.html>

Rich Harkins · Dec 17, 2007

Matt Nordhoff wrote:
> [snip]
>
> As Paddy wrote, zip is its own unzip:
>
> >>> zipped = zip((1, 2, 3), (4, 5, 6))
> >>> zipped
> [(1, 4), (2, 5), (3, 6)]
> >>> unzipped = zip(*zipped)
> >>> unzipped
> [(1, 2, 3), (4, 5, 6)]
>
> Neat and completely confusing, huh?
>
> <http://paddy3118.blogspot.com/2007/02/unzip-un-needed-in-python.html>

I hadn't thought about zip() being symmetrical like that. Very cool...

Rich
