convert list of tuples into several lists

Oliver Eichler

Diez said:
zip(*[(1,4),(2,5),(3,6)])
Thanks :) I knew it must be simple. The asterisk thing was new to me.

By the way: What is faster?

this:

z = [(1,4),(2,5),(3,6)]
a,b = zip(*[(x[0], x[0]-x[1]) for x in z])

or:

a = []
b = []
for x in z:
    a.append(x[0])
    b.append(x[0]-x[1])

I guess the first, isn't it?


Oliver
 
Pierre Barbier de Reuille

Oliver Eichler wrote:
Diez B. Roggisch wrote:

zip(*[(1,4),(2,5),(3,6)])

Thanks :) I knew it must be simple. The asterisk thing was new to me.

By the way: What is faster?

this:

z = [(1,4),(2,5),(3,6)]
a,b = zip(*[(x[0], x[0]-x[1]) for x in z])

or:

a = []
b = []
for x in z:
    a.append(x[0])
    b.append(x[0]-x[1])

I guess the first, isn't it?


Oliver

The best answer is: try it :)
Use the "timeit" module (in the standard library) to do so ...
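
For readers on Python 3, here is a minimal sketch of how such a comparison could be set up with timeit. The sample data and the function names with_zip and with_loop are made up for illustration, and the timings will depend on the machine and interpreter version, so no numbers are claimed here.

```python
import timeit

# Hypothetical sample data: the three pairs from the question, repeated.
z = [(1, 4), (2, 5), (3, 6)] * 100

def with_zip():
    # Build the pairs first, then transpose them with zip(*...).
    a, b = zip(*[(x[0], x[0] - x[1]) for x in z])
    return a, b

def with_loop():
    # Append to two separate lists in a plain for loop.
    a, b = [], []
    for x in z:
        a.append(x[0])
        b.append(x[0] - x[1])
    return a, b

if __name__ == "__main__":
    passes = 1000
    for func in (with_zip, with_loop):
        seconds = timeit.timeit(func, number=passes)
        print("%s: %.2f usec/pass" % (func.__name__, 1e6 * seconds / passes))
```

Note that the zip version yields tuples while the loop yields lists, so the two are only equivalent if either container type is acceptable.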

Pierre
 
Francis Girard

On Wednesday 9 February 2005 14:46, Diez B. Roggisch wrote:
zip(*[(1,4),(2,5),(3,6)])

That's incredibly clever! I would never have thought to use "zip" to do this!
I would only have thought to use it for the opposite, i.e. to pair up the
sequences 1, 2, 3 and 4, 5, 6 into
[(1, 4), (2, 5), (3, 6)]

Notice though that the solution doesn't yield the exact opposite, as we end up
with a list of tuples instead of a list of lists:
[(1, 2, 3), (4, 5, 6)]

But this can easily be fixed if lists are really wanted:
map(list, zip(*[(1,4),(2,5),(3,6)]))
[[1, 2, 3], [4, 5, 6]]


Anyway, I find Diez's solution brilliant! I'm always amazed to see how skilled a
programmer can get when the time comes to find a short and sweet solution.

One can admire that zip(*zip(*a_list_of_tuples)) == a_list_of_tuples
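
A small sketch of that round trip, written for Python 3 rather than the Python 2 used in this thread. The variable names are illustrative. In Python 3, zip() returns a lazy iterator, so list() is needed before comparing.

```python
pairs = [(1, 4), (2, 5), (3, 6)]

# First transpose: rows become columns.
columns = list(zip(*pairs))

# Second transpose: columns become rows again.
round_trip = list(zip(*columns))

print(columns)
print(round_trip == pairs)
```

The equality relies on the input being a list of tuples of equal length; with a list of lists, the result would contain tuples and would not compare equal.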

Thank you
You gave me much to enjoy

Francis Girard
 
Peter Hansen

Steven said:
Diez said:
zip(*[(1,4),(2,5),(3,6)])

While this is also the approach I would use, it is worth noting that
Guido thinks of this as an abuse of the argument passing machinery:

http://mail.python.org/pipermail/python-dev/2003-July/037346.html

I'm not sure that's the same thread I already read where he
dissed zip like that, but what I'm wondering is what is the
alternative? Is there an equally elegant approach that
doesn't "abuse" the argument passing machinery?

-Peter
 
Steven Bethard

Peter said:
Steven said:
Diez said:
zip(*[(1,4),(2,5),(3,6)])

While this is also the approach I would use, it is worth noting that
Guido thinks of this as an abuse of the argument passing machinery:

http://mail.python.org/pipermail/python-dev/2003-July/037346.html

I'm not sure that's the same thread I already read where he
dissed zip like that, but what I'm wondering is what is the
alternative? Is there an equally elegant approach that
doesn't "abuse" the argument passing machinery?

I know I found it in another thread before. I think he's said it a few
times.

Personally, I wish that, if we're not to use zip like this, Python
would provide a builtin 'unzip' to do the corresponding thing.


If you're interested in a recipe that does this, you can look at my
'starzip' recipe which is basically 'unzip' for iterators:

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302325

I guess if we wanted a standard solution to the 'unzip' problem, I could
rally for this to be added to itertools, but I'm not sure I'm up to the
task of rewriting it in C...
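
To give an idea of what an 'unzip' for iterators might look like, here is a rough Python 3 sketch. It is not the code from the recipe linked above; the name unzip_iter and its width parameter are assumptions made for illustration.

```python
from itertools import tee
from operator import itemgetter

def unzip_iter(rows, width):
    """Split an iterable of `width`-tuples into `width` lazy column iterators."""
    # tee() gives each column its own independent view of the input.
    copies = tee(rows, width)
    # itemgetter(i) is created eagerly here, so each column keeps its own index.
    return tuple(map(itemgetter(i), it) for i, it in enumerate(copies))

firsts, seconds = unzip_iter(iter([(1, 4), (2, 5), (3, 6)]), 2)
print(list(firsts), list(seconds))
```

One trade-off of this approach: tee() buffers items that one column has consumed and another has not, so reading one column to the end before touching the others still holds the whole input in memory.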

Steve
 
Steven Bethard

Cappy2112 said:
What does the leading * do?

Tells Python to use the following iterable as the (remainder of the)
argument list:

py> def f(x, y):
...     print x, y
...
py> f([1, 2])
Traceback (most recent call last):
  File "<interactive input>", line 1, in ?
TypeError: f() takes exactly 2 arguments (1 given)
py> f(*[1, 2])
1 2
py> f(1, [2])
1 [2]
py> f(1, *[2])
1 2

Note that whenever the leading * is present, the following list gets
expanded into the positional arguments of f -- x and y.
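
The same behaviour can be sketched as a Python 3 script (the session above is Python 2). Here f returns its arguments instead of printing them, which is a change made only to keep the example easy to check.

```python
def f(x, y):
    return (x, y)

args = [1, 2]

print(f(*args))      # the list is spread over x and y
print(f(1, *[2]))    # one positional argument, then the remainder from the list
print(f(1, [2]))     # no star: the list itself is bound to y

try:
    f(args)          # no star and only one argument: y is missing
except TypeError as exc:
    print("TypeError:", exc)
```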

Steve
 
Oliver Eichler

Pierre said:
Best answer is : try it :)
use the "timeit" module (in the standard lib) to do so ...

Ok,

import timeit

s = """\
a,b,c1,c2 = zip(*[(x[2],x[4], x[2]-x[1], x[2] - x[3]) for x in z])
"""

t = timeit.Timer(stmt=s,setup="z = [(1,2,3,4,5)]*1000")
print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)

s ="""\
for x in z:
    a = x[2]
    b = x[4]
    c1 = x[2] - x[1]
    c2 = x[2] - x[3]
"""

t = timeit.Timer(stmt=s,setup="z = [(1,2,3,4,5)]*1000")
print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)

for 30 elements:
oeichler@frog:~/tmp> python test.py
32.90 usec/pass
21.53 usec/pass

for 100 elements:
oeichler@frog:~/tmp> python test.py
103.32 usec/pass
70.91 usec/pass

for 1000 elements:
oeichler@frog:~/tmp> python test.py
1435.43 usec/pass
710.55 usec/pass

What do we learn? It might look elegant, but it is slow. I guess this is mainly
because I do the iteration twice with the zip command. The 1000 element run
seems to show what Guido means by "abuse of the argument passing
machinery".

Learned my lesson :)

Thanks to all

Oliver
 
Oliver Eichler

Pierre said:
Best answer is : try it :)
use the "timeit" module (in the standard lib) to do so ...

Ok, (a second time. I hope the first post was cancelled as it was false)

import timeit

s = """\
a,b,c1,c2 = zip(*[(x[2],x[4], x[2]-x[1], x[2] - x[3]) for x in z])
"""

t = timeit.Timer(stmt=s,setup="z = [(1,2,3,4,5)]*1000")
print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)

s ="""\
a = []
b = []
c1 = []
c2 = []
for x in z:
    a.append(x[2])
    b.append(x[4])
    c1.append(x[2] - x[1])
    c2.append(x[2] - x[3])
"""

t = timeit.Timer(stmt=s,setup="z = [(1,2,3,4,5)]*1000")
print "%.2f usec/pass" % (1000000 * t.timeit(number=100000)/100000)

for 100 elements:
oeichler@frog:~/tmp> python test.py
104.67 usec/pass
180.19 usec/pass

for 1000 elements:
oeichler@frog:~/tmp> python test.py
1432.06 usec/pass
1768.58 usec/pass


What do we learn? The zip-thingy is even faster than the for loop.

Learned my lesson :)

Thanks to all

Oliver
 
Nick Craig-Wood

Cappy2112 said:
What does the leading * do?

It causes the list/tuple following the * to be unpacked into function
arguments. E.g.
zip(*[(1, 2, 3), (4, 5, 6)])
[(1, 4), (2, 5), (3, 6)]

is the same as
zip((1, 2, 3), (4, 5, 6))
[(1, 4), (2, 5), (3, 6)]

The * should make you think of dereferencing things (e.g. pointer
dereference in C).

It's equivalent to the now deprecated apply function, which does the
same thing in a more wordy fashion, i.e. it applies the list as parameters to
the function.
apply(zip, [(1, 2, 3), (4, 5, 6)])
[(1, 4), (2, 5), (3, 6)]
 
Nick Coghlan

Steven said:
Peter said:
Steven said:
Diez B. Roggisch wrote:

zip(*[(1,4),(2,5),(3,6)])


While this is also the approach I would use, it is worth noting that
Guido thinks of this as an abuse of the argument passing machinery:

http://mail.python.org/pipermail/python-dev/2003-July/037346.html


I'm not sure that's the same thread I already read where he
dissed zip like that, but what I'm wondering is what is the
alternative? Is there an equally elegant approach that
doesn't "abuse" the argument passing machinery?


I know I found it in another thread before. I think he's said it a few
times.

Personally, I wish that, if we're not to use zip like this, Python
would provide a builtin 'unzip' to do the corresponding thing.

I never really got the impression that Guido was particularly *strongly* opposed
to this use of the extended call syntax. Merely that he was concerned that it
would break down if the relevant list turned out to be large (that is, the abuse
is using *args with a list when the list may turn out to be large, not a problem
specifically with using the star syntax with zip()).

Maybe he's been more explicit somewhere, and I just never saw it.

Anyway, his concern seems justified, as my understanding is that func(*iterable)
is roughly equivalent to func(*tuple(iterable)), which can be rather expensive
when the iterable is a long list of tuples.

So zip(*zip(*iterable)) is actually zip(*tuple(zip(*tuple(iterable)))). That's
potentially an awful lot of data copying for an identity operation :)
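
The observable part of this can be sketched in Python 3: whatever iterable follows the star, the callee receives a freshly built tuple, and a lazy source is consumed completely before the call runs. The names show_args and rows are made up for the example.

```python
def show_args(*args):
    # Inside the function, the star-arguments always arrive as a tuple.
    return args

def rows():
    # A generator standing in for a large iterable of tuples.
    for i in range(3):
        yield (i, i * i)

source = rows()
received = show_args(*source)

print(type(received).__name__)   # tuple
print(received)
print(list(source))              # empty: the generator was fully consumed
```

This only shows that a full tuple is materialised; how costly that is for a given input size would have to be measured.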

Anyway, I think it does make a decent case for an itertools.iunzip or some such
beast.

Cheers,
Nick.
 
Peter Hansen

Nick said:
I never really got the impression that Guido was particularly *strongly*
opposed to this use of the extended call syntax. Merely that he was
concerned that it would break down if the relevant list turned out to be
large (that is, the abuse is using *args with a list when the list may
turn out to be large, not a problem specifically with using the star
syntax with zip()).

Is there some unexpected limit to the number of arguments that may be
passed with the *args format (say, "256 function arguments maximum are
supported by Python"), or is this concern just because of the raw
memory inherently used by the tuple?

In other words, if one is confident that one can whip tuples of the
required size around without using up available memory, would there
still be such a concern about the *args "abuse"?

-Peter
 
Pierre Quentel

Steven Bethard wrote:
Tells Python to use the following iterable as the (remainder of the)
argument list:

Could someone explain why this doesn't work:

Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> def f(*args, **kw):
...     print args, kw
...
>>> f(*[1,2])
(1, 2) {}
>>> f(*[1,2],x=1)
  File "<stdin>", line 1
    f(*[1,2],x=1)
             ^
SyntaxError: invalid syntax

Pierre
 
Stephen Thorne

Steven Bethard wrote:
Tells Python to use the following iterable as the (remainder of the)
argument list:

Could someone explain why this doesn't work:

Python 2.3.2 (#49, Oct 2 2003, 20:02:00) [MSC v.1200 32 bit (Intel)] on
win32
Type "help", "copyright", "credits" or "license" for more information.
>>> def f(*args, **kw):
...     print args, kw
...
>>> f(*[1,2])
(1, 2) {}
>>> f(*[1,2],x=1)
  File "<stdin>", line 1
    f(*[1,2],x=1)
             ^
SyntaxError: invalid syntax

The * and ** must occur at the end.

f(x=1, *[1,2]) is valid.
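
The SyntaxError quoted above comes from Python 2.3. As a point of comparison, the following sketch shows that Python 3 accepts a keyword argument on either side of the star-argument. Here f returns its arguments instead of printing them, purely so the result is easy to inspect.

```python
def f(*args, **kw):
    return args, kw

# Keyword argument written before the star-argument (the form suggested above).
print(f(x=1, *[1, 2]))

# Keyword argument written after the star-argument; also accepted by Python 3.
print(f(*[1, 2], x=1))
```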

Stephen.
 
Nick Coghlan

Peter said:
Is there some unexpected limit to the number of arguments that may be
passed with the *args format (say, "256 function arguments maximum are
supported by Python"), or is this concern just because of the raw
memory inherently used by the tuple?

In other words, if one is confident that one can whip tuples of the
required size around without using up available memory, would there
still be such a concern about the *args "abuse"?

I'm not aware of any arbitrary limits in that code, since it does pass real
tuple objects around. Then again, it's not an area of the code I'm particularly
familiar with. . .

However, if there were a limit other than the amount of available memory,
I expect Guido would have said so explicitly. As it is, the concern seems to be
that there is a potentially large copy operation triggered by an
innocent-looking function call.

Cheers,
Nick.
 
