bug with itertools.groupby?

Kitlbast

Hi there,

the code below on Python 2.5.2:

from itertools import groupby

info_list = [
    {'profile': 'http://somesite.com/profile1', 'account': 61L},
    {'profile': 'http://somesite.com/profile2', 'account': 64L},
    {'profile': 'http://somesite.com/profile3', 'account': 61L},
]

grouped_by_account = groupby(info_list, lambda x: x['account'])
for acc, iter_info_items in grouped_by_account:
    print 'grouped acc: ', acc

gives output:

grouped acc: 61
grouped acc: 64
grouped acc: 61

am I doing something wrong?
 
Diez B. Roggisch

Kitlbast said:
am I doing something wrong?

http://docs.python.org/library/itertools.html#itertools.groupby

"""
Generally, the iterable needs to already be sorted on the same key function.
"""
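
A minimal sketch of what that sentence means in practice, using the data from the question: groupby() only merges adjacent items with equal keys, so sorting on the same key first puts both account 61 entries next to each other. The name get_account is an illustrative helper, and the 61L long literals are written as plain ints here so the snippet also runs on Python 3.

```python
from itertools import groupby
from operator import itemgetter

info_list = [
    {'profile': 'http://somesite.com/profile1', 'account': 61},
    {'profile': 'http://somesite.com/profile2', 'account': 64},
    {'profile': 'http://somesite.com/profile3', 'account': 61},
]

# Illustrative helper: the same key function is used for sorting and grouping.
get_account = itemgetter('account')

# Sort first so that items with equal keys become adjacent.
sorted_info = sorted(info_list, key=get_account)

# Materialise each group right away; the group iterators share the
# underlying iterator and are invalidated when groupby() advances.
grouped = [(acc, list(items))
           for acc, items in groupby(sorted_info, key=get_account)]

for acc, items in grouped:
    print('grouped acc: %s, items: %d' % (acc, len(items)))
```

With the input sorted this way, each account should show up once. Sorting costs O(n log n), which is worth keeping in mind when comparing against dict-based grouping.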

Diez
 
Kitlbast

Thanks guys!

I missed the sorting requirement when reading the docs.. :(

However, I just wrote a simple "groupby":

def groupby(_list, key_func):
    res = {}
    for i in _list:
        k = key_func(i)
        if k not in res:
            res[k] = [i]
        else:
            res[k].append(i)
    return res

and it works 3 times faster than itertools.groupby for my example (for
the tests I increased the number of profiles)
 
Raymond Hettinger

Try another variant of groupby() that doesn't require the data to be
sorted:

http://code.activestate.com/recipes/259173/


Raymond
 
Kitlbast

I've checked a few options for groupby() implementations:

1. def groupby(_list, key_func):
       res = {}
       for i in _list:
           k = key_func(i)
           if k not in res:
               res[k] = [i]
           else:
               res[k].append(i)
       return res


2. def groupby(_list, key_func):
       res = {}
       [res.setdefault(key_func(i), []).append(i) for i in _list]
       return res


The second one, with setdefault, works a little bit slower than (1), although
it uses a list comprehension
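
One way to reproduce that comparison is a small timeit harness like the sketch below. The function names, the generated test data and the repeat count are all assumptions made for illustration, not the setup used for the numbers above, and the timings will differ between machines and Python versions. Two things may explain the gap: setdefault() is handed a fresh empty list on every call, even when the key already exists, and the list comprehension in (2) builds a throwaway list of None values, so this sketch uses a plain loop instead.

```python
import timeit


def groupby_if_else(items, key_func):
    # Option 1: explicit membership test.
    res = {}
    for i in items:
        k = key_func(i)
        if k not in res:
            res[k] = [i]
        else:
            res[k].append(i)
    return res


def groupby_setdefault(items, key_func):
    # Option 2, written as a plain loop rather than a list comprehension.
    res = {}
    for i in items:
        res.setdefault(key_func(i), []).append(i)
    return res


# Hypothetical test data: 10000 profiles spread over 50 accounts.
data = [{'profile': 'http://somesite.com/profile%d' % n, 'account': n % 50}
        for n in range(10000)]
key = lambda x: x['account']

for func in (groupby_if_else, groupby_setdefault):
    elapsed = timeit.timeit(lambda: func(data, key), number=20)
    print('%s: %.4f s' % (func.__name__, elapsed))
```

Both functions return the same dict of lists, so any difference measured is purely in the bookkeeping.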
 
Dave Angel

Or option 3: use defaultdict
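
A minimal sketch of what option 3 could look like, assuming the same (items, key_func) signature as options 1 and 2. The name groupby_defaultdict is illustrative only. collections.defaultdict is available from Python 2.5 onwards; the long literals are written as plain ints so the snippet also runs on Python 3.

```python
from collections import defaultdict


def groupby_defaultdict(items, key_func):
    # defaultdict(list) creates the empty list on first access to a key,
    # so no membership test or setdefault() call is needed.
    res = defaultdict(list)
    for i in items:
        res[key_func(i)].append(i)
    return res


info_list = [
    {'profile': 'http://somesite.com/profile1', 'account': 61},
    {'profile': 'http://somesite.com/profile2', 'account': 64},
    {'profile': 'http://somesite.com/profile3', 'account': 61},
]

by_account = groupby_defaultdict(info_list, lambda x: x['account'])
for acc in sorted(by_account):
    print('grouped acc: %s, items: %d' % (acc, len(by_account[acc])))
```

Note that the result is a defaultdict, so looking up a missing account silently creates an empty list; wrap it in dict() before returning if that is not wanted.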
 
