How do I iterate over items in a dict grouped by N number of elements?

Noah

What is the fastest way to select N items at a time from a dictionary?
I'm iterating over a dictionary of many thousands of items.
I want to operate on only 100 items at a time.
I want to avoid copying items using any sort of slicing.
Does itertools copy items?

This works, but is ugly:
from itertools import *
D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
N = 3
for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):
    print G
(('a', 1), ('c', 3), ('b', 2))
(('e', 5), ('d', 4), ('g', 7))
(('f', 6), ('i', 9), ('h', 8))
(('j', 10), None, None)

I'd prefer the last group didn't include the None padding and instead
just returned (('j', 10),), but this isn't a huge deal.
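One way to get that behaviour (my own sketch, not from the thread) is to filter the padding out of each group before using it; this assumes None never occurs as a real item, which holds here since every item is a (key, value) tuple:

from itertools import izip, chain, repeat

D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
N = 3

for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):
    # drop the None padding so the final group contains only real items
    G = tuple(item for item in G if item is not None)
    print G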

This works and is clear, but it makes copies of items:
ii = D.items()
for i in range(0, len(ii), N):
    print ii[i:i+N]
[('a', 1), ('c', 3), ('b', 2)]
[('e', 5), ('d', 4), ('g', 7)]
[('f', 6), ('i', 9), ('h', 8)]
[('j', 10)]
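One point worth separating out (my own note, not from the thread): slicing ii copies list references, not the (key, value) tuples themselves, which is easy to verify:

ii = D.items()
chunk = ii[0:N]
print chunk is ii        # False: the slice is a new list object
print chunk[0] is ii[0]  # True: it holds the very same tuple, no per-item copy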
 
Paul Rubin

Noah said:
What is the fastest way to select N items at a time from a dictionary?
I'm iterating over a dictionary of many thousands of items.
I want to operate on only 100 items at a time.
I want to avoid copying items using any sort of slicing.

I'd do something like (untested):

import itertools

def groups(seq, n):
    # lazily pull successive batches of up to n items from the iterator seq
    while True:
        s = list(itertools.islice(seq, n))
        if not s:
            return
        yield s

items = d.iteritems()   # d is the large dictionary being processed
for g in groups(items, 100):
    operate_on(g)       # operate_on stands for whatever work is done per batch
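Run against the small example dictionary from the question (my own quick check, not part of the original reply), the last batch simply comes out shorter instead of being padded with None:

D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
for g in groups(D.iteritems(), 3):
    print g   # three lists of three pairs, then a final one-pair list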
Does itertools copy items?

I don't understand this question.
 
attn.steven.kuo

What is the fastest way to select N items at a time from a dictionary?
I'm iterating over a dictionary of many thousands of items.
I want to operate on only 100 items at a time.
I want to avoid copying items using any sort of slicing.
Does itertools copy items?

This works, but is ugly:
from itertools import *
D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
N = 3
for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):
    print G
(('a', 1), ('c', 3), ('b', 2))
(('e', 5), ('d', 4), ('g', 7))
(('f', 6), ('i', 9), ('h', 8))
(('j', 10), None, None)

I'd prefer the last group didn't include the None padding and instead
just returned (('j', 10),), but this isn't a huge deal.

This works and is clear, but it makes copies of items:

ii = D.items()
for i in range(0, len(ii), N):
    print ii[i:i+N]
[('a', 1), ('c', 3), ('b', 2)]
[('e', 5), ('d', 4), ('g', 7)]
[('f', 6), ('i', 9), ('h', 8)]
[('j', 10)]


groupby?

import itertools

D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9,
'j':10}
N = 3

it = itertools.groupby(enumerate(D.items()), lambda t: int(t[0]/N))

for each in it:
    print tuple(t[1] for t in each[1])
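A slightly expanded version of the same idea (my own sketch, not part of the original post), with the key and group named explicitly:

import itertools

D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
N = 3

# enumerate() numbers the items; index // N is constant within each batch of N
# consecutive items, so groupby() yields one group per batch
for batch_no, group in itertools.groupby(enumerate(D.items()), lambda t: t[0] // N):
    print batch_no, tuple(pair for i, pair in group)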
 
Arnaud Delobelle

What is the fastest way to select N items at a time from a dictionary?
I'm iterating over a dictionary of many thousands of items.
I want to operate on only 100 items at a time.
I want to avoid copying items using any sort of slicing.
Does itertools copy items?

This works, but is ugly:
from itertools import *
D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
N = 3
for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):

This solution matches exactly the idiom proposed in the itertools
documentation. The following is an extract from
http://docs.python.org/lib/itertools-functions.html.

Note, the left-to-right evaluation order of the iterables is
guaranteed. This makes possible an idiom for clustering a data series
into n-length groups using "izip(*[iter(s)]*n)". For data that doesn't
fit n-length groups exactly, the last tuple can be pre-padded with
fill values using "izip(*[chain(s, [None]*(n-1))]*n)".
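A quick illustration of the two documented forms (my own example, on a plain list of numbers rather than the dictionary): the first silently drops items that don't fill a complete group, the second pads the final group with None:

from itertools import izip, chain

s = range(10)
N = 3

print list(izip(*[iter(s)] * N))                     # leftover item 9 is dropped
print list(izip(*[chain(s, [None] * (N - 1))] * N))  # last tuple is (9, None, None)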
 
