# How do I iterate over items in a dict grouped by N number of elements?

Discussion in 'Python' started by Noah, Mar 14, 2008.

1. ### Noah (Guest)

What is the fastest way to select N items at a time from a dictionary?
I'm iterating over a dictionary of many thousands of items.
I want to operate on only 100 items at a time.
I want to avoid copying items using any sort of slicing.
Does itertools copy items?

This works, but is ugly:

>>> from itertools import *
>>> D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
>>> N = 3
>>> for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):
...     print G
...
(('a', 1), ('c', 3), ('b', 2))
(('e', 5), ('d', 4), ('g', 7))
(('f', 6), ('i', 9), ('h', 8))
(('j', 10), None, None)

I'd prefer the last sequence not return None
elements and instead just return (('j',10)), but this isn't a huge
deal.

This works and is clear, but it makes copies of items:

>>> ii = D.items()
>>> for i in range(0, len(ii), N):
...     print ii[i:i+N]
...
[('a', 1), ('c', 3), ('b', 2)]
[('e', 5), ('d', 4), ('g', 7)]
[('f', 6), ('i', 9), ('h', 8)]
[('j', 10)]

--
Noah

Noah, Mar 14, 2008

2. ### Paul Rubin (Guest)

Re: How do I iterate over items in a dict grouped by N number of elements?

Noah writes:
> What is the fastest way to select N items at a time from a dictionary?
> I'm iterating over a dictionary of many thousands of items.
> I want to operate on only 100 items at a time.
> I want to avoid copying items using any sort of slicing.

I'd do something like (untested):

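import itertools

def groups(seq, n):
    # seq must be an iterator, so each islice() call resumes where the last one stopped
    while True:
        s = list(itertools.islice(seq, n))
        if not s: return
        yield s

items = d.iteritems()
for g in groups(items, 100):
    operate_on(g)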
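For readers on Python 3, where `iteritems` no longer exists, the same idea can be sketched roughly as below. This is an adaptation rather than the code from the post: `d.items()` stands in for `d.iteritems()`, the explicit `iter()` call is added because `items()` returns a view and not an iterator, and the sample dictionary and group size are made up for illustration. Each group is a new list, but it holds references to the existing keys and values, not copies of them.

```python
import itertools

def groups(iterable, n):
    """Yield lists of up to n consecutive items from iterable."""
    it = iter(iterable)  # one shared iterator, so islice() resumes instead of restarting
    while True:
        chunk = list(itertools.islice(it, n))
        if not chunk:
            return
        yield chunk

# Hypothetical sample data: ten keys, each mapped to a small list.
d = {k: [k] for k in "abcdefghij"}
result = list(groups(d.items(), 3))
sizes = [len(g) for g in result]

# The values inside each group are the very same objects stored in the dict.
first_key, first_value = result[0][0]
same_object = first_value is d[first_key]
```

The last group is simply shorter when the item count is not a multiple of `n`, so no padding values appear.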

> Does itertools copy items?

I don't understand this question.

Paul Rubin, Mar 14, 2008

3. ### Guest

On Mar 13, 6:34 pm, Noah wrote:
> What is the fastest way to select N items at a time from a dictionary?
> I'm iterating over a dictionary of many thousands of items.
> I want to operate on only 100 items at a time.
> I want to avoid copying items using any sort of slicing.
> Does itertools copy items?
>
> This works, but is ugly:
>
> >>> from itertools import *
> >>> D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
> >>> N = 3
> >>> for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):
> ...     print G
> ...
> (('a', 1), ('c', 3), ('b', 2))
> (('e', 5), ('d', 4), ('g', 7))
> (('f', 6), ('i', 9), ('h', 8))
> (('j', 10), None, None)
>
> I'd prefer the last sequence not return None
> elements and instead just return (('j',10)), but this isn't a huge
> deal.
>
> This works and is clear, but it makes copies of items:
>
> >>> ii = D.items()
> >>> for i in range(0, len(ii), N):
> ...     print ii[i:i+N]
> ...
> [('a', 1), ('c', 3), ('b', 2)]
> [('e', 5), ('d', 4), ('g', 7)]
> [('f', 6), ('i', 9), ('h', 8)]
> [('j', 10)]
>

groupby?

import itertools

D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9,
     'j':10}
N = 3

it = itertools.groupby(enumerate(D.items()), lambda t: int(t[0]/N))

for each in it:
    print tuple(t[1] for t in each[1])
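A rough Python 3 rendering of the same `groupby` approach, for comparison. It assumes Python 3.7 or later (so the dict keeps insertion order) and uses floor division `//` in place of `int(t[0]/N)`; the variable names `grouped` and `result` are only illustrative.

```python
import itertools

D = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5,
     'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10}
N = 3

# enumerate() pairs each item with its position; position // N is the group number.
grouped = itertools.groupby(enumerate(D.items()), key=lambda pair: pair[0] // N)

# Each group must be consumed before advancing to the next one,
# which the tuple() call inside the comprehension takes care of.
result = [tuple(item for _, item in members) for _, members in grouped]

for group in result:
    print(group)
```

Here the final group comes out as a one-element tuple, with no `None` padding.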

--
Hope this helps,
Steven

, Mar 14, 2008
4. ### Arnaud Delobelle (Guest)

On Mar 14, 1:34 am, Noah wrote:
> What is the fastest way to select N items at a time from a dictionary?
> I'm iterating over a dictionary of many thousands of items.
> I want to operate on only 100 items at a time.
> I want to avoid copying items using any sort of slicing.
> Does itertools copy items?
>
> This works, but is ugly:
>
> >>> from itertools import *
> >>> D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
> >>> N = 3
> >>> for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):

This solution matches the idiom given in the itertools documentation. The
following is an extract from http://docs.python.org/lib/itertools-functions.html.

Note, the left-to-right evaluation order of the iterables is
guaranteed. This makes possible an idiom for clustering a data series
into n-length groups using "izip(*[iter(s)]*n)". For data that doesn't
fit n-length groups exactly, the last tuple can be pre-padded with
fill values using "izip(*[chain(s, [None]*(n-1))]*n)".
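As a sketch of how this idiom might look in Python 3, where `izip` has become the built-in `zip` and `itertools.zip_longest` handles the padding: the helper name `grouper` and the sentinel trick for dropping the fill values are assumptions made for this example, not part of the quoted documentation.

```python
from itertools import zip_longest

def grouper(iterable, n):
    """Yield tuples of up to n items; the last tuple may be shorter."""
    sentinel = object()          # unique fill value that cannot collide with real data
    args = [iter(iterable)] * n  # n references to the same iterator
    for group in zip_longest(*args, fillvalue=sentinel):
        # Strip the padding so the final group holds only real items.
        yield tuple(x for x in group if x is not sentinel)

D = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5,
     'f': 6, 'g': 7, 'h': 8, 'i': 9, 'j': 10}
result = list(grouper(D.items(), 3))
```

Because all `n` arguments are the same iterator, each output tuple takes `n` consecutive items. Using a private sentinel instead of `None` keeps the filtering safe even if `None` is a legitimate item.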

--
Arnaud

Arnaud Delobelle, Mar 14, 2008