How do I iterate over items in a dict grouped by N number of elements?

Discussion in 'Python' started by Noah, Mar 14, 2008.

  1. Noah

    Noah Guest

    What is the fastest way to select N items at a time from a dictionary?
    I'm iterating over a dictionary of many thousands of items.
    I want to operate on only 100 items at a time.
    I want to avoid copying items using any sort of slicing.
    Does itertools copy items?

    This works, but is ugly:

    >>> from itertools import *
    >>> D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
    >>> N = 3
    >>> for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):
    ...     print G
    ...
    (('a', 1), ('c', 3), ('b', 2))
    (('e', 5), ('d', 4), ('g', 7))
    (('f', 6), ('i', 9), ('h', 8))
    (('j', 10), None, None)

    I'd prefer the last sequence not return None
    elements and instead just return (('j',10)), but this isn't a huge
    deal.
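One way to get a final group without None padding, sketched here for Python 3 (where izip is the built-in zip and itertools.zip_longest does the padding), is to pad with a private sentinel and strip it afterwards. The helper name grouped and the sentinel _PAD are hypothetical and do not come from the thread.

```python
from itertools import zip_longest

_PAD = object()  # hypothetical sentinel; cannot collide with real data


def grouped(iterable, n):
    """Yield tuples of up to n items; the last tuple may be shorter."""
    # n references to the SAME iterator, so each tuple takes n consecutive items
    args = [iter(iterable)] * n
    for group in zip_longest(*args, fillvalue=_PAD):
        # Drop the sentinel padding (only ever present in the final group)
        yield tuple(x for x in group if x is not _PAD)


D = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7}
groups = list(grouped(D.items(), 3))
for G in groups:
    print(G)
```

With seven items and n = 3, the last tuple printed holds only ('g', 7). Using a sentinel object rather than None means that None values in the data itself survive.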

    This works and is clear, but it makes copies of items:

    >>> ii = D.items()
    >>> for i in range(0, len(ii), N):
    ...     print ii[i:i+N]
    ...
    [('a', 1), ('c', 3), ('b', 2)]
    [('e', 5), ('d', 4), ('g', 7)]
    [('f', 6), ('i', 9), ('h', 8)]
    [('j', 10)]
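On the copying concern: a slice builds a new list, but that list holds references to the same item objects rather than copies of them. A minimal Python 3 check follows; D.items() is a view in Python 3, so it is wrapped in list() first, and the list values are an assumption chosen only to make object identity easy to see.

```python
D = {'a': [1], 'b': [2], 'c': [3], 'd': [4]}
N = 3

ii = list(D.items())     # one list of (key, value) tuples
chunk = ii[0:N]          # a new list object of length N

# The slice is a different list...
is_new_list = chunk is not ii
# ...but its elements are the very same tuple objects, not copies
same_items = all(a is b for a, b in zip(chunk, ii))
# and the values inside are the dictionary's own value objects
same_value = chunk[0][1] is D['a']

print(is_new_list, same_items, same_value)
```

So the cost of each slice is a new list of N references, not N copied items.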

    --
    Noah
    Noah, Mar 14, 2008
    #1

  2. Paul Rubin

    Paul Rubin Guest

    Re: How do I iterate over items in a dict grouped by N number of elements?

    Noah <> writes:
    > What is the fastest way to select N items at a time from a dictionary?
    > I'm iterating over a dictionary of many thousands of items.
    > I want to operate on only 100 items at a time.
    > I want to avoid copying items using any sort of slicing.


    I'd do something like (untested):

    import itertools

    def groups(seq, n):
        # seq must be an iterator, so each islice call resumes where the last stopped
        while True:
            s = list(itertools.islice(seq, n))
            if not s: return
            yield s

    items = d.iteritems()
    for g in groups(items, 100):
        operate_on(g)
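For readers on Python 3, here is a sketch of the same idea. iteritems() no longer exists there, so the generator calls iter() itself; operate_on in the post is a placeholder, replaced here by collecting group sizes. The explicit iter() call matters: islice on a plain list or dict view would restart from the beginning on every call.

```python
from itertools import islice


def groups(iterable, n):
    """Yield lists of up to n items from iterable; the last may be shorter."""
    it = iter(iterable)  # ensure successive islice calls resume, not restart
    while True:
        chunk = list(islice(it, n))
        if not chunk:
            return
        yield chunk


d = {i: i * i for i in range(250)}
sizes = [len(g) for g in groups(d.items(), 100)]
print(sizes)  # [100, 100, 50]
```

Only one group of at most n references exists at a time, and the final group is simply shorter instead of padded.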

    > Does itertools copy items?


    I don't understand this question.
    Paul Rubin, Mar 14, 2008
    #2

  3. Noah

    Guest

    On Mar 13, 6:34 pm, Noah <> wrote:
    > What is the fastest way to select N items at a time from a dictionary?
    > I'm iterating over a dictionary of many thousands of items.
    > I want to operate on only 100 items at a time.
    > I want to avoid copying items using any sort of slicing.
    > Does itertools copy items?
    >
    > This works, but is ugly:
    >
    > >>> from itertools import *
    > >>> D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
    > >>> N = 3
    > >>> for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):
    > ...     print G
    > ...
    > (('a', 1), ('c', 3), ('b', 2))
    > (('e', 5), ('d', 4), ('g', 7))
    > (('f', 6), ('i', 9), ('h', 8))
    > (('j', 10), None, None)
    >
    > I'd prefer the last sequence not return None
    > elements and instead just return (('j',10)), but this isn't a huge
    > deal.
    >
    > This works and is clear, but it makes copies of items:
    >
    > >>> ii = D.items()
    > >>> for i in range(0, len(ii), N):
    > ...     print ii[i:i+N]
    > ...
    > [('a', 1), ('c', 3), ('b', 2)]
    > [('e', 5), ('d', 4), ('g', 7)]
    > [('f', 6), ('i', 9), ('h', 8)]
    > [('j', 10)]
    >



    groupby?

    import itertools

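    D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
    N = 3

    # Number each item, then group by index // N so every N consecutive items share a key
    it = itertools.groupby(enumerate(D.items()), lambda t: t[0] // N)

    for each in it:
        # each is (group_key, iterator of (index, item)); keep only the items
        print tuple(t[1] for t in each[1])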
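A Python 3 rendering of the groupby approach, as a sketch: print becomes a function and floor division (//) is written explicitly, since / is true division in Python 3. The name result is introduced here only to collect the groups.

```python
from itertools import groupby

D = {'a': 1, 'b': 2, 'c': 3, 'd': 4, 'e': 5, 'f': 6, 'g': 7}
N = 3

result = []
# enumerate pairs each item with its position; position // N is the group number
for _, members in groupby(enumerate(D.items()), key=lambda t: t[0] // N):
    # members yields (position, item) pairs; keep only the items
    result.append(tuple(item for _, item in members))

print(result)
```

The last group comes out short rather than padded, and groupby works lazily over the enumerated items.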

    --
    Hope this helps,
    Steven
    , Mar 14, 2008
    #3
  4. Arnaud Delobelle

    On Mar 14, 1:34 am, Noah <> wrote:
    > What is the fastest way to select N items at a time from a dictionary?
    > I'm iterating over a dictionary of many thousands of items.
    > I want to operate on only 100 items at a time.
    > I want to avoid copying items using any sort of slicing.
    > Does itertools copy items?
    >
    > This works, but is ugly:
    >
    > >>> from itertools import *
    > >>> D = {'a':1, 'b':2, 'c':3, 'd':4, 'e':5, 'f':6, 'g':7, 'h':8, 'i':9, 'j':10}
    > >>> N = 3
    > >>> for G in izip(*[chain(D.items(), repeat(None, N-1))]*N):


    This is essentially the idiom proposed in the itertools documentation.
    The following is an extract from http://docs.python.org/lib/itertools-functions.html:

    Note, the left-to-right evaluation order of the iterables is
    guaranteed. This makes possible an idiom for clustering a data series
    into n-length groups using "izip(*[iter(s)]*n)". For data that doesn't
    fit n-length groups exactly, the last tuple can be pre-padded with
    fill values using "izip(*[chain(s, [None]*(n-1))]*n)".
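The two idioms from that extract can be tried in Python 3, where the built-in zip is lazy and replaces izip. This is a sketch with a made-up string s as the data series.

```python
from itertools import chain

s = 'abcdefg'
n = 3

# n references to one iterator: zip pulls n consecutive items per tuple.
# Plain zip stops at the shortest input, so the leftover 'g' is dropped.
exact = list(zip(*[iter(s)] * n))

# Appending n-1 fill values guarantees the final partial group is completed.
padded = list(zip(*[chain(s, [None] * (n - 1))] * n))

print(exact)
print(padded)
```

The first form silently discards a trailing partial group; the second keeps it, padded with None. When the length is an exact multiple of n, the extra fill values never complete a tuple and are dropped.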

    --
    Arnaud
    Arnaud Delobelle, Mar 14, 2008
    #4