splitting a list into n groups

Discussion in 'Python' started by Rajarshi Guha, Oct 8, 2003.

  1. Hi,
    is there an efficient (pythonic) way in which I could split a list into,
    say, 5 groups? By split I mean the first x members would be one group,
    the next x members another group, and so on 5 times. (Obviously x =
    length of list / 5.)

    I have done this with a simple for loop and indexes into the list,
    but it doesn't seem very elegant.

    Thanks,
     
    Rajarshi Guha, Oct 8, 2003
    #1

  2. Rajarshi Guha

    Tim Hochberg Guest

    Rajarshi Guha wrote:
    > Hi,
    > is there an efficient (pythonic) way in which I could split a list into
    > say 5 groups? By split I mean the the first x members would be one group,
    > the next x members another group and so on 5 times. (Obviously x =
    > lengthof list/5)
    >
    > I have done this by a simple for loop and using indexes into the list.
    > But it does'nt seemm very elegant


    Depending on what you're doing, this may or may not be appropriate, but
    you might want to look at Numeric (or its eventual successor numarray).
    In this case, you could use reshape to map your list into an n by x
    array, which you can treat like a nested list with respect to indexing.

    >>> import Numeric
    >>> data = range(25)
    >>> split = Numeric.reshape(data, (5,5))
    >>> split
    array([[ 0,  1,  2,  3,  4],
           [ 5,  6,  7,  8,  9],
           [10, 11, 12, 13, 14],
           [15, 16, 17, 18, 19],
           [20, 21, 22, 23, 24]])
    >>> split[0]
    array([0, 1, 2, 3, 4])
    >>> list(split[4])
    [20, 21, 22, 23, 24]


    You can sometimes use Numeric with nonnumeric data, but it tends to be
    quirky and is often not worth the trouble. If you've got numeric
    data, though, try it out.

    -tim
     
    Tim Hochberg, Oct 8, 2003
    #2

  3. Rajarshi Guha

    Eddie Corns Guest

    Rajarshi Guha <> writes:

    >Hi,
    > is there an efficient (pythonic) way in which I could split a list into
    >say 5 groups? By split I mean the the first x members would be one group,
    >the next x members another group and so on 5 times. (Obviously x =
    >lengthof list/5)


    How about:

    ------------------------------------------------------------
    def span (x, n):
        return range(0, x, n)

    def group (l, num):
        n = (len(l)/num) + 1
        return [l[s:s+n] for s in span (len(l), n)]

    # test cases
    l1 = range(100)
    print group (l1, 5)
    print group (l1, 6)
    print group (l1, 4)
    ------------------------------------------------------------

    Even though span could be folded in to make it shorter, abstracting it out
    allows you to name and document what it does (return the start index of each
    SPAN of n items). Also it doesn't worry too much about the last list being
    the same size as the rest. I'm too lazy to check the boundary conditions
    properly (an exercise for the reader). A completely non lazy person would
    possibly define the more primitive group_by_n:

    def group_by_n (l, n):
        return [l[s:s+n] for s in span (len(l), n)]

    and define group_into_num_pieces in terms of that:

    def group_into_num_pieces (l, num):
        return group_by_n (l,(len(l)/num) + 1)

    giving you 3 useful functions for the price of 1, but I'm too lazy for that.
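
    For readers on Python 3, the three functions described above might be
    sketched roughly as follows. This is only an illustration, not the code
    as posted: it swaps the "(len(l)/num) + 1" slice size for ceiling
    division (an assumption on my part, so that a list whose length is an
    exact multiple of num is cut into equal pieces), and the parameter names
    are made up. It still returns fewer than num groups for short lists.

```python
def span(length, n):
    """Start index of each span of n items in a sequence of the given length."""
    return range(0, length, n)


def group_by_n(items, n):
    """Consecutive slices of at most n items each."""
    return [items[s:s + n] for s in span(len(items), n)]


def group_into_num_pieces(items, num):
    """At most num slices; the slice size is ceil(len(items) / num)."""
    # Assumption: ceiling division replaces the original "(len/num) + 1".
    # max(1, ...) keeps the step non-zero for an empty list.
    size = max(1, -(-len(items) // num))
    return group_by_n(items, size)


print(group_by_n(list(range(7)), 3))
print([len(g) for g in group_into_num_pieces(list(range(100)), 5)])
```

    With 6 items and num=5 this gives three groups of two, so the boundary
    cases remain an exercise for the reader.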

    Eddie
     
    Eddie Corns, Oct 8, 2003
    #3
  4. Rajarshi Guha

    Terry Reedy Guest

    "Rajarshi Guha" <> wrote in message
    news:p...
    > Hi,
    > is there an efficient (pythonic) way in which I could split a list into,
    > say, 5 groups? By split I mean the first x members would be one group,
    > the next x members another group, and so on 5 times. (Obviously x =
    > length of list / 5.)
    >
    > I have done this with a simple for loop and indexes into the list,
    > but it doesn't seem very elegant.


    Does it work correctly for all input cases? Is it readable? Is it
    acceptably fast? If so, it is probably 'Pythonic' enough.

    TJR
     
    Terry Reedy, Oct 8, 2003
    #4
  5. Rajarshi Guha

    Paul Rubin Guest

    Rajarshi Guha <> writes:
    > is there an efficient (pythonic) way in which I could split a list into
    > say 5 groups? By split I mean the the first x members would be one group,
    > the next x members another group and so on 5 times. (Obviously x =
    > lengthof list/5)


    groups = [a[i:i+(len(a)//5)] for i in range(5)]
     
    Paul Rubin, Oct 8, 2003
    #5
  6. Rajarshi Guha

    Peter Otten Guest

    Paul Rubin wrote:

    > Rajarshi Guha <> writes:
    >> is there an efficient (pythonic) way in which I could split a list into
    >> say 5 groups? By split I mean the the first x members would be one group,
    >> the next x members another group and so on 5 times. (Obviously x =
    >> lengthof list/5)

    >
    > groups = [a[i:i+(len(a)//5)] for i in range(5)]



    >>> for k in range(10):
    ...     a = range(k)
    ...     print [a[i:i+(len(a)//5)] for i in range(5)]
    ...
    [[], [], [], [], []]
    [[], [], [], [], []]
    [[], [], [], [], []]
    [[], [], [], [], []]
    [[], [], [], [], []]
    [[0], [1], [2], [3], [4]]
    [[0], [1], [2], [3], [4]]
    [[0], [1], [2], [3], [4]]
    [[0], [1], [2], [3], [4]]
    [[0], [1], [2], [3], [4]]
    >>>


    Is that what you expected?

    Peter
     
    Peter Otten, Oct 8, 2003
    #6
  7. Rajarshi Guha

    Peter Otten Guest

    Eddie Corns wrote:

    > Rajarshi Guha <> writes:
    >
    >>Hi,
    >> is there an efficient (pythonic) way in which I could split a list into
    >>say 5 groups? By split I mean the the first x members would be one group,
    >>the next x members another group and so on 5 times. (Obviously x =
    >>lengthof list/5)

    >
    > How about:
    >
    > ------------------------------------------------------------
    > def span (x, n):
    >     return range(0, x, n)
    >
    > def group (l, num):
    >     n = (len(l)/num) + 1
    >     return [l[s:s+n] for s in span (len(l), n)]
    >
    > # test cases
    > l1 = range(100)
    > print group (l1, 5)
    > print group (l1, 6)
    > print group (l1, 4)
    > ------------------------------------------------------------


    More test cases :)

    for k in range(20):
        l1 = range(k)
        lol = group(l1, 5)
        print len(lol), lol

    0 []
    1 [[0]]
    2 [[0], [1]]
    3 [[0], [1], [2]]
    4 [[0], [1], [2], [3]]
    3 [[0, 1], [2, 3], [4]]
    3 [[0, 1], [2, 3], [4, 5]]
    4 [[0, 1], [2, 3], [4, 5], [6]]
    4 [[0, 1], [2, 3], [4, 5], [6, 7]]
    5 [[0, 1], [2, 3], [4, 5], [6, 7], [8]]
    4 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
    4 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]
    4 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
    5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12]]
    5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12, 13]]
    4 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14]]
    4 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
    5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16]]
    5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17]]
    5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18]]

    Peter
     
    Peter Otten, Oct 8, 2003
    #7
  8. Rajarshi Guha

    Paul Rubin Guest

    Peter Otten <> writes:
    > > groups = [a[i:i+(len(a)//5)] for i in range(5)]
    >
    > >>> for k in range(10):
    > ...     a = range(k)
    > ...     print [a[i:i+(len(a)//5)] for i in range(5)]
    > ...
    > [[], [], [], [], []]

    ....
    > Is that what you expected?


    Bah! Nope, got confused and mis-wrote the loop contents (started
    thinking of doing it one way, then another, then got the two mixed up
    while typing). Also, I wasn't concerned about the case where the list
    can't be chopped into equal-length parts. Try this:

    assert len(a) > 0 and len(a) % 5 == 0
    d = len(a) // 5
    groups = [a[d*i:d*(i+1)] for i in range(5)]

    Handling the case where the pieces end up unequal because len(a) isn't
    a multiple of 5 gets a little messy:

    d = (len(a) + 4) // 5
    groups = [a[d*i:d*(i+1)] for i in range(5)]

    when a = range(6) gives

    [[0, 1], [2, 3], [4, 5], [], []]

    It's not clear to me whether that's good or bad: something like

    [[0, 1], [2], [3], [4], [5]]

    is probably better, but that would depend on the application.
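
    One way to get the second, more balanced shape is to let divmod decide
    how many groups take an extra item. The following is a rough Python 3
    sketch under that assumption; the function name split_balanced is
    hypothetical and not something posted in this thread.

```python
def split_balanced(a, n=5):
    """Split a into exactly n contiguous groups whose sizes differ by at most one."""
    d, m = divmod(len(a), n)
    groups = []
    start = 0
    for i in range(n):
        # The first m groups take one extra item to absorb the remainder.
        size = d + (1 if i < m else 0)
        groups.append(a[start:start + size])
        start += size
    return groups


print(split_balanced(list(range(6))))  # [[0, 1], [2], [3], [4], [5]]
```

    Lists shorter than n still yield trailing empty groups, which may or
    may not suit the application.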
     
    Paul Rubin, Oct 8, 2003
    #8
  9. | ....
    | More test cases
    | ....

    Peter ....

    I use Outlook Express with the Quote-Fix add on
    for reading news groups and got quite a chuckle
    when I saw how your test-case code showed up here ....

    http://fastq.com/~sckitching/Python/python_lol.png
    [ 2 KB ]

    --
    Cousin Stanley
    Human Being
    Phoenix, Arizona
     
    Cousin Stanley, Oct 8, 2003
    #9
  10. Rajarshi Guha <> wrote in message news:<>...
    > Hi,
    > is there an efficient (pythonic) way in which I could split a list into
    > say 5 groups? By split I mean the the first x members would be one group,
    > the next x members another group and so on 5 times. (Obviously x =
    > lengthof list/5)
    >
    > I have done this by a simple for loop and using indexes into the list.
    > But it does'nt seemm very elegant
    >
    > Thanks,


    I had the same problem a while back. Here's the function I wrote:

    import sys

    def GroupList(inlist, step=2):
        outlist = []
        ents = len(inlist)
        if ents % step != 0:
            print "In GroupList, the length of list ", inlist, " isn't evenly"
            print "divisible by step %i" % step
            sys.exit(4)
        # start index of each group: 0, step, 2*step, ...
        maininds = filter(lambda x: x % step == 0, range(ents))
        for i in maininds:
            currlist = []
            for j in range(step):
                currlist.append(inlist[i+j])
            outlist.append(currlist)
        return outlist

    As you can see, I made some assumptions (the default step is 2, the
    length of the list must be evenly divisible by the step, and so on) and
    I haven't heavily debugged it, but it might help you out.
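
    For comparison, the same fixed-size grouping can be written more
    compactly with slices. This is a rough Python 3 sketch, not the function
    above: the name group_list is hypothetical, and raising ValueError
    instead of calling sys.exit is a design choice of the sketch, so that
    callers can decide how to handle a bad length.

```python
def group_list(inlist, step=2):
    """Split inlist into consecutive groups of exactly step items."""
    if len(inlist) % step != 0:
        raise ValueError(
            "length %d is not evenly divisible by step %d" % (len(inlist), step)
        )
    # Slice at every multiple of step: 0, step, 2*step, ...
    return [inlist[i:i + step] for i in range(0, len(inlist), step)]


print(group_list(list(range(6)), 3))  # [[0, 1, 2], [3, 4, 5]]
```

    Note that step here is the size of each group, so splitting into 5
    groups means calling it with step = len(inlist) // 5.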
     
    Corey Coughlin, Oct 9, 2003
    #10
  11. Rajarshi Guha

    Eddie Corns Guest

    Peter Otten <> writes:

    >Eddie Corns wrote:


    >> Rajarshi Guha <> writes:
    >>
    >>>Hi,
    >>> is there an efficient (pythonic) way in which I could split a list into
    >>>say 5 groups? By split I mean the the first x members would be one group,
    >>>the next x members another group and so on 5 times. (Obviously x =
    >>>lengthof list/5)

    >>
    >> How about:
    >>
    >> ------------------------------------------------------------
    >> def span (x, n):
    >>     return range(0, x, n)
    >>
    >> def group (l, num):
    >>     n = (len(l)/num) + 1
    >>     return [l[s:s+n] for s in span (len(l), n)]
    >>
    >> # test cases
    >> l1 = range(100)
    >> print group (l1, 5)
    >> print group (l1, 6)
    >> print group (l1, 4)
    >> ------------------------------------------------------------


    >More test cases :)


    >for k in range(20):
    >    l1 = range(k)
    >    lol = group(l1, 5)
    >    print len(lol), lol


    >0 []
    >1 [[0]]
    >2 [[0], [1]]
    >3 [[0], [1], [2]]
    >4 [[0], [1], [2], [3]]
    >3 [[0, 1], [2, 3], [4]]
    >3 [[0, 1], [2, 3], [4, 5]]
    >4 [[0, 1], [2, 3], [4, 5], [6]]
    >4 [[0, 1], [2, 3], [4, 5], [6, 7]]
    >5 [[0, 1], [2, 3], [4, 5], [6, 7], [8]]
    >4 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9]]
    >4 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]
    >4 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11]]
    >5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12]]
    >5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12, 13]]
    >4 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14]]
    >4 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
    >5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16]]
    >5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17]]
    >5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18]]


    Criticism first thing in the morning! Luckily I got out of bed on the right
    side this morning.

    def span (x, n):
        return range(0, x, n)

    def group (l, num):
        n = ((len(l))/num) + 1
        res = [l[s:s+n] for s in span (len(l), n)]
        return res + [None]*(num-len(res))

    # test cases
    for k in range(1,20):
        l1 = range(k)
        lol = group(l1, 5)
        print len(lol), lol

    5 [[0], None, None, None, None]
    5 [[0], [1], None, None, None]
    5 [[0], [1], [2], None, None]
    5 [[0], [1], [2], [3], None]
    5 [[0, 1], [2, 3], [4], None, None]
    5 [[0, 1], [2, 3], [4, 5], None, None]
    5 [[0, 1], [2, 3], [4, 5], [6], None]
    5 [[0, 1], [2, 3], [4, 5], [6, 7], None]
    5 [[0, 1], [2, 3], [4, 5], [6, 7], [8]]
    5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9], None]
    5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10], None]
    5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], None]
    5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12]]
    5 [[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10, 11], [12, 13]]
    5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14], None]
    5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], None]
    5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16]]
    5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17]]
    5 [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18]]


    With new improved boundary condition checking as well. The OP can put in
    checks for 0 if it makes any sense. Short lists will need more sophisticated
    handling to avoid results like:

    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14], None]

    We shouldn't expect too much from 3 lines of code, not even in Python.

    Eddie
     
    Eddie Corns, Oct 9, 2003
    #11
  12. Rajarshi Guha wrote:

    > Hi,
    > is there an efficient (pythonic) way in which I could split a list into
    > say 5 groups? By split I mean the the first x members would be one group,
    > the next x members another group and so on 5 times. (Obviously x =
    > lengthof list/5)
    >
    > I have done this by a simple for loop and using indexes into the list.
    > But it does'nt seemm very elegant


    from itertools import islice

    def split_into_n_groups(alist, n=5):
        it = iter(alist)
        x = len(alist)/n # note: will just drop the last len(alist) % n items
        for i in range(n):
            yield list(islice(it, x))

    print list(split_into_n_groups(range(23)))

    emits:

    [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19]]

    Of course, if you always want to return a list of lists (not a general
    sequence -- iterator -- with list items) you can simplify this, e.g.:

    def split_into_n_groups(alist, n=5):
        it = iter(alist)
        x = len(alist)/n # note: will just drop the last len(alist) % n items
        return [ list(islice(it, x)) for i in range(n) ]

    and just "print split_into_n_groups(range(23))".
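
    A note for anyone running this under Python 3: there "/" produces a
    float, and islice rejects a non-integer stop value, so the division has
    to become "//". A minimal sketch of the list-returning variant, assuming
    Python 3 and otherwise unchanged from the version above:

```python
from itertools import islice


def split_into_n_groups(alist, n=5):
    """Return n lists of len(alist) // n items each, in order."""
    it = iter(alist)
    x = len(alist) // n  # note: still drops the last len(alist) % n items
    # Each islice call resumes where the previous one stopped,
    # because all of them share the single iterator `it`.
    return [list(islice(it, x)) for _ in range(n)]


print(split_into_n_groups(range(23)))
```

    This prints the same five groups of four as shown above, with items 20
    through 22 dropped.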


    Alex
     
    Alex Martelli, Oct 9, 2003
    #12
  13. Rajarshi Guha

    Peter Otten Guest

    Eddie Corns wrote:

    > Criticism first thing in the morning! Luckily I got out of bed on the
    > right side this morning.


    The image link posted by Cousin Stanley should have cheered you up :)

    > We shouldn't expect too much from 3 lines of code, not even in Python.


    Guess why I didn't post a correction...

    Peter
     
    Peter Otten, Oct 9, 2003
    #13
  14. Rajarshi Guha

    Peter Otten Guest

    Alex Martelli wrote:

    > Rajarshi Guha wrote:
    >
    >> Hi,
    >> is there an efficient (pythonic) way in which I could split a list into
    >> say 5 groups? By split I mean the the first x members would be one group,
    >> the next x members another group and so on 5 times. (Obviously x =
    >> lengthof list/5)
    >>
    >> I have done this by a simple for loop and using indexes into the list.
    >> But it does'nt seemm very elegant

    >
    > from itertools import islice
    >
    > def split_into_n_groups(alist, n=5):
    >     it = iter(alist)
    >     x = len(alist)/n # note: will just drop the last len(alist) % n items
    >     for i in range(n):
    >         yield list(islice(it, x))
    >
    > print list(split_into_n_groups(range(23)))
    >
    > emits:
    >
    > [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15], [16, 17, 18, 19]]
    >
    > Of course, if you always want to return a list of lists (not a general
    > sequence -- iterator -- with list items) you can simplify this, e.g.:
    >
    > def split_into_n_groups(alist, n=5):
    >     it = iter(alist)
    >     x = len(alist)/n # note: will just drop the last len(alist) % n items
    >     return [ list(islice(it, x)) for i in range(n) ]
    >
    > and just "print split_into_n_groups(range(23))".


    Now I know why I couldn't come up with a solution. I didn't look into the
    itertools module :)
    Anyway, now that you've solved the main problem, here's how I would spread
    the dropped items over the groups:

    from itertools import islice

    def split_into_n_groups(alist, n=5, noEmptyGroups=True):
        it = iter(alist)
        d, m = divmod(len(alist), n)
        if d == 0 and noEmptyGroups:
            n = m
        for i in range(n):
            yield list(islice(it, d+(i<m)))

    N = 5
    for k in range(23):
        lol = list(split_into_n_groups(range(k), N))
        cnt = reduce(lambda x, y: x + len(y), lol, 0)
        assert cnt == k
        assert len(lol) == min(k, N)
        print lol

    Peter
     
    Peter Otten, Oct 9, 2003
    #14