Counter for items in lists in lists?

Discussion in 'Python' started by Charlotte Henkle, Sep 25, 2004.

  1. Hello;

    I'm pondering how to count the number of times an item appears in
    total in a nested list. For example:

    myList=[[a,b,c,d],[a,f,g,h],[a,b,x,y]]

    I'd like to know that a appeared three times, and b appeared twice,
    and the rest appeard only once.

    Every solution I've touched on so far has seemed clumsy. Does anyone
    have any thoughts for a down and dirty solution to this?

    Many thanks...

    /Charlotte
     
    Charlotte Henkle, Sep 25, 2004
    #1
    1. Advertising

  2. Charlotte Henkle <charlotte <at> fgm.com> writes:
    > I'm pondering how to count the number of times an item appears in
    > total in a nested list.


    How about this:

    >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']]
    >>> def count(item):

    .... if not isinstance(item, list):
    .... return {item:1}
    .... counts = {}
    .... for i in item:
    .... for key, ct in count(i).items():
    .... counts[key] = counts.get(key, 0) + ct
    .... return counts
    ....
    >>> count(myList)

    {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1}

    Steve
     
    Steven Bethard, Sep 25, 2004
    #2
    1. Advertising

  3. Charlotte Henkle

    Bryan Guest

    Steven Bethard wrote:
    > Charlotte Henkle <charlotte <at> fgm.com> writes:
    >
    >>I'm pondering how to count the number of times an item appears in
    >>total in a nested list.

    >
    >
    > How about this:
    >
    >
    >>>>myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']]
    >>>>def count(item):

    >
    > ... if not isinstance(item, list):
    > ... return {item:1}
    > ... counts = {}
    > ... for i in item:
    > ... for key, ct in count(i).items():
    > ... counts[key] = counts.get(key, 0) + ct
    > ... return counts
    > ...
    >
    >>>>count(myList)

    >
    > {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1}
    >
    > Steve
    >


    or maybe a less general approach might work if the nested list is always one deep:

    >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']]
    >>> tmp = []
    >>> d = {}
    >>> for item in myList: tmp += item
    >>> for key in tmp: d[key] = d.get(key, 0) + 1
    >>> d

    {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1}


    bryan
     
    Bryan, Sep 25, 2004
    #3
  4. Bryan <belred1 <at> yahoo.com> writes:
    > or maybe a less general approach might work if the nested list is always one
    > deep:
    >
    > >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']]
    > >>> tmp = []
    > >>> d = {}
    > >>> for item in myList: tmp += item
    > >>> for key in tmp: d[key] = d.get(key, 0) + 1
    > >>> d

    > {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1}


    Yeah, if that's the case, you don't really even need to construct a temporary
    list:

    >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']]
    >>> counts = {}
    >>> for lst in myList:

    .... for item in lst:
    .... counts[item] = counts.get(item, 0) + 1
    ....
    >>> counts

    {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1}

    Steve
     
    Steven Bethard, Sep 25, 2004
    #4
  5. Charlotte Henkle <> wrote:

    > Hello;
    >
    > I'm pondering how to count the number of times an item appears in
    > total in a nested list. For example:
    >
    > myList=[[a,b,c,d],[a,f,g,h],[a,b,x,y]]
    >
    > I'd like to know that a appeared three times, and b appeared twice,
    > and the rest appeard only once.


    Two problems enmeshed, here: how to count items of various values if you
    have a way to iterate over the values; and how to iterate on a nested
    list. You may code them together, but factoring them as two reusable
    functions is preferable -- you'll end up with more reusable building
    blocks than by coding them as one monolithic function. Python is good
    at allowing such factorings. There may be a third sub-problem, about
    how best to _present_ the counts -- hard to say from your statement
    about the problem, whether that's a problem, but, see later.

    You're not telling us what is the type of the items. If they're strings
    or numbers, basically anything that can be a key to a dictionary, the
    counting problem is best solved by building a dictionary:

    def itemcount(sequence):
    d = {}
    for item in sequence: d[item] = 1 + d.get(item, 0)
    return d

    this kind of function is known as an 'accumulator': given a sequence it
    loops over it computing/building some resulting object and returns it.
    Not a specific Python term, but a nice one to keep in mind.

    You don't clarify the structure of your list -- is it always two levels
    as you present it, could it be more, are all items always going to be at
    the same level or could some level have a mixture of items and sublists?
    Taking the simplest hypothesis -- always two levels and all items at the
    second level, the first one being made up of sublists only:

    def itemseq(nested_list):
    for sublist in nested_list:
    for item in sublist: yield item

    this kind of function is known as a 'simple generator': it produces a
    sequence you may loop on with a for statement, pass to an accumulator,
    etc. This is a specific Python term, and the keyword that lets you tell
    whether a function is a generator is 'yield'.

    So, given these two functions -- or enhanced equivalents if your list or
    its items don't meet these hypotheses -- the dict of counts is easy to
    obtain:

    myCounts = itemcount(itemseq(myList))

    Now for the presentation issue -- you say you want to receive the
    information "that a appeared three times, and b appeared twice,
    and the rest appeard only once". This appears to mean that you want to
    get a sort of items by count, with items appearing only once just
    summarized as a group, not listed individually. If I correctly divined
    your intention, this might help:

    def summarize_counts(counts_dict):
    number_of_singletons = 0
    count_per_item_list = []
    for item, count in counts_dict.iteritems():
    if count == 1: number_of_singletons += 1
    else: count_per_item_list.append((count, item))
    count_per_item_list.sort()
    for count, item in count_per_item_list:
    print 'Item %r appeared %d times' % (item, count)
    print '%d items appeared only once each' % number_of_singletons


    So, overall, if I've read your mind correctly,

    summarize_counts(itemcount(itemseq(myList)))

    should do exactly what you required. However, mindreading is always an
    iffy business (that's part of why the Zen of Python says "explicit is
    better than implicit"...:). So, if you can clarify the various
    ambiguous and uncertain details of your specs, which forced me to guess
    (violating another Zen injunction: "in the face of ambiguity, resist the
    temptation to guess"), over and over, at what exactly you meant, I'm
    confident that the various pieces presented here can be enhanced to meet
    whatever your requirements actually _are_.


    Alex
     
    Alex Martelli, Sep 25, 2004
    #5
  6. If you want to be able to use any iterable, you can use itertools.chain().

    >>> import itertools
    >>> myList=[['a','b','c','d'],['a','f','g','h'],['a','b','x','y']]
    >>> counts = {}
    >>> for item in itertools.chain(*myList):

    .... counts[item] = counts.get(item, 0) + 1
    ....
    >>> counts

    {'a': 3, 'c': 1, 'b': 2, 'd': 1, 'g': 1, 'f': 1, 'h': 1, 'y': 1, 'x': 1}
    --
    Michael Hoffman
     
    Michael Hoffman, Sep 25, 2004
    #6
  7. Ahhhh...thank you all!

    Actually, I wanted to be able to count the items in the list to check
    my thinking on a problem, and as it turns out, my thinking was
    incorrect. So I have a follow up question now ;)

    Some background:

    I was given a problem by a friend. He is in a tennis group that has 6
    members. They play once a week for 10 weeks. They were having
    trouble generating a schedule that was random but fair (IE everyone
    gets to play a base number of times , and the mod is evenly and
    randomly distributed).

    I decided that I wanted to abstract as much of the problem as possible
    so that it would be reusable for other groups (IE to solve it for
    games with N players, Y times over X weeks). And then I decided I
    wanted this to be my second program in python, so forgive my
    clumsiness ;)

    My first pass through was short and simple: Figure out the total
    number of games that will be played, and then use something like this:

    gamePlayers=random.sample(templist, players_per_game)

    to fill up all the game slots. Neat, simple.

    The problem I found with this solution was that it didn't give me a
    weighted list for remainder. The same person could get in all of the
    "extra" games.

    So instead I decided to fill games from my list of players by removal
    until I had a number of players left less than a full game, and then
    have a back-up list. The back up list would act like a queue, but if
    you were already in the game, it would take the next guy. If you got
    into a game from the back-up list, you were sent to the end of the
    line. My 2 lines grew to 50 plus ;)

    Sadly, this isn't quite working either. Now that I can print out the
    players, I see that generally things work, but every once in a while,
    I get an unexpected result. For example, with 6 players, 10 weeks, 2
    games per week, I would expect combinations of:

    {'a': 13, 'c': 14, 'b': 14, 'e': 13, 'd': 13, 'f': 13}

    (4 13s and 2 14s)

    But sometimes I get:

    {'a': 13, 'c': 14, 'b': 14, 'e': 14, 'd': 12, 'f': 13}

    (2 13s, 3 14s and a 12)

    That 12 breaks the planned even distribution. :(

    I suspect the problem comes from the random pulling in the first part,
    but I'm not sure. I also feel that some sections (espcially the
    print) don't have a "python-grace", so I would love some suggestions
    to make them more...slithery? :)

    To make a long story longer, here's the code:

    ....#! /usr/bin/env python
    ....
    ....#A program to randomly fill a tennis schedule
    ....#
    ....#The original theory looked like this:
    ....# gamePlayers=random.sample(templist, players_per_game)
    ....# print gamePlayers
    ....#
    ....#But that didn't give a weighted list for extra games
    ....
    ....import random
    ....
    ....#Eventually these will get set dynamically
    ....number_of_weeks=10
    ....players=['a', 'b', 'c', 'd', 'e', 'f']
    ....games_per_week=2
    ....players_per_game=4
    ....games=number_of_weeks*games_per_week
    ....
    ....#this will be used to pull off "extra game" players
    ....backupList=players[:]
    ....random.shuffle(backupList)
    ....
    ....#a templist so we can modify it.
    ....templist=players[:]
    ....
    ....#our finished product:
    ....finishedList=[]
    ....
    ....while len(finishedList)!=games:
    .... if len(templist)>=players_per_game:
    .... gamePlayers=[]
    .... while len(gamePlayers)!=players_per_game:
    .... randomNumber=random.randint(0, len(templist)-1)
    .... potentialPlayer=templist.pop(randomNumber)
    .... gamePlayers.append(potentialPlayer)
    .... finishedList.append(gamePlayers)
    .... else:
    .... gamePlayers=templist
    .... print "I am the leftover game players", gamePlayers
    .... print "I am the list of backup players", backupList
    .... count=0
    .... while len(gamePlayers)!=players_per_game:
    .... print "I am a potential player"
    .... potentialPlayer=backupList[count]
    .... print potentialPlayer
    .... print "checking to see if I'm in the game"
    .... if potentialPlayer not in gamePlayers:
    .... print "I do not think the player is in the game"
    .... print "I am the back-up list", backupList
    .... potentialPlayer=backupList.pop(count)
    .... gamePlayers.append(potentialPlayer)
    .... backupList.append(potentialPlayer)
    .... print "I am the back-up list after reorder", backupList
    .... print "I am the gamePlayers after test and insertion",
    gamePlayers
    ....
    .... else:
    .... print "I think that player is in the game"
    .... count+=1
    .... finishedList.append(gamePlayers)
    .... templist=players[:]
    ....
    ....#count the list (thank you, Steve!
    ....http://groups.google.com/groups?dq=...lang.python&ie=UTF-8&hl=en&btnG=Google+Search
    ....
    ....def count(item):
    .... if not isinstance(item, list):
    .... return {item:1}
    .... counts = {}
    .... for i in item:
    .... for key, ct in count(i).items():
    .... counts[key] = counts.get(key, 0) + ct
    .... return counts
    ....
    ....def printList(weeks, games, list):
    .... x=0
    .... while x!=weeks:
    .... y=0
    .... print "Week: ", x+1
    .... while y<games:
    .... print "Game ",y+1, " players are ", list[y]
    .... y+=1
    .... x+=1
    ....
    ....#printing out and counting the final list
    ....
    ....printList(number_of_weeks, games_per_week, finishedList)
    ....print finishedList.count("a")
     
    Charlotte Henkle, Sep 25, 2004
    #7
  8. Charlotte Henkle <> wrote:
    ...
    > I was given a problem by a friend. He is in a tennis group that has 6
    > members. They play once a week for 10 weeks. They were having
    > trouble generating a schedule that was random but fair (IE everyone
    > gets to play a base number of times , and the mod is evenly and
    > randomly distributed).


    How many of the 6 members can play on each of those once-weekly
    occasions?

    > My first pass through was short and simple: Figure out the total
    > number of games that will be played, and then use something like this:
    >
    > gamePlayers=random.sample(templist, players_per_game)
    >
    > to fill up all the game slots. Neat, simple.
    >
    > The problem I found with this solution was that it didn't give me a
    > weighted list for remainder. The same person could get in all of the
    > "extra" games.


    Yep, you're being a bit "too random" there -- sampling with repetition.

    Consider random.shuffle: puts the players in arbitrary order but without
    repetition. Just pick whatever number at a time, say from the back --
    doing so from the front would be just as good, but slower, so it might
    as well be from the back. When you're trying to pick more players than
    are left in the shuffled list you're peeling stuff from, make sure those
    get in then reshuffle the original list minus those who are already in,
    for the next time.

    Ah, natural language is SO complicated, let's put it in pseudocode.
    First, you'll need sets, since obviously there are set operations
    involved (e.g., "minus those who are already in" is goofy language for a
    set difference operation). So let's make sure we have sets. In Python
    2.4, they're built-in; in 2.3, they're in module sets of the standard
    library. To ensure you run optimally in either release, star your
    program with:

    import random
    try: set=set
    except NameError: from sets import Set as set

    Now we're ready to write our pseudocode. At each step, i.e. each time
    another weekly game is planned, we'll have: some list X of players who
    haven't played yet in this 'round' in shuffled order; a request for N
    players for tonight; and of course the original set P of players. If N
    is less than the items in X we're in an easy case:
    if N < len(X):
    yield X[-N:]
    del X[-N:]
    i.e. we just yield the last N items of list X then remove those same
    items from further consideration in this round.

    If more players are going to play tonight, than there are items in X,
    it's a little bit more complicated. We need to prepare a shuffled list
    of all players except those who are alreayd in X...:
    else:
    newX = list(P-set(X))
    random.shuffle(newX)
    and then yield the contents of X and the needed N-len(X) items of newX:
    fromNewX = N-len(X)
    yield X + newX[-fromNewX:]
    finally, we need to remove from newX the items we just yielded, and set
    the remainder as the new value of X for the next evening of play:
    X = newX[:-fromNewX]

    Great, but how do we get started? Heh, funny, if X is empty we're just
    in a starting position -- len(X) is 0 so we're gonna prepare newX, and
    we're gonna prepare it from the whole of set P... just as we want. So
    all the initialization we need is to make sure that P _is_ a set and
    that X is empty:
    P = set(P)
    X = []

    Great -- now that we have our pseudocode, how do we turn it into actual
    working code? That's where Python shines, for the pseudocode we jotted
    down to help our thinking as we reasoned about the problem IS working
    Python code. We could rename X, P and N, but apart from that, here is
    the generator we need...:

    def choose_players(P, N):
    assert len(P) >= N > 0
    P = set(P)
    X = []
    while True:
    if N < len(X):
    yield X[-N:]
    del X[-N:]
    else:
    newX = list(P-set(X))
    random.shuffle(newX)
    fromNewX = N-len(X)
    yield X + newX[-fromNewX:]
    X = newX[:-fromNewX]

    I've added a check that N makes sense (0 or fewer players, or more than
    the club's membership, being needed each time, obviously makes no
    sense!) and an unending loop around the whole thing. This is a
    nonterminating generator -- it won't ever stop iterating by itself; it
    needs to be called the appropriate number of times. I find that more
    useful than passing the generator a parameter telling it how many times
    to loop (you'd just need to add such a parameter T and change the while
    to 'for i in xrange(T):' -- but if you need to do that, you could just
    as well use iterator.islice or whatever to limit the numbers of steps
    over the generator...).


    Here's an example use...:

    for i, players in enumerate(choose_players(range(13), 4))):
    print players
    if i > 9: break

    Here there are 13 members, named 0 to 12, and each week 4 can play, for
    11 weeks. You can run it a few times to eyeball it for correctness, but
    of course you'll want to check it out better eventually, for example:

    for number_of_tests in range(10):
    number_of_plays = 13*[0]
    for i, players in enumerate(choose_players(range(13), 4)):
    for p in players: number_of_plays[p] += 1
    if i > 9: break
    for i in range(max(number_of_plays)+1):
    print '%d:%d'%(i, number_of_plays.count(i)),
    print

    You'll soon see that the results aren't as even as we wished, e.g.:

    kallisti:~/downloads alex$ python2.3 pla.py
    0:0 1:0 2:2 3:4 4:7
    0:0 1:0 2:1 3:6 4:6
    0:0 1:0 2:0 3:8 4:5
    0:0 1:0 2:0 3:8 4:5
    0:0 1:0 2:2 3:4 4:7
    0:0 1:0 2:1 3:6 4:6
    0:0 1:0 2:1 3:6 4:6
    0:0 1:0 2:0 3:8 4:5
    0:0 1:0 2:0 3:8 4:5
    0:0 1:0 2:0 3:8 4:5

    ooops -- the 44 available 'slots' are NOT always divided "three apiece
    and a fourth one for 5 lucky ones of the 13 members" -- sometimes one or
    even two members play only twice! So what's wrong with the pseudode we
    had so nicely worked out...? Hmmm, clearly the 'left over' 13th player
    who first gets to play on the 4th night isn't given a chance to play
    again soon enough... he should go right back into the new X. Can we fix
    that? We can sure try, just change in the above:
    newX = list(P-set(X))
    random.shuffle(newX)
    fromNewX = N-len(X)
    yield X + newX[-fromNewX:]
    X = newX[:-fromNewX]
    into:
    newX = list(P-set(X))
    random.shuffle(newX)
    fromNewX = N-len(X)
    yield X + newX[-fromNewX:]
    X += newX[:-fromNewX]
    random.shuffle(X)

    Ah, NOW our tests are more satisfactory -- huge runs of
    0:0 1:0 2:0 3:8 4:5
    just as we wanted! Of course, that's no mathematical _proof_ that our
    procedure is correct -- indeed, it's QUITE interesting to provide such a
    proof (but I have no space left in the margins of this post to do
    so...:).


    I hope that by showing you how a little program can evolve in real life,
    from specs to thinking in pseudocode to Python, eyeball-testing, better
    testing (with counting;-), correction of some problem, ..., I may have
    helped you on your way to "thinking like a programmer"!-)


    Alex
     
    Alex Martelli, Sep 26, 2004
    #8
  9. > To make a long story longer, here's the code:

    Whoops...Correction:

    ....#! /usr/bin/env python
    ....#Charlotte Henkle
    ....
    ....#A program to randomly fill a tennis schedule
    ....#
    ....#The original theory looked like this:
    ....# gamePlayers=random.sample(templist, players_per_game)
    ....# print gamePlayers
    ....#
    ....#But that didn't give a weighted list for extra games
    ....
    ....import random
    ....
    ....#Eventually these will get set dynamically
    ....number_of_weeks=10
    ....players=['a', 'b', 'c', 'd', 'e', 'f', 'g']
    ....games_per_week=2
    ....players_per_game=4
    ....games=number_of_weeks*games_per_week
    ....
    ....#this will be used to pull off "extra game" players
    ....backupList=players[:]
    ....random.shuffle(backupList)
    ....
    ....#a templist so we can modify it.
    ....templist=players[:]
    ....
    ....#our finished product:
    ....finishedList=[]
    ....
    ....while len(finishedList)!=games:
    .... if len(templist)>=players_per_game:
    .... gamePlayers=[]
    .... while len(gamePlayers)!=players_per_game:
    .... randomNumber=random.randint(0,
    len(templist)-1)
    .... potentialPlayer=templist.pop(randomNumber)
    .... gamePlayers.append(potentialPlayer)
    .... finishedList.append(gamePlayers)
    .... else:
    .... gamePlayers=templist
    .... print "I am the leftover game players", gamePlayers
    .... print "I am the list of backup players", backupList
    .... count=0
    .... while len(gamePlayers)!=players_per_game:
    .... print "I am a potential player "
    .... potentialPlayer=backupList[count]
    .... print potentialPlayer
    .... print "checking to see if I'm in the game"
    .... if potentialPlayer not in gamePlayers:
    .... print "I do not think the player is
    in the game"
    .... print "I am the back-up list",
    backupList
    ....
    potentialPlayer=backupList.pop(count)
    .... gamePlayers.append(potentialPlayer)
    .... backupList.append(potentialPlayer)
    .... print "I am the back-up list after
    reorder", backupList
    .... print "I am the gamePlayers after
    test and insertion", gamePlayers
    ....
    .... else:
    .... print "I think that player is in
    the game"
    .... count+=1
    .... finishedList.append(gamePlayers)
    .... templist=players[:]
    ....
    ....#count the list (thank you, Steve!
    ....http://groups.google.com/groups?dq=...lang.python&ie=UTF-8&hl=en&btnG=Google+Search
    ....
    ....def count(item):
    .... if not isinstance(item, list):
    .... return {item:1}
    .... counts = {}
    .... for i in item:
    .... for key, ct in count(i).items():
    .... counts[key] = counts.get(key, 0) + ct
    .... return counts
    ....
    ....def printList(weeks, games, list):
    .... x=0
    .... y=0
    .... index=0
    .... while x!=weeks:
    .... print "Week: ", x+1
    .... y=0
    .... while y<games:
    .... print "Game ",y+1, " players are ",
    list[index]
    .... y+=1
    .... index+=1
    .... x+=1
    ....
    ....#printing out and counting the final list
    ....
    ....printList(number_of_weeks, games_per_week, finishedList)
    ....print count(finishedList)
     
    Charlotte Henkle, Sep 26, 2004
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. The Eeediot
    Replies:
    3
    Views:
    2,282
    =?Utf-8?B?UnVsaW4gSG9uZw==?=
    Dec 22, 2004
  2. =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==

    List of lists of lists of lists...

    =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==, May 8, 2006, in forum: Python
    Replies:
    5
    Views:
    432
    =?UTF-8?B?w4FuZ2VsIEd1dGnDqXJyZXogUm9kcsOtZ3Vleg==
    May 15, 2006
  3. ardief
    Replies:
    14
    Views:
    753
    Paddy
    Feb 3, 2007
  4. George2
    Replies:
    1
    Views:
    833
    Alf P. Steinbach
    Jan 31, 2008
  5. Alexzive
    Replies:
    6
    Views:
    621
    Chris Rebert
    Mar 20, 2009
Loading...

Share This Page