tallying occurrences in list

Discussion in 'Python' started by kj, Jun 4, 2010.

  1. kj

    kj Guest

    Task: given a list, produce a tally of all the distinct items in
    the list (for some suitable notion of "distinct").

    Example: if the list is ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b',
    'c', 'a'], then the desired tally would look something like this:

    [('a', 4), ('b', 3), ('c', 3)]

    I find myself needing this simple operation so often that I wonder:

    1. is there a standard name for it?
    2. is there already a function to do it somewhere in the Python
    standard library?

    Granted, as long as the list consists only of items that can be
    used as dictionary keys (and Python's equality test for hashkeys
    agrees with the desired notion of "distinctness" for the tallying),
    then the following does the job passably well:

    def tally(c):
        t = dict()
        for x in c:
            t[x] = t.get(x, 0) + 1
        return sorted(t.items(), key=lambda x: (-x[1], x[0]))

    But, of course, if a standard library solution exists it would be
    preferable. Otherwise I either cut-and-paste the above every time
    I need it, or I create a module just for it. (I don't like either
    of these, though I suppose that the latter is much better than the
    former.)

    So anyway, I thought I'd ask. :)

    ~K
    kj, Jun 4, 2010
    #1

  2. kj

    Paul Rubin Guest

    kj <> writes:
    > 1. is there a standard name for it?


    I don't know of one, or of a stdlib function for it, but it's pretty trivial.

    > def tally(c):
    >     t = dict()
    >     for x in c:
    >         t[x] = t.get(x, 0) + 1
    >     return sorted(t.items(), key=lambda x: (-x[1], x[0]))


    I like to use defaultdict and tuple unpacking for code like that:

    from collections import defaultdict
    def tally(c):
        t = defaultdict(int)
        for x in c:
            t[x] += 1
        return sorted(t.iteritems(), key=lambda (k,v): (-v, k))
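For anyone reading this on Python 3: `dict.iteritems()` and tuple-unpacking lambda parameters were both removed there, so the function above only runs on Python 2. A minimal sketch of the same defaultdict approach for Python 3 (the name `tally_py3` is just for illustration):

```python
from collections import defaultdict

def tally_py3(c):
    # Count occurrences; missing keys start at int() == 0.
    t = defaultdict(int)
    for x in c:
        t[x] += 1
    # Sort by descending count, then by key to break ties.
    # kv is a (key, count) pair; Python 3 lambdas cannot unpack it in the signature.
    return sorted(t.items(), key=lambda kv: (-kv[1], kv[0]))

print(tally_py3(['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a']))
# [('a', 4), ('b', 3), ('c', 3)]
```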
    Paul Rubin, Jun 4, 2010
    #2

  3. kj

    Peter Otten Guest

    kj wrote:

    > 1. is there a standard name for it?
    > 2. is there already a function to do it somewhere in the Python
    > standard library?


    Python 3.1 has, and 2.7 will have, collections.Counter:

    >>> from collections import Counter
    >>> c = Counter("abcabcabca")
    >>> c.most_common()

    [('a', 4), ('c', 3), ('b', 3)]
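Note that the order of the tied entries ('c' and 'b') in that output is not sorted by key. If a stable order like the one in the original question is wanted, one option is to sort the Counter's items explicitly. A small sketch, assuming the items are hashable and mutually orderable (the name `tally_counter` is made up for this example):

```python
from collections import Counter

def tally_counter(items):
    counts = Counter(items)
    # Descending count first, then ascending key so ties come out deterministically.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

c = Counter("abcabcabca")
print(c["a"])             # 4
print(c["z"])             # 0 (missing keys count as zero, no KeyError)
print(c.most_common(1))   # [('a', 4)]
print(tally_counter("abcabcabca"))
# [('a', 4), ('b', 3), ('c', 3)]
```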

    Peter
    Peter Otten, Jun 4, 2010
    #3
  4. kj

    Magdoll Guest

    On Jun 4, 11:28 am, Paul Rubin <> wrote:
    > kj <> writes:
    > > 1. is there a standard name for it?

    >
    > I don't know of one, or a stdlib for it, but it's pretty trivial.
    >
    > > def tally(c):
    > >     t = dict()
    > >     for x in c:
    > >         t[x] = t.get(x, 0) + 1
    > >     return sorted(t.items(), key=lambda x: (-x[1], x[0]))

    >
    > I like to use defaultdict and tuple unpacking for code like that:
    >
    >  from collections import defaultdict
    >  def tally(c):
    >      t = defaultdict(int)
    >      for x in c:
    >          t[x] += 1
    >      return sorted(t.iteritems(), key=lambda (k,v): (-v, k))


    I would also very much like to see this become part of the standard
    library. Sure, the code is easy to write, but I use it incredibly
    often, and I've always wished for a one-line function call that
    gives the same output as the MySQL query:

    "SELECT somefield, COUNT(*) FROM table GROUP BY somefield"

    Or maybe there is already a short solution to this that I'm not aware
    of...
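A rough Python counterpart to that kind of GROUP BY count, for rows already loaded into memory, might look like the sketch below. The `rows` data and the field name are invented for illustration; `collections.Counter` needs Python 2.7 or 3.1+.

```python
from collections import Counter

# Hypothetical rows, standing in for the result of a database query.
rows = [
    {"id": 1, "somefield": "x"},
    {"id": 2, "somefield": "y"},
    {"id": 3, "somefield": "x"},
]

# One line: count rows per distinct value of "somefield".
counts = Counter(row["somefield"] for row in rows)

for value, n in sorted(counts.items()):
    print(value, n)
# x 2
# y 1
```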
    Magdoll, Jun 4, 2010
    #4
  5. kj

    Magdoll Guest

    On Jun 4, 11:33 am, Peter Otten <> wrote:
    > Python 3.1 has, and 2.7 will have collections.Counter:
    >
    > >>> from collections import Counter
    > >>> c = Counter("abcabcabca")
    > >>> c.most_common()

    >
    > [('a', 4), ('c', 3), ('b', 3)]
    >
    > Peter



    Thanks Peter, I think you just answered my post :)
    Magdoll, Jun 4, 2010
    #5
  6. kj

    MRAB Guest

    kj wrote:
    > 1. is there a standard name for it?
    > 2. is there already a function to do it somewhere in the Python
    > standard library?

    In Python 3.1 and later there's the 'Counter' class in the
    'collections' module. It'll also be in Python 2.7.

    For earlier versions there's this:

    http://code.activestate.com/recipes/576611/
    MRAB, Jun 4, 2010
    #6
  7. kj

    Lie Ryan Guest

    On 06/05/10 04:38, Magdoll wrote:
    > On Jun 4, 11:33 am, Peter Otten <> wrote:
    >> Python 3.1 has, and 2.7 will have collections.Counter:
    >>
    >> >>> from collections import Counter
    >> >>> c = Counter("abcabcabca")
    >> >>> c.most_common()

    >>
    >> [('a', 4), ('c', 3), ('b', 3)]
    >>
    >> Peter

    >
    >
    > Thanks Peter, I think you just answered my post :)


    If you're on an earlier version (2.4 or later), then:

    import itertools
    [(o, len(list(g))) for o, g in itertools.groupby(sorted(myList))]
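The groupby approach works because sorting puts equal items next to each other, and `itertools.groupby` only merges adjacent equal items. A self-contained sketch of the same idea (the function name `tally_sorted` is hypothetical); it assumes the items can be compared with each other, and it costs a sort, so O(n log n) rather than a single pass:

```python
from itertools import groupby

def tally_sorted(items):
    # sorted() makes equal items adjacent; groupby then yields one group per run.
    # sum(1 for _ in group) counts a group without building a list.
    return [(key, sum(1 for _ in group)) for key, group in groupby(sorted(items))]

print(tally_sorted(['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a']))
# [('a', 4), ('b', 3), ('c', 3)]
```

The result is ordered by key, not by count. Without the `sorted()` call, an input like ['a', 'b', 'a'] would yield three separate groups.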
    Lie Ryan, Jun 4, 2010
    #7
  8. kj

    kj Guest

    Thank you all!

    ~K
    kj, Jun 4, 2010
    #8
  9. kj

    Sreenivas Reddy Thatiparthy

    On Jun 4, 11:14 am, kj <> wrote:
    > Task: given a list, produce a tally of all the distinct items in
    > the list (for some suitable notion of "distinct").


    How about this one-liner, if you prefer them:
    set([(k, yourList.count(k)) for k in yourList])
    Sreenivas Reddy Thatiparthy, Jun 5, 2010
    #9
  10. kj

    Paul Rubin Guest

    Sreenivas Reddy Thatiparthy <> writes:
    > How about this one liner, if you prefer them;
    > set([(k,yourList.count(k)) for k in yourList])


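    That has a rather bad efficiency problem if the list is large: it
    calls yourList.count(k) once per element, and each call scans the
    whole list, so the running time is quadratic in the list's length.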
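To make the comparison concrete, here is a sketch that puts the one-liner next to a single-pass version and checks that both give the same set of pairs. The function names are made up for this example, and `collections.Counter` assumes Python 2.7 or 3.1+.

```python
from collections import Counter

def tally_quadratic(items):
    # One full scan of the list per element: O(n**2) overall.
    return set([(k, items.count(k)) for k in items])

def tally_linear(items):
    # One pass over the list with hashing: O(n) on average.
    return set(Counter(items).items())

data = ['a', 'b', 'c', 'a', 'b', 'c', 'a', 'b', 'c', 'a']
print(tally_quadratic(data) == tally_linear(data))  # True
```

Both return an unordered set, so a sort is still needed if the tally should come out ordered by count.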
    Paul Rubin, Jun 5, 2010
    #10
