Pythonic way to count sequences

Discussion in 'Python' started by CM, Apr 25, 2013.

  1. CM

    CM Guest

    I have to count the number of various two-digit sequences in a list
    such as this:

    mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]  # Here the (2,4) sequence appears 2 times.

    and tally up the results, assigning each to a variable. The inelegant
    first pass at this was something like...

    # Create names and set them all to 0
    alpha = 0
    beta = 0
    delta = 0
    gamma = 0
    # etc...

    # loop over all the tuple sequences and increment appropriately
    for sequence_tuple in list_of_tuples:
        if sequence_tuple == (1,2):
            alpha += 1
        if sequence_tuple == (2,4):
            beta += 1
        if sequence_tuple == (2,5):
            delta += 1
    # etc... But I actually have more than 10 sequence types.

    # Finally, I need a list created like this:
    result_list = [alpha, beta, delta, gamma] #etc...in that order

    I can sense there is very likely an elegant/Pythonic way to do this,
    and probably with a dict, or possibly with some Python structure I
    don't typically use. Suggestions sought. Thanks.
    CM, Apr 25, 2013
    #1

  2. On Thu, Apr 25, 2013 at 3:05 PM, CM wrote:
    > I have to count the number of various two-digit sequences in a list
    > such as this:
    >
    > mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4)
    > sequence appears 2 times.)
    >
    > and tally up the results, assigning each to a variable.


    You can use a tuple as a dictionary key, just like you would a string.
    So you can count them up directly with a dictionary:

    count = {}
    for sequence_tuple in list_of_tuples:
        count[sequence_tuple] = count.get(sequence_tuple, 0) + 1

    Also, since this is such a common thing to do, there's a standard
    library way of doing it:

    import collections
    count = collections.Counter(list_of_tuples)

    This doesn't depend on knowing ahead of time what your elements will
    be. At the end of it, you can simply iterate over 'count' and get all
    your counts:

    for sequence, number in count.items():
        print("%d of %r" % (number, sequence))
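
    A small sketch of the Counter approach on the sample data from the
    question. One detail worth noting: a Counter returns 0 for a key it has
    never seen instead of raising KeyError, so sequences that never occur
    need no special handling. The name list_of_tuples is reused from the
    question; the (9, 9) pair is only an illustrative missing key.

```python
from collections import Counter

# Sample data from the original question.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

count = Counter(list_of_tuples)

print(count[(2, 4)])  # 2: the pair appears twice
print(count[(9, 9)])  # 0: a missing key counts as zero, no KeyError
print(len(count))     # 4: looking up a missing key does not add it
```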

    ChrisA
    Chris Angelico, Apr 25, 2013
    #2

  3. On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:

    > I have to count the number of various two-digit sequences in a list such
    > as this:
    >
    > mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence
    > appears 2 times.)
    >
    > and tally up the results, assigning each to a variable. The inelegant
    > first pass at this was something like...
    >
    > # Create names and set them all to 0
    > alpha = 0
    > beta = 0
    > delta = 0
    > gamma = 0
    > # etc...


    Do they absolutely have to be global variables like that? Seems like a
    bad design, especially if you don't know in advance exactly how many
    there are.


    > # loop over all the tuple sequences and increment appropriately
    > for sequence_tuple in list_of_tuples:
    >     if sequence_tuple == (1,2):
    >         alpha += 1
    >     if sequence_tuple == (2,4):
    >         beta += 1
    >     if sequence_tuple == (2,5):
    >         delta += 1
    > # etc... But I actually have more than 10 sequence types.


    counts = {}
    for t in list_of_tuples:
        counts[t] = counts.get(t, 0) + 1


    Or, use collections.Counter:

    from collections import Counter
    counts = Counter(list_of_tuples)


    > # Finally, I need a list created like this:
    > result_list = [alpha, beta, delta, gamma] #etc...in that order


    Dicts are not sorted by key (and before Python 3.7 they had no guaranteed
    order at all), so getting the results in a specific order will be a bit
    tricky. You could do this:

    results = sorted(counts.items(), key=lambda t: t[0])
    results = [t[1] for t in results]

    if you are lucky enough to have the desired order match the natural order
    of the tuples. Otherwise:

    desired_order = [(2, 3), (3, 1), (1, 2), ...]
    results = [counts.get(t, 0) for t in desired_order]
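
    Putting the pieces together for the result_list the question asks for,
    here is a minimal sketch. The ordering below is an assumption: the
    question only says that alpha, beta and delta stand for (1,2), (2,4) and
    (2,5), so the fourth pair, (3,4), is a hypothetical stand-in for gamma.

```python
from collections import Counter

# Sample data from the original question.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

# Assumed ordering: index 0 plays the role of alpha, 1 of beta, and so on.
# The last entry is a made-up placeholder for gamma.
desired_order = [(1, 2), (2, 4), (2, 5), (3, 4)]

counts = Counter(list_of_tuples)

# A Counter yields 0 for pairs that never occur, so plain indexing is enough.
result_list = [counts[t] for t in desired_order]
print(result_list)  # [0, 2, 0, 1]
```

    Pairs that are counted but not listed in desired_order, such as (4,5)
    here, are simply left out of result_list.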



    --
    Steven
    Steven D'Aprano, Apr 25, 2013
    #3
  4. On 25.04.13 08:26, Chris Angelico wrote:
    > So you can count them up directly with a dictionary:
    >
    > count = {}
    > for sequence_tuple in list_of_tuples:
    >     count[sequence_tuple] = count.get(sequence_tuple,0) + 1


    Or alternatives:

    count = {}
    for sequence_tuple in list_of_tuples:
        if sequence_tuple in count:
            count[sequence_tuple] += 1
        else:
            count[sequence_tuple] = 1

    count = {}
    for sequence_tuple in list_of_tuples:
        try:
            count[sequence_tuple] += 1
        except KeyError:
            count[sequence_tuple] = 1

    import collections
    count = collections.defaultdict(int)
    for sequence_tuple in list_of_tuples:
        count[sequence_tuple] += 1

    But of course collections.Counter is a preferable way now.
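
    As a quick sanity check, the sketch below runs three of these variants
    on the sample data from the question and confirms they produce the same
    tallies. The names by_get, by_default and by_counter are made up for
    this illustration.

```python
from collections import Counter, defaultdict

# Sample data from the original question.
list_of_tuples = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]

# Variant 1: plain dict with dict.get() supplying the default of 0.
by_get = {}
for t in list_of_tuples:
    by_get[t] = by_get.get(t, 0) + 1

# Variant 2: defaultdict(int) creates missing entries as 0 on first access.
by_default = defaultdict(int)
for t in list_of_tuples:
    by_default[t] += 1

# Variant 3: Counter does the whole loop in one call.
by_counter = Counter(list_of_tuples)

# Convert to plain dicts so the comparison is purely key/value based.
print(by_get == dict(by_default) == dict(by_counter))  # True
```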
    Serhiy Storchaka, Apr 25, 2013
    #4
  5. On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:

    > I have to count the number of various two-digit sequences in a list such
    > as this:
    >
    > mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence
    > appears 2 times.)
    >
    > and tally up the results, assigning each to a variable. The inelegant
    > first pass at this was something like...
    >
    > # Create names and set them all to 0
    > alpha = 0
    > beta = 0
    > delta = 0
    > gamma = 0
    > # etc...
    >
    > # loop over all the tuple sequences and increment appropriately
    > for sequence_tuple in list_of_tuples:
    >     if sequence_tuple == (1,2):
    >         alpha += 1
    >     if sequence_tuple == (2,4):
    >         beta += 1
    >     if sequence_tuple == (2,5):
    >         delta += 1
    > # etc... But I actually have more than 10 sequence types.
    >
    > # Finally, I need a list created like this:
    > result_list = [alpha, beta, delta, gamma] #etc...in that order
    >
    > I can sense there is very likely an elegant/Pythonic way to do this, and
    > probably with a dict, or possibly with some Python structure I don't
    > typically use. Suggestions sought. Thanks.


    mylist = [ (3,3), (1,2), "fred", ("peter",1,7), 1, 19, 37, 28.312,
    ("monkey"), "fred", "fred", (1,2) ]

    bits = {}

    for thing in mylist:
        if thing in bits:
            bits[thing] += 1
        else:
            bits[thing] = 1

    for thing in bits:
        print(thing, "occurs", bits[thing], "times")

    outputs (the order can differ between Python versions):

    (1, 2) occurs 2 times
    1 occurs 1 times
    ('peter', 1, 7) occurs 1 times
    (3, 3) occurs 1 times
    28.312 occurs 1 times
    fred occurs 3 times
    19 occurs 1 times
    monkey occurs 1 times
    37 occurs 1 times

    If you want to check that thing is a tuple of two ints, use something like:

    for thing in mylist:
        # In Python 2, test against (int, long) instead of int.
        if (isinstance(thing, tuple) and len(thing) == 2
                and isinstance(thing[0], int) and isinstance(thing[1], int)):
            if thing in bits:
                bits[thing] += 1
            else:
                bits[thing] = 1
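
    The same filter can also be fed straight into a Counter. This is only a
    sketch, assuming Python 3 (where int covers what Python 2 called long);
    is_int_pair is a hypothetical helper name, and the list is a shortened
    version of the mixed list above.

```python
from collections import Counter

# Shortened version of the mixed-type list used above.
mylist = [(3, 3), (1, 2), "fred", ("peter", 1, 7), 1, 19, (1, 2)]


def is_int_pair(thing):
    """Return True only for a tuple holding exactly two ints."""
    return (
        isinstance(thing, tuple)
        and len(thing) == 2
        and all(isinstance(item, int) for item in thing)
    )


# The generator expression drops everything that is not a two-int tuple.
bits = Counter(thing for thing in mylist if is_int_pair(thing))
print(bits)  # Counter({(1, 2): 2, (3, 3): 1})
```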

    --
    Denis McMahon,
    Denis McMahon, Apr 26, 2013
    #5
  6. Modulok

    Modulok Guest

    On 4/25/13, Denis McMahon wrote:
    > On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:
    >
    >> I have to count the number of various two-digit sequences in a list such
    >> as this:
    >>
    >> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence
    >> appears 2 times.)
    >>
    >> and tally up the results, assigning each to a variable.

    ....

    Consider using the ``collections`` module::


    from collections import Counter

    mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
    count = Counter()
    for k in mylist:
        count[k] += 1

    print(count)

    # Output looks like this:
    # Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})


    You then have access to methods to return the most common items, etc. See more
    examples here:

    http://docs.python.org/3.3/library/collections.html#collections.Counter
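
    For instance, a short sketch of most_common() on the same data. Both
    most_common() and values() are standard Counter/dict methods; nothing
    here goes beyond what the linked page documents.

```python
from collections import Counter

mylist = [(2, 4), (2, 4), (3, 4), (4, 5), (2, 1)]
count = Counter(mylist)

# most_common(n) returns the n most frequent items as (item, count) pairs.
print(count.most_common(1))  # [((2, 4), 2)]

# Summing the values gives the total number of items that were counted.
print(sum(count.values()))  # 5
```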


    Good luck!
    -Modulok-
    Modulok, Apr 26, 2013
    #6
  7. CM

    CM Guest

    Thank you, everyone, for the answers. Very helpful and
    knowledge-expanding.
    CM, Apr 26, 2013
    #7
  8. A Counter is definitely the way to go about this. Just as a little more
    information, Modulok's example (quoted below) can be simplified:

    from collections import Counter
    count = Counter(mylist)

    With the other example, you could have achieved the same thing (and been
    backward compatible to Python 2.5) with

    from collections import defaultdict
    count = defaultdict(int)
    for k in mylist:
        count[k] += 1



    On 4/25/13 9:16 PM, Modulok wrote:
    > On 4/25/13, Denis McMahon wrote:
    >> On Wed, 24 Apr 2013 22:05:52 -0700, CM wrote:
    >>
    >>> I have to count the number of various two-digit sequences in a list such
    >>> as this:
    >>>
    >>> mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)] # (Here the (2,4) sequence
    >>> appears 2 times.)
    >>>
    >>> and tally up the results, assigning each to a variable.

    > ...
    >
    > Consider using the ``collections`` module::
    >
    >
    > from collections import Counter
    >
    > mylist = [(2,4), (2,4), (3,4), (4,5), (2,1)]
    > count = Counter()
    > for k in mylist:
    >     count[k] += 1
    >
    > print(count)
    >
    > # Output looks like this:
    > # Counter({(2, 4): 2, (4, 5): 1, (3, 4): 1, (2, 1): 1})
    >
    >
    > You then have access to methods to return the most common items, etc. See more
    > examples here:
    >
    > http://docs.python.org/3.3/library/collections.html#collections.Counter
    >
    >
    > Good luck!
    > -Modulok-
    Matthew Gilson, Apr 26, 2013
    #8