Returning histogram-like data for items in a list

Discussion in 'Python' started by Ric Deez, Jul 22, 2005.

  1. Ric Deez

    Ric Deez Guest

    Hi there,

    I have a list:
    L1 = [1,1,1,2,2,3]

    How can I easily turn this into a list of tuples where the first element
    is the list element and the second is the number of times it occurs in
    the list (I think that this is referred to as a histogram):

    i.e.:

    L2 = [(1,3),(2,2),(3,1)]

    I was doing something like:

    myDict = {}
    for i in L1:
        myDict.setdefault(i,[]).append(i)

    then doing this:

    L2 = []
    for k, v in myDict.iteritems():
        L2.append((k, len(v)))

    This works but I sort of feel like there ought to be an easier way,
    rather than to have to store the list elements, when all I want is a
    count of them. Would anyone care to comment?

    I also tried this trick, where locals()['_[1]'] refers to the list
    comprehension itself as it gets built, but it gave me unexpected results:

    >>> L2 = [(i, len(i)) for i in L2 if not i in locals()['_[1]']]
    >>> L2

    [((1, 3), 2), ((2, 2), 2), ((3, 1), 2)]

    i.e. I don't understand why each tuple is being counted as well.

    Regards,

    Ric
     
    Ric Deez, Jul 22, 2005
    #1
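    A note on the unexpected result in the post above: the comprehension
    loops over L2, which already holds (value, count) tuples, rather than
    over L1. Each i is therefore a 2-tuple, so len(i) is always 2. (Looping
    over L1 instead would raise a TypeError, since integers have no len().)
    The sketch below reproduces the output without the locals()['_[1]']
    filter; leaving it out is an assumption made here, because the filter
    relies on an undocumented detail of older CPython 2 releases and never
    rejects anything in this case.

```python
# L2 already holds (value, count) tuples, so iterating over it yields tuples.
L2 = [(1, 3), (2, 2), (3, 1)]

# len(i) measures the tuple itself (always 2), not how often it occurs.
result = [(i, len(i)) for i in L2]
print(result)
```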

  2. Ric Deez wrote:
    > Hi there,
    >
    > I have a list:
    > L1 = [1,1,1,2,2,3]
    >
    > How can I easily turn this into a list of tuples where the first element
    > is the list element and the second is the number of times it occurs in
    > the list (I think that this is referred to as a histogram):
    >
    > i.e.:
    >
    > L2 = [(1,3),(2,2),(3,1)]


    >>> import itertools
    >>> L1 = [1,1,1,2,2,3]
    >>> L2 = [(key, len(list(group))) for key, group in itertools.groupby(L1)]
    >>> L2

    [(1, 3), (2, 2), (3, 1)]
    --
    Michael Hoffman
     
    Michael Hoffman, Jul 22, 2005
    #2

  3. "Michael Hoffman" <> wrote:

    > Ric Deez wrote:
    > > Hi there,
    > >
    > > I have a list:
    > > L1 = [1,1,1,2,2,3]
    > >
    > > How can I easily turn this into a list of tuples where the first element
    > > is the list element and the second is the number of times it occurs in
    > > the list (I think that this is referred to as a histogram):
    > >
    > > i.e.:
    > >
    > > L2 = [(1,3),(2,2),(3,1)]

    >
    > >>> import itertools
    > >>> L1 = [1,1,1,2,2,3]
    > >>> L2 = [(key, len(list(group))) for key, group in itertools.groupby(L1)]
    > >>> L2

    > [(1, 3), (2, 2), (3, 1)]
    > --
    > Michael Hoffman


    This is correct if the original list items are grouped together; to be on the safe side, sort it
    first:
    L2 = [(key, len(list(group))) for key, group in itertools.groupby(sorted(L1))]
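    A small sketch of why the sort matters: groupby only merges items that
    sit next to each other, so an ungrouped list produces one entry per run
    rather than one per distinct value. The helper name runs and the list
    mixed are made up for illustration.

```python
from itertools import groupby

def runs(seq):
    """Count consecutive runs of equal items, which is what groupby sees."""
    return [(key, len(list(group))) for key, group in groupby(seq)]

mixed = [1, 2, 1, 1, 3, 2]  # made-up, deliberately ungrouped input

unsorted_counts = runs(mixed)        # one entry per run, so keys repeat
sorted_counts = runs(sorted(mixed))  # one entry per distinct value

print(unsorted_counts)
print(sorted_counts)
```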

    Or if you care about performance rather than number of lines, use this:

    def hist(seq):
        h = {}
        for i in seq:
            try: h[i] += 1
            except KeyError: h[i] = 1
        return h.items()


    George
     
    George Sakkis, Jul 22, 2005
    #3
  4. jeethu_rao

    jeethu_rao Guest

    Adding to George's reply, if you want slightly more performance, you
    can avoid the exception with something like

    def hist(seq):
        h = {}
        for i in seq:
            h[i] = h.get(i,0)+1
        return h.items()
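    For readers on Python 2.7 or later, the standard library's
    collections.Counter performs the same dictionary-based counting. This
    is a sketch of an alternative, not something proposed in the thread;
    the sorted() call is only there to give a predictable order.

```python
from collections import Counter

L1 = [1, 1, 1, 2, 2, 3]

# Counter is a dict subclass mapping each item to its number of occurrences.
counts = Counter(L1)

# items() gives (item, count) pairs; sort them for a stable listing.
L2 = sorted(counts.items())
print(L2)
```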

    Jeethu Rao
     
    jeethu_rao, Jul 22, 2005
    #4
  5. Ric Deez wrote:
    > Hi there,
    >
    > I have a list:
    > L1 = [1,1,1,2,2,3]
    >
    > How can I easily turn this into a list of tuples where the first element
    > is the list element and the second is the number of times it occurs in
    > the list (I think that this is referred to as a histogram):
    >
    > i.e.:
    >
    > L2 = [(1,3),(2,2),(3,1)]
    >
    > I was doing something like:
    >
    > myDict = {}
    > for i in L1:
    >     myDict.setdefault(i,[]).append(i)
    >
    > then doing this:
    >
    > L2 = []
    > for k, v in myDict.iteritems():
    >     L2.append((k, len(v)))
    >
    > This works but I sort of feel like there ought to be an easier way,


    If you don't care about order (but your solution isn't guaranteed to
    preserve order either...):

    L2 = dict([(item, L1.count(item)) for item in L1]).items()

    But this may be inefficient if the list is large, so...

    def hist(seq):
        d = {}
        for item in seq:
            if item not in d:
                d[item] = seq.count(item)
        return d.items()

    > I also tried this trick, where locals()['_[1]'] refers to the list


    Not sure I understand how that one works... But anyway, please avoid
    this kind of horror unless you're engaged in a WORN (write once, read
    never) contest with a perl-monger !-).
     
    Bruno Desthuilliers, Jul 22, 2005
    #5
  6. "jeethu_rao" <> wrote:

    > Adding to George's reply, if you want slightly more performance, you
    > can avoid the exception with something like
    >
    > def hist(seq):
    >     h = {}
    >     for i in seq:
    >         h[i] = h.get(i,0)+1
    >     return h.items()
    >
    > Jeethu Rao


    The performance penalty of the exception is incurred only the first time each distinct item is
    seen. So unless you have a huge list of distinct items, I seriously doubt that this is faster by
    any measurable amount.
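    One way to check this claim is a rough timeit comparison of the two
    variants on a list with few distinct items and on a list where every
    item is distinct. The function names hist_try and hist_get, the list
    sizes, and the repeat count are arbitrary choices for illustration, and
    the timings will vary by machine and Python version.

```python
import timeit

def hist_try(seq):
    """Count items, paying for an exception on the first sight of each key."""
    h = {}
    for i in seq:
        try:
            h[i] += 1
        except KeyError:
            h[i] = 1
    return h

def hist_get(seq):
    """Count items with dict.get, which never raises."""
    h = {}
    for i in seq:
        h[i] = h.get(i, 0) + 1
    return h

few_distinct = [n % 10 for n in range(10000)]  # 10 distinct values
all_distinct = list(range(10000))              # every value distinct

for name, data in [("few distinct", few_distinct), ("all distinct", all_distinct)]:
    t_try = timeit.timeit(lambda: hist_try(data), number=20)
    t_get = timeit.timeit(lambda: hist_get(data), number=20)
    print("%-12s try/except: %.4fs  get: %.4fs" % (name, t_try, t_get))
```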

    George
     
    George Sakkis, Jul 22, 2005
    #6
  7. David Isaac

    David Isaac Guest

    "Ric Deez" <> wrote in message
    news:dbpat7$28o$...
    > I have a list:
    > L1 = [1,1,1,2,2,3]
    > How can I easily turn this into a list of tuples where the first element
    > is the list element and the second is the number of times it occurs in
    > the list (I think that this is referred to as a histogram):


    For ease of reading (but not efficiency) I like:
    hist = [(x,L1.count(x)) for x in set(L1)]
    See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277600
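    A short sketch of the trade-off mentioned above: the one-liner rescans
    the whole list once per distinct value, and iteration order over a set
    should not be relied on, so sorting is one way to get a predictable
    listing. The name hist_sorted is introduced here for illustration.

```python
L1 = [1, 1, 1, 2, 2, 3]

# set(L1) visits each distinct value once; L1.count(x) rescans the whole
# list each time, so the total work grows with len(L1) * len(set(L1)).
hist = [(x, L1.count(x)) for x in set(L1)]

# Sort the pairs when a predictable order is needed.
hist_sorted = sorted(hist)
print(hist_sorted)
```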

    Alan Isaac
     
    David Isaac, Jul 22, 2005
    #7
