Returning histogram-like data for items in a list

Ric Deez · Jul 22, 2005

Hi there,

I have a list:
L1 = [1,1,1,2,2,3]

How can I easily turn this into a list of tuples where the first element
is the list element and the second is the number of times it occurs in
the list (I think that this is referred to as a histogram):

i.e.:

L2 = [(1,3),(2,2),(3,1)]

I was doing something like:

myDict = {}
for i in L1:
    myDict.setdefault(i,[]).append(i)

then doing this:

L2 = []
for k, v in myDict.iteritems():
    L2.append((k, len(v)))

This works but I sort of feel like there ought to be an easier way,
rather than to have to store the list elements, when all I want is a
count of them. Would anyone care to comment?

I also tried this trick, where locals()['_[1]'] refers to the list
comprehension itself as it gets built, but it gave me unexpected results:

>>> L2 = [(i, len(i)) for i in L2 if not i in locals()['_[1]']]
>>> L2
[((1, 3), 2), ((2, 2), 2), ((3, 1), 2)]

i.e. I don't understand why each tuple is being counted as well.

Regards,

Ric

Michael Hoffman · Jul 22, 2005

Ric said:
Hi there,

I have a list:
L1 = [1,1,1,2,2,3]

How can I easily turn this into a list of tuples where the first element
is the list element and the second is the number of times it occurs in
the list (I think that this is referred to as a histogram):

i.e.:

L2 = [(1,3),(2,2),(3,1)]

>>> import itertools
>>> L1 = [1,1,1,2,2,3]
>>> L2 = [(key, len(list(group))) for key, group in itertools.groupby(L1)]
>>> L2
[(1, 3), (2, 2), (3, 1)]

George Sakkis · Jul 22, 2005

Michael Hoffman said:
Ric said:

Hi there,

I have a list:
L1 = [1,1,1,2,2,3]

How can I easily turn this into a list of tuples where the first element
is the list element and the second is the number of times it occurs in
the list (I think that this is referred to as a histogram):

i.e.:

L2 = [(1,3),(2,2),(3,1)]

import itertools
L1 = [1,1,1,2,2,3]
L2 = [(key, len(list(group))) for key, group in itertools.groupby(L1)]
L2
[(1, 3), (2, 2), (3, 1)]

This is correct if the original list items are grouped together; to be on the safe side, sort it
first:
L2 = [(key, len(list(group))) for key, group in itertools.groupby(sorted(L1))]
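
To illustrate the point about grouping: itertools.groupby only merges adjacent equal items, so an unsorted list produces one entry per run rather than one per distinct value. A minimal sketch, where run_lengths and the unsorted sample list are made up for illustration and are not from the thread:

```python
import itertools

def run_lengths(seq):
    """Count runs of adjacent equal items, which is all groupby does by itself."""
    return [(key, len(list(group))) for key, group in itertools.groupby(seq)]

unsorted = [1, 2, 1, 1, 3, 2]  # hypothetical input, not from the thread

# Without sorting, the keys 1 and 2 each show up more than once.
print(run_lengths(unsorted))

# After sorting, equal items are adjacent, so each key appears once.
print(run_lengths(sorted(unsorted)))
```

Sorting costs O(n log n) and requires the items to be mutually comparable, which is one reason the dictionary-based versions may be preferable.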

Or if you care about performance rather than number of lines, use this:

def hist(seq):
    h = {}
    for i in seq:
        try: h[i] += 1
        except KeyError: h[i] = 1
    return h.items()

George

jeethu_rao · Jul 22, 2005

Adding to George's reply, if you want slightly more performance, you
can avoid the exception with something like

def hist(seq):
    h = {}
    for i in seq:
        h[i] = h.get(i,0)+1
    return h.items()
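
For readers on later Python versions, the same single-pass counting can probably be written with the standard library's collections module. This is only a sketch: collections.defaultdict (Python 2.5+) and collections.Counter (Python 2.7+) both postdate this thread, and the function names and the sorted() call (added only to get a predictable order, and assuming the items are comparable) are assumptions of this example.

```python
from collections import Counter, defaultdict

def hist_defaultdict(seq):
    """Single pass; missing keys start at int() == 0."""
    counts = defaultdict(int)
    for item in seq:
        counts[item] += 1
    return sorted(counts.items())

def hist_counter(seq):
    """Counter does the same counting loop internally."""
    return sorted(Counter(seq).items())

L1 = [1, 1, 1, 2, 2, 3]
print(hist_defaultdict(L1))
print(hist_counter(L1))
```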

Jeethu Rao

Bruno Desthuilliers · Jul 22, 2005

Ric Deez wrote:

Hi there,

I have a list:
L1 = [1,1,1,2,2,3]

How can I easily turn this into a list of tuples where the first element
is the list element and the second is the number of times it occurs in
the list (I think that this is referred to as a histogram):

i.e.:

L2 = [(1,3),(2,2),(3,1)]

I was doing something like:

myDict = {}
for i in L1:
    myDict.setdefault(i,[]).append(i)

then doing this:

L2 = []
for k, v in myDict.iteritems():
    L2.append((k, len(v)))

This works but I sort of feel like there ought to be an easier way,

If you don't care about order (but your solution isn't guaranteed to
preserve order either...):

L2 = dict([(item, L1.count(item)) for item in L1]).items()

But this may be inefficient if the list is large, so...

def hist(seq):
    d = {}
    for item in seq:
        if item not in d:
            d[item] = seq.count(item)
    return d.items()
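
A rough way to see the difference between the one-liner and the guarded loop is to count how often count() actually runs. The sketch below is written for Python 3 and uses a hypothetical CountingList helper that exists only for this illustration; each call to count() still scans the whole list, so even the guarded version does work proportional to the list length times the number of distinct values.

```python
class CountingList(list):
    """list subclass that records how often .count() is called (illustration only)."""

    def __init__(self, *args):
        super().__init__(*args)
        self.calls = 0

    def count(self, value):
        self.calls += 1
        return super().count(value)

def hist(seq):
    d = {}
    for item in seq:
        if item not in d:          # only count each distinct value once
            d[item] = seq.count(item)
    return d

data = CountingList([1, 1, 1, 2, 2, 3])

# One-liner: count() runs once per element, duplicates included.
one_liner = dict([(item, data.count(item)) for item in data])
calls_one_liner = data.calls

# Guarded loop: count() runs once per distinct value.
data.calls = 0
guarded = hist(data)
calls_guarded = data.calls

print(calls_one_liner, calls_guarded)
```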

I also tried this trick, where locals()['_[1]'] refers to the list

I'm not sure I understand how that one works... But anyway, please avoid
this kind of horror unless you're engaged in a WORN contest with a
perl-monger !-).
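
On the unexpected result in the original post: the comprehension iterates over L2, whose items are already (value, count) tuples, so len(i) is the length of each 2-tuple rather than an occurrence count. The following sketch reproduces the output without the locals()['_[1]'] lookup, which appears to rely on an undocumented detail of the interpreter of the time and should not be expected to work elsewhere:

```python
L2 = [(1, 3), (2, 2), (3, 1)]

# Each i is a 2-tuple such as (1, 3), so len(i) is always 2.
result = [(i, len(i)) for i in L2]

print(result)  # [((1, 3), 2), ((2, 2), 2), ((3, 1), 2)]
```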

George Sakkis · Jul 22, 2005

jeethu_rao said:
Adding to George's reply, if you want slightly more performance, you
can avoid the exception with something like

def hist(seq):
    h = {}
    for i in seq:
        h[i] = h.get(i,0)+1
    return h.items()

Jeethu Rao

The performance penalty of the exception is paid only the first time each distinct item is found. So
unless you have a huge list of distinct items, I seriously doubt that this is measurably
faster.
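
One way to check this claim on a given machine is to time both variants with the standard timeit module. This is only a sketch: the function names, the input sizes, and the repeat count are arbitrary choices for this example, and the numbers will vary by interpreter version and hardware, so no result is asserted here.

```python
import timeit

def hist_try(seq):
    h = {}
    for i in seq:
        try:
            h[i] += 1
        except KeyError:   # raised only the first time each item is seen
            h[i] = 1
    return h

def hist_get(seq):
    h = {}
    for i in seq:
        h[i] = h.get(i, 0) + 1
    return h

few_distinct = [n % 10 for n in range(10000)]  # 10 distinct values
all_distinct = list(range(10000))              # every value distinct

for name, data in [("few distinct", few_distinct), ("all distinct", all_distinct)]:
    t_try = timeit.timeit(lambda: hist_try(data), number=20)
    t_get = timeit.timeit(lambda: hist_get(data), number=20)
    print("%-13s try/except: %.4fs  get: %.4fs" % (name, t_try, t_get))
```

The two inputs are meant to bracket the cases discussed: with few distinct values the exception is rarely raised, while with all-distinct values it is raised on every item.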

George

David Isaac · Jul 22, 2005

Ric Deez said:
I have a list:
L1 = [1,1,1,2,2,3]
How can I easily turn this into a list of tuples where the first element
is the list element and the second is the number of times it occurs in
the list (I think that this is referred to as a histogram):

For ease of reading (but not efficiency) I like:
hist = [(x,L1.count(x)) for x in set(L1)]
See http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/277600

Alan Isaac
