nDimensional sparse histogram in python.

K

KraftDiner

Hi, I wrote a C++ class that implements an n dimensional histogram in
C++, using stl maps and vectors. I want to code this up now in Python
and would like some input from this group.

The C++ class was VERY simple..

std::map<std::vector<unsigned short>, unsigned long> histo;

Say for example I want a 3D histogram then std::vector<unsigned short>
would contains
the x, y, and z points in space and the unsigned long is the number of
counts.
If I want to know the value at a specific point in space I pass in a
vector [3,2,1] and if
it is found in the map then the count is returned other wise zero is
returned.
If I want to increment a count then histo[vector]++ will either set the
count to 1 or increment
the count if it is found...

So as you can see this is a very simple way to implement a multi
dimensional sparse histogram.

I'm thinking that in python this must also be as simple as a
dictionary, where the key is the vector x,y,z and the value is the
count.

My python is ok, but not the best and I was hoping some one here might
want to help me out?
This doesn't work for example...

histo = {[0,0,0]:1, [0,0,1]:2}
Would indicate that there is one sample at 0,0,0 and two samples at
0,0,1
but python tells me TypeError: list objects are unhashable

So any suggestions would be welcome.
TIA.
 
A

Alex Martelli

KraftDiner said:
histo = {[0,0,0]:1, [0,0,1]:2}
Would indicate that there is one sample at 0,0,0 and two samples at
0,0,1
but python tells me TypeError: list objects are unhashable

So any suggestions would be welcome.

Use tuples, not lists, as your keys:

histo = {(0,0,0):1, (0,0,1):2}


Alex
 
K

KraftDiner

Cool.
Ok so my histogram class had two methods 1) To increment a point and 2)
to get the point.

def class histo:
def __init__(self):
histo = {}
def update(point):
'''searches the dictionary for point and if it exists increment the
value.
if it doesn't exist set the value to 1'''

def get(point):
return self.histo[point]

I'm not sure how to code up the update routine...
 
J

John McMonagle

Cool.
Ok so my histogram class had two methods 1) To increment a point and 2)
to get the point.

def class histo:
def __init__(self):
histo = {}
def update(point):
'''searches the dictionary for point and if it exists increment the
value.
if it doesn't exist set the value to 1'''

def get(point):
return self.histo[point]

I'm not sure how to code up the update routine...

def update(point);

if histo.get(point) != None:
histo[point] = histo[point] + 1

Regards,

John
 
A

Alex Martelli

KraftDiner said:
Cool.
Ok so my histogram class had two methods 1) To increment a point and 2)
to get the point.

def class histo:
def __init__(self):
histo = {}
def update(point):
'''searches the dictionary for point and if it exists increment the
value.
if it doesn't exist set the value to 1'''

def update(self, point):
self.histo[point] = 1 + self.histo.get(point, 0)
def get(point):

def (self, point):

actually.
return self.histo[point]

I'm not sure how to code up the update routine...

See above.


Alex
 
K

KraftDiner

The dictionary is sorted by the point key.
Is there a way to sort the dictionary by the value?
Seems to me this goes against the purpose of a dictionary but....
thats what I need to do..
 
K

Kirk McDonald

KraftDiner said:
The dictionary is sorted by the point key.
Is there a way to sort the dictionary by the value?
Seems to me this goes against the purpose of a dictionary but....
thats what I need to do..

Well, it's probably not all that efficient, but it is simple code:

sortedList = [(v, k) for k, v in self.histo.items()]
sortedList.sort()

Should do it...

-Kirk McDonald
 
K

KraftDiner

Ok so this is nice.. Just one thing.. When you try to get a value from
a dictionary
and it isn't found in the dictionary things go bad...

Take this for example:

class histogram(object):
def __init__(self):
self.histo = {}

def update(self, point):
if self.histo.get(point) != None:
self.histo[point] = self.histo[point] + 1
else:
self.histo[point] = 1

def get(self, point):
return self.histo[point]


hist = histogram()
hist.update((0,0,0))
hist.update((0,0,1))
hist.update((0,0,1))
hist.get((0,0,0))
hist.get((0,0,1))
hist.get((0,0,2))

spews out this error:

Traceback (most recent call last):
File "histogram.py", line 21, in ?
hist.get((0,0,2))
File "histogram.py", line 12, in get
return self.histo[point]
KeyError: (0, 0, 2)
 
K

Kirk McDonald

KraftDiner said:
Ok so this is nice.. Just one thing.. When you try to get a value from
a dictionary
and it isn't found in the dictionary things go bad...

Take this for example:

class histogram(object):
def __init__(self):
self.histo = {}

def update(self, point):
if self.histo.get(point) != None:
self.histo[point] = self.histo[point] + 1
else:
self.histo[point] = 1

def get(self, point):
return self.histo[point]


hist = histogram()
hist.update((0,0,0))
hist.update((0,0,1))
hist.update((0,0,1))
hist.get((0,0,0))
hist.get((0,0,1))
hist.get((0,0,2))

spews out this error:

Traceback (most recent call last):
File "histogram.py", line 21, in ?
hist.get((0,0,2))
File "histogram.py", line 12, in get
return self.histo[point]
KeyError: (0, 0, 2)

Try this:

class histogram(object):
def __init__(self):
self.histo = {}

def update(self, point):
self.histo[point] = self.histo.get(point, 0) + 1

def get(self, point):
return self.histo.get(point, 0)

dict.get's default return value is your friend.

-Kirk McDonald
 
R

rajasekaran.natarajan

Hi
you can use has_key() to check whether
the particular value is in the key set or not.
a = {}
a[1] = 22
a[2] = 33
a {1: 22, 2: 33}
a.has_key(3) False
a.has_key(1) True


-raj
KraftDiner said:
Ok so this is nice.. Just one thing.. When you try to get a value from
a dictionary
and it isn't found in the dictionary things go bad...

Take this for example:

class histogram(object):
def __init__(self):
self.histo = {}

def update(self, point):
if self.histo.get(point) != None:
self.histo[point] = self.histo[point] + 1
else:
self.histo[point] = 1

def get(self, point):
return self.histo[point]


hist = histogram()
hist.update((0,0,0))
hist.update((0,0,1))
hist.update((0,0,1))
hist.get((0,0,0))
hist.get((0,0,1))
hist.get((0,0,2))

spews out this error:

Traceback (most recent call last):
File "histogram.py", line 21, in ?
hist.get((0,0,2))
File "histogram.py", line 12, in get
return self.histo[point]
KeyError: (0, 0, 2)
 
K

KraftDiner

Many thanks everyone.

One last quick question...
The dictionary, I believe, is sorted by the key.
Is there a way to sort it by the value?
Say I wanted to put out a list of the frequency sorted by highest to
lowest?
 
K

Kirk McDonald

KraftDiner said:
Many thanks everyone.

One last quick question...
The dictionary, I believe, is sorted by the key.
Is there a way to sort it by the value?
Say I wanted to put out a list of the frequency sorted by highest to
lowest?

The dictionary itself is actually unordered; a C++ std::map this ain't.
To sort its elements, you need to build a list of the items and sort
that, e.g.:

items = [(v, k) for k, v in self.histo.items()]
items.sort()

This will give you a list of (value, key) tuples sorted by value.

-Kirk McDonald
 
A

Alex Martelli

KraftDiner said:
Many thanks everyone.

One last quick question...
The dictionary, I believe, is sorted by the key.

Dictionaries are NOT sorted -- they're hashtables.
Is there a way to sort it by the value?
Say I wanted to put out a list of the frequency sorted by highest to
lowest?

import operator
for k, v in sorted(somedict, key=operator.itemgetter(1), reverse=1):
print k, v

emits key/value pairs, one per line, for any dictionary sorted by
descending order of values. Embellish to taste.


Alex
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
473,755
Messages
2,569,536
Members
45,013
Latest member
KatriceSwa

Latest Threads

Top