Why is this so much faster?

Keir Rice · Jun 2, 2011

Hi All,

The following function was showing up in my profiles as a large bottle neck:

# Slow version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
intermediateResult = map(lambda (i, h): h*(i**2), zip(histogram, range(255)))
totalSum = sum(intermediateResult)

# calculate rms
return math.sqrt(totalSum / self.Area())

So after a bit of trial and error I came up with the following function which is a lot faster:

# Fast version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
totalSum = 0
for i, h in enumerate(histogram):
totalSum += h*(i**2)

# calculate rms
return math.sqrt(totalSum / self.Area())

My question is why is the second method so much faster?
Is it the removal of the extra function calls?
Is it skipping the creation of a list?
Do the basic arithmetic operators get pushed up into C code?

Any insights into the whats really happening behind the scenes would be appreciated.

Keir

Terry Reedy · Jun 3, 2011

Hi All,

The following function was showing up in my profiles as a large bottle neck:

# Slow version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
intermediateResult = map(lambda (i, h): h*(i**2), zip(histogram, range(255)))
totalSum = sum(intermediateResult)

# calculate rms
return math.sqrt(totalSum / self.Area())

So after a bit of trial and error I came up with the following function which is a lot faster:

# Fast version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
totalSum = 0
for i, h in enumerate(histogram):
totalSum += h*(i**2)

# calculate rms
return math.sqrt(totalSum / self.Area())

My question is why is the second method so much faster?
Is it the removal of the extra function calls?

Yes. Map is only 'fast' when one already has a function and is going to
call it repeatedly regardless of the other code. When one has an
expression, wrapping it as a function to use map is surely slower. Have
you tried

return math.sqrt(sum([h*i*i for i,h in enumerate(histogram)])
/ self.Area())

or same without [] brackets?

i*i should be faster than i**2 in any version.

Is it skipping the creation of a list?

A bit.

See Tim's response.

Ian Kelly · Jun 3, 2011

First of all, do you have the parameters to the lambda the wrong way around
in the map() version? zip(histogram, range(255)) will return (histogram
value, index), but enumerate(histogram) will return (index, histogram
value). But the parameters haven't been swapped, resulting in the histogram
value being squared in the map version, but the index being squared in the
manual summing version. Depending on the values, this could result in a
large performance increase in the second version (if the total value exceeds
the maximum size of your platform's "long" type).

In addition to what Tim said, I question whether you should even be
multiplying by the index at all. The usual definition of "root mean
square" is this:

math.sqrt(sum(x * x for x in values) / len(values))

I don't know the problem that you're trying to solve, though, so maybe
I am just being confused by your terminology.

Cheers,
Ian

Ian Kelly · Jun 3, 2011

Ian, I was basing my code off Fredrik Lundh post on comparing images.
http://effbot.org/zone/pil-comparing-images.htm

Ah, now I get what it's doing. Thanks!

faster way to do this?	17	Dec 1, 2011
Faster Prime class then Ruby 1.9	0	Feb 3, 2006
how do i configure this code	0	Oct 7, 2010
tornado.web ioloop add_timeout eats CPU	3	Sep 3, 2012
Is this a good use of __metaclass__?	9	May 5, 2006
Memory leak problem (while using tkinter)	2	Dec 31, 2008
Processes not exiting	14	Jul 31, 2009
numpy help	2	Nov 3, 2006

Why is this so much faster?

Keir Rice

Terry Reedy

Ian Kelly

Ian Kelly

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads