Why is this so much faster?

Discussion in 'Python' started by Keir Rice, Jun 2, 2011.

1. Keir RiceGuest

Hi All,

The following function was showing up in my profiles as a large bottle neck:

# Slow version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
intermediateResult = map(lambda (i, h): h*(i**2), zip(histogram, range(255)))
totalSum = sum(intermediateResult)

# calculate rms
return math.sqrt(totalSum / self.Area())

So after a bit of trial and error I came up with the following function which is a lot faster:

# Fast version
def RMSBand(self, histogram):
"""Calculates the root-mean-squared value for the given colour stream histogram."""
totalSum = 0
for i, h in enumerate(histogram):
totalSum += h*(i**2)

# calculate rms
return math.sqrt(totalSum / self.Area())

My question is why is the second method so much faster?
Is it the removal of the extra function calls?
Is it skipping the creation of a list?
Do the basic arithmetic operators get pushed up into C code?

Any insights into the whats really happening behind the scenes would be appreciated.

Keir

Keir Rice, Jun 2, 2011

2. Terry ReedyGuest

On 6/2/2011 6:07 PM, Keir Rice wrote:
> Hi All,
>
> The following function was showing up in my profiles as a large bottle neck:
>
> # Slow version
> def RMSBand(self, histogram):
> """Calculates the root-mean-squared value for the given colour stream histogram."""
> intermediateResult = map(lambda (i, h): h*(i**2), zip(histogram, range(255)))
> totalSum = sum(intermediateResult)
>
> # calculate rms
> return math.sqrt(totalSum / self.Area())
>
> So after a bit of trial and error I came up with the following function which is a lot faster:
>
> # Fast version
> def RMSBand(self, histogram):
> """Calculates the root-mean-squared value for the given colour stream histogram."""
> totalSum = 0
> for i, h in enumerate(histogram):
> totalSum += h*(i**2)

> # calculate rms
> return math.sqrt(totalSum / self.Area())
>
> My question is why is the second method so much faster?
> Is it the removal of the extra function calls?

Yes. Map is only 'fast' when one already has a function and is going to
call it repeatedly regardless of the other code. When one has an
expression, wrapping it as a function to use map is surely slower. Have
you tried

return math.sqrt(sum([h*i*i for i,h in enumerate(histogram)])
/ self.Area())

or same without [] brackets?

i*i should be faster than i**2 in any version.

> Is it skipping the creation of a list?

A bit.

See Tim's response.

--
Terry Jan Reedy

Terry Reedy, Jun 3, 2011

3. Ian KellyGuest

On Thu, Jun 2, 2011 at 4:38 PM, Tim Delaney <> wrote:
> First of all, do you have the parameters to the lambda the wrong way around
> in the map() version? zip(histogram, range(255)) will return (histogram
> value, index), but enumerate(histogram) will return (index, histogram
> value). But the parameters haven't been swapped, resulting in the histogram
> value being squared in the map version, but the index being squared in the
> manual summing version. Depending on the values, this could result in a
> large performance increase in the second version (if the total value exceeds
> the maximum size of your platform's "long" type).

In addition to what Tim said, I question whether you should even be
multiplying by the index at all. The usual definition of "root mean
square" is this:

math.sqrt(sum(x * x for x in values) / len(values))

I don't know the problem that you're trying to solve, though, so maybe
I am just being confused by your terminology.

Cheers,
Ian

Ian Kelly, Jun 3, 2011
4. Ian KellyGuest

Ian Kelly, Jun 3, 2011