Why does scipy make my program slow?

HYRY · Jan 16, 2007

Why do test(readdata()) and test(randomdata()) in the following program
take such different times to run?
My test file 150Hz10dB.wav has 2586024 samples, so I made the randomdata
function return a list with 2586024 samples as well.
The output is:
2586024
<type 'list'>
10.8603842736
2586024
<type 'list'>
2.16525233979
test(randomdata()) is 5x faster than test(readdata()).
If I remove "from scipy import *", I get the following result:
2586024
<type 'list'>
2.21851601473
2586024
<type 'list'>
2.13885042216

So, what is the problem with scipy?
Python 2.4.2, scipy ver. 0.5.1

import wave
from scipy import *
from time import *
import random
from array import array

def readdata():
    f = wave.open("150Hz10dB.wav", "rb")
    t = f.getparams()
    SampleRate = t[2]
    data = array("h", f.readframes(t[3]))
    f.close()
    left = data[0::2]
    mean = sum(left)/float(len(left))
    left = [abs(x-mean) for x in left]
    return left

def randomdata():
    return [random.random()*32768.0 for i in xrange(2586024)]

def test(data):
    print len(data)
    print type(data)
    envelop = []
    e = 0.0
    ga, gr = 0.977579425259, 0.999773268338
    ga1, gr1 = 1.0 - ga, 1.0 - gr
    start = clock()
    for x in data:
        if e < x:
            e *= ga
            e += ga1*x
        else:
            e *= gr
            e += gr1*x
        envelop.append(e)
    print clock() - start
    return envelop

test(readdata())
test(randomdata())

Robert Kern · Jan 16, 2007

HYRY said:
Why do test(readdata()) and test(randomdata()) in the following program
take such different times to run?
My test file 150Hz10dB.wav has 2586024 samples, so I made the randomdata
function return a list with 2586024 samples as well.
The output is:
2586024
<type 'list'>
10.8603842736
2586024
<type 'list'>
2.16525233979
test(randomdata()) is 5x faster than test(readdata()).
If I remove "from scipy import *", I get the following result:
2586024
<type 'list'>
2.21851601473
2586024
<type 'list'>
2.13885042216

So, what is the problem with scipy?

You're importing (through scipy) numpy's sum() function. The result type of that
function is a numpy scalar type. The set of scalar types was introduced for a
number of reasons, mostly having to do with being able to represent the full
range of numerical datatypes that Python does not have builtin types for.
Unfortunately, the code paths that get executed when arithmetic is performed
with such scalars are still suboptimal; I believe they are still going through
the full ufunc machinery.
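
To illustrate the point about sum(), here is a minimal sketch. It assumes numpy is installed and calls numpy's sum explicitly as np.sum rather than relying on "from scipy import *" (what that star import exposes depends on the scipy version). The sample values are made up; the variable names mirror readdata() from the original post.

```python
import numpy as np

samples = [3, -1, 4, -1, 5]   # made-up stand-in for the `left` channel

# Builtin sum: the mean is a plain Python float.
py_mean = sum(samples) / float(len(samples))

# numpy's sum: the result is a numpy scalar, and so is the mean.
np_mean = np.sum(samples) / float(len(samples))

print(type(py_mean))
print(type(np_mean))

# Every value computed from the numpy mean is a numpy scalar too,
# so the whole list handed to test() ends up holding numpy scalars.
np_left = [abs(x - np_mean) for x in samples]
print(type(np_left[0]))
```

The list itself is still a plain Python list, which is why print type(data) in the original program shows no difference between the two cases.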

--
Robert Kern

"I have come to believe that the whole world is an enigma, a harmless enigma
that is made terrible by our own mad attempt to interpret it as though it had
an underlying truth."
-- Umberto Eco

HYRY · Jan 16, 2007

Thanks. Following your hint, I changed type(data) to type(data[0]), and I get
<type 'float'>
<type 'numpy.float64'>
So, calculating with float is about 5x faster than with numpy.float64.

robert · Jan 16, 2007

HYRY said:
Thanks. Following your hint, I changed type(data) to type(data[0]), and I get
<type 'float'>
<type 'numpy.float64'>
So, calculating with float is about 5x faster than with numpy.float64.

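Approximately.
numpy functions return numpy scalar types instead of Python types: a Python
int comes back as a numpy integer scalar, a Python float as numpy.float64 (as
your output shows), and mixing a numpy scalar with a Python float yields a
numpy scalar again. This is probably ill behavior.
numpy types should only arise on explicit request, e.g.
numpy.array(l,dtype=numpy.float32)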

In your example it is best to switch to numpy/scipy types very early
(without also mixing in the Python array type) and do the
array computations with scipy:

left = [abs(x-mean) for x in left]

->

data = numpy.frombuffer(f.readframes(t[3]), dtype="h")
...
left = abs(left-mean)

Code test(data) in a similar way; see also scipy.signal.lfilter etc.

Then cast down to Python types late, e.g. float(mynumfloat) ...
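
A minimal sketch of the "numpy early, Python types late" approach, assuming numpy is installed. The byte string is made up and stands in for f.readframes(t[3]) on a 16-bit stereo file; frombuffer and tolist are my choice of calls here, not something the thread prescribes.

```python
import numpy as np

# Made-up interleaved 16-bit stereo frames (left, right, left, right, ...),
# standing in for the bytes returned by f.readframes(t[3]).
raw = np.array([100, -7, -300, 9, 500, -11], dtype="h").tobytes()

data = np.frombuffer(raw, dtype="h")   # interpret the bytes as int16 samples
left = data[0::2]                      # every second sample: the left channel
mean = left.mean()
left = np.abs(left - mean)             # one array operation, no Python loop

# Cast down late: tolist() yields plain Python floats, which are cheap
# to use in a per-sample Python loop such as test().
as_python = left.tolist()
print(as_python)
print(type(as_python[0]))
```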

The type magic, the speed loss and the pickle problems will
probably only disappear when float & int are handled as extra
(more conservative) types in numpy, with numpy scalar types only
on request. Currently numpy uses Python.

Robert

Travis Oliphant · Jan 17, 2007

Robert said:
You're importing (through scipy) numpy's sum() function. The result type of that
function is a numpy scalar type. The set of scalar types was introduced for a
number of reasons, mostly having to do with being able to represent the full
range of numerical datatypes that Python does not have builtin types for.
Unfortunately, the code paths that get executed when arithmetic is performed
with such scalars are still suboptimal; I believe they are still going through
the full ufunc machinery.

This should not be true in the 1.0 release of NumPy. The numpy scalars
do their own math which has less overhead than ufunc-based math. But,
there is still more overhead than with simple floats because mixed-type
arithmetic is handled more generically (the same algorithm covers all
the cases).

The speed could be improved but hasn't been because it is so easy to get
a Python float if you are concerned about speed.
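
As a rough way to check this on your own machine, the sketch below runs the same envelope loop as test() from the original post on numpy scalars and on the same values converted with float(). It assumes numpy is installed; the data is random, the sample count is arbitrary, and the function name envelope is only illustrative. The measured ratio will vary with the numpy and Python versions, so no timing is claimed here.

```python
import random
import timeit

import numpy as np

def envelope(data, ga=0.977579425259, gr=0.999773268338):
    """Attack/release follower with the same arithmetic as test() above."""
    ga1, gr1 = 1.0 - ga, 1.0 - gr
    e = 0.0
    out = []
    for x in data:
        if e < x:
            e = e * ga + ga1 * x
        else:
            e = e * gr + gr1 * x
        out.append(e)
    return out

random.seed(0)
# A list of numpy scalars, like the one readdata() produced ...
np_data = [np.float64(random.random() * 32768.0) for _ in range(20000)]
# ... and the same values as plain Python floats.
py_data = [float(x) for x in np_data]

t_np = timeit.timeit(lambda: envelope(np_data), number=3)
t_py = timeit.timeit(lambda: envelope(py_data), number=3)
print("numpy scalars: %.3f s   Python floats: %.3f s" % (t_np, t_py))
```

In readdata() itself, the equivalent one-line change would be mean = float(sum(left))/len(left), which keeps every later value a plain Python float.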

-Travis
