How to increase the speed of this program?

HYRY

I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.

import wave
import array

lfile = wave.open(lfilename)
rfile = wave.open(rfilename)
ofile = wave.open(ofilename, "w")
lformat = lfile.getparams()   # (nchannels, sampwidth, framerate, nframes, comptype, compname)
rformat = rfile.getparams()
lframes = lfile.readframes(lformat[3])   # lformat[3] is nframes
rframes = rfile.readframes(rformat[3])
lfile.close()
rfile.close()
larray = array.array("h", lframes)   # 16-bit signed samples
rarray = array.array("h", rframes)
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1
oarray[0::2] = larray #2
oarray[1::2] = rarray #3
ofile.setnchannels(2)
ofile.setsampwidth(2)
ofile.setframerate(lformat[2])   # lformat[2] is framerate
ofile.setnframes(len(larray))
ofile.writeframes(oarray.tostring())
ofile.close()
 
Paul McGuire

HYRY said:
I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.
I'm not overly familiar with the array module, but one place you may be
paying a penalty is in allocating the list of 0's, and then interleaving the
larray and rarray lists.

What if you replace lines 1-3 with:

def takeOneAtATime(tupleiter):
    for i in tupleiter:
        yield i[0]
        yield i[1]

oarray = array.array("h", takeOneAtATime(itertools.izip(larray, rarray)))

Or, in place of calling takeOneAtATime, use itertools.chain:

oarray = array.array("h", itertools.chain(*itertools.izip(larray, rarray)))

Use itertools.izip (you have to import itertools somewhere up top) to take
left and right values in pairs, then use takeOneAtATime to yield those values
one at a time. The key, though, is that you aren't building a list ahead of
time, but passing a generator. On the other hand, array.array may just be
building an internal list anyway, so this may be a wash.

Also, try psyco if you can, especially with this version, or Pyrex to
optimize this data interleaving.

HTH,
-- Paul
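
(A tidied, self-contained sketch of the suggestion above, assuming the
larray/rarray arrays from the original program and Python 2's
itertools.izip; as HYRY notes further down, this approach benchmarked
slower than the slice assignments.)

import array
import itertools

def interleave(left, right):
    # Yield left[0], right[0], left[1], right[1], ... without building a list.
    for l, r in itertools.izip(left, right):
        yield l
        yield r

# larray and rarray are the two mono "h" arrays from the original program.
oarray = array.array("h", interleave(larray, rarray))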
 
Peter Otten

HYRY said:
I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Peter
 
HYRY

I think

oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1
oarray[0::2] = larray #2
oarray[1::2] = rarray #3

will be executed at the C level, but if I use itertools, the program runs
at the Python level, so the itertools version is actually slower than the
original program.
I tested #1, #2, and #3: the speed of #2 and #3 is OK, but #1 is slow.
So my question is: is there a way to create a huge array without an
initializer?
 
HYRY

Peter said:
HYRY said:
I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Peter

Thank you very much, that is just what I want.
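
(For reference, a sketch of how this fix slots into the original program:
only line #1 changes, lines #2 and #3 stay as they were. Variable names are
those from the first post; Python 2.)

ITEMSIZE = 2                              # bytes per "h" (signed short) item
size = ITEMSIZE * (len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)            # replaces line #1
oarray[0::2] = larray                     # #2 unchanged
oarray[1::2] = rarray                     # #3 unchanged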
 
Peter Otten

Peter said:
HYRY said:
I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h", [0]*N);'
10 loops, best of 3: 199 msec per loop

Peter
 
Leo Kislov

Peter said:
Peter said:
HYRY said:
I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h", [0]*N);'
10 loops, best of 3: 199 msec per loop

Funny thing is that using a huge temporary string is faster than
multiplying a small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication were as smart as string multiplication,
then the array multiplication version would be the fastest.

-- Leo
 
Peter Otten

Leo said:
Peter said:
Peter said:
HYRY wrote:

I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.

oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h", [0]*N);'
10 loops, best of 3: 199 msec per loop

Funny thing is that using a huge temporary string is faster than
multiplying a small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication were as smart as string multiplication,
then the array multiplication version would be the fastest.

That will not suffice:

$ python2.5 -m timeit -s'from array import array; from itertools import repeat; N = 10**6; init = [0]*N' 'array("h", init)'
10 loops, best of 3: 130 msec per loop

$ python2.5 -m timeit -s'from array import array; from itertools import repeat; N = 10**6; init = "\n"*(2*N)' 'array("h").fromstring(init)'
100 loops, best of 3: 5 msec per loop

A big chunk of the time is probably consumed by "casting" the list items.
Perhaps an array.fill(value, repeat) method would be useful.

Peter
 
Peter Otten

Peter said:
Leo said:
Peter said:
Peter Otten wrote:

HYRY wrote:

I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.

oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h", [0]*N);'
10 loops, best of 3: 199 msec per loop

Funny thing is that using a huge temporary string is faster than
multiplying a small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication were as smart as string multiplication,
then the array multiplication version would be the fastest.

Oops, I have to work on my reading skills. You're right, of course...
That will not suffice:

$ python2.5 -m timeit -s'from array import array; from itertools import repeat; N = 10**6; init = [0]*N' 'array("h", init)'
10 loops, best of 3: 130 msec per loop

$ python2.5 -m timeit -s'from array import array; from itertools import repeat; N = 10**6; init = "\n"*(2*N)' 'array("h").fromstring(init)'
100 loops, best of 3: 5 msec per loop

A big chunk of the time is probably consumed by "casting" the list items.
Perhaps an array.fill(value, repeat) method would be useful.

... and that could be spelled array.__mul__ as you suggest.

Peter
 
Leo Kislov

HYRY said:
Peter said:
HYRY said:
I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.
oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Peter

Thank you very much, that is just what I want.

Even faster: oarray = larray + rarray

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6; b = array('h', [0])*(N/2); c = b[:]" "a = b + c"
100 loops, best of 3: 5.7 msec per loop

-- Leo
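
(This works because the concatenation is only there to produce an output
array of the right typecode and length; the slice assignments #2 and #3
then overwrite every element, so the initial contents don't matter. A
sketch against the original program:)

oarray = larray + rarray   # replaces line #1: correct size and typecode, contents irrelevant
oarray[0::2] = larray      # #2 unchanged
oarray[1::2] = rarray      # #3 unchanged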
 
John Machin

Peter said:
Peter said:
Leo said:
Peter Otten wrote:
Peter Otten wrote:

HYRY wrote:

I want to join two mono wave files into a stereo wave file using only
the default Python modules.
Here is my program, but it is much slower than the C version, so how
can I increase the speed?
I think the problem is at lines #1, #2, and #3.

oarray = array.array("h", [0]*(len(larray)+len(rarray))) #1

ITEMSIZE = 2
size = ITEMSIZE*(len(larray) + len(rarray))
oarray = array.array("h")
oarray.fromstring("\0" * size)

may be a bit faster.

Confirmed:

$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h"); a.fromstring("\0"*(2*N))'
100 loops, best of 3: 9.68 msec per loop
$ python2.5 -m timeit -s'from array import array; N = 10**6' 'a = array("h", [0]*N);'
10 loops, best of 3: 199 msec per loop

Funny thing is that using a huge temporary string is faster than
multiplying a small array:

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h'); a.fromstring('\0'*(2*N))"
100 loops, best of 3: 9.57 msec per loop

C:\Python25>python -m timeit -s"from array import array; N = 10**6" "a = array('h','\0\0'); a*N"
10 loops, best of 3: 28.4 msec per loop

Perhaps if array multiplication were as smart as string multiplication,
then the array multiplication version would be the fastest.

Oops, I have to work on my reading skills. You're right, of course...
That will not suffice:

$ python2.5 -m timeit -s'from array import array; from itertools import repeat; N = 10**6; init = [0]*N' 'array("h", init)'
10 loops, best of 3: 130 msec per loop

$ python2.5 -m timeit -s'from array import array; from itertools import repeat; N = 10**6; init = "\n"*(2*N)' 'array("h").fromstring(init)'
100 loops, best of 3: 5 msec per loop

A big chunk of the time is probably consumed by "casting" the list items.
Perhaps an array.fill(value, repeat) method would be useful.

... and that could be spelled array.__mul__ as you suggest.

I'm extremely agnostic about the spelling :) IOW I'd be very glad of
any way [pure Python; e.g. maintaining my own version of the array
module doesn't qualify] to simply and rapidly create an array.array
instance with typecode t and number of elements n with each element
initialised to value v (default to be the zero appropriate to the
typecode).

Cheers,
John
 
Fredrik Lundh

John said:
I'm extremely agnostic about the spelling :) IOW I'd be very glad of
any way [pure Python; e.g. maintaining my own version of the array
module doesn't qualify] to simply and rapidly create an array.array
instance with typecode t and number of elements n with each element
initialised to value v (default to be the zero appropriate to the
typecode).

array(t, [v])*n

</F>
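
(A minimal usage sketch of this idiom, with illustrative values:)

from array import array

zeros = array("h", [0]) * 10**6      # one million zeroed signed shorts
filled = array("h", [42]) * 10**6    # one million copies of an arbitrary value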
 
Peter Otten

Fredrik said:
John said:
I'm extremely agnostic about the spelling :) IOW I'd be very glad of
any way [pure Python; e.g. maintaining my own version of the array
module doesn't qualify] to simply and rapidly create an array.array
instance with typecode t and number of elements n with each element
initialised to value v (default to be the zero appropriate to the
typecode).

array(t, [v])*n

Of course Leo was already there before I messed it up again.

$ python2.5 -m timeit -s'from array import array; s = "abc"' 'a = array("c", s); a*1000000'
10 loops, best of 3: 53.5 msec per loop

$ python2.5 -m timeit -s'from array import array; s = "abc"' 'a = array("c", s); s*1000000'
100 loops, best of 3: 7.63 msec per loop

So str * N is significantly faster than array * N even if the same amount of
data is copied.

Peter
 
John Machin

Fredrik said:
John said:
I'm extremely agnostic about the spelling :) IOW I'd be very glad of
any way [pure Python; e.g. maintaining my own version of the array
module doesn't qualify] to simply and rapidly create an array.array
instance with typecode t and number of elements n with each element
initialised to value v (default to be the zero appropriate to the
typecode).

array(t, [v])*n

</F>

Thanks, that's indeed faster than array(t, [v]*n) but what I had in
mind was something like an additional constructor:

array.filledarray(typecode, repeat_value, repeat_count)

which I speculate should be even faster. Looks like I'd better get a
copy of arraymodule.c and start fiddling.

Anyone who could use this? Suggestions on name? Argument order?

Functionality: same as array.array(typecode, [repeat_value]) *
repeat_count. So it would cope with array.filledarray('c', "foo", 10)

I'm presuming an additional constructor would be better than doubling
up on the existing one:

array.array(typecode[, initializer])
and
array.array(typecode[, repeat_value, repeat_count])

Cheers,
John
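
(A pure-Python sketch of roughly this interface, built on Fredrik's
one-liner above; the name and argument order are placeholders rather than
an actual array-module API, and it doesn't attempt the multi-character
'c' case.)

from array import array

def filledarray(typecode, repeat_value, repeat_count):
    # Equivalent to array(typecode, [repeat_value]) * repeat_count.
    return array(typecode, [repeat_value]) * repeat_count

zeros = filledarray('h', 0, 10**6)   # one million zeroed signed shorts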
 
Fredrik Lundh

John said:
Thanks, that's indeed faster than array(t, [v]*n) but what I had in
mind was something like an additional constructor:

array.filledarray(typecode, repeat_value, repeat_count)

which I speculate should be even faster.

before you add a new API, you should probably start by borrowing the
repeat code from Objects/stringobject.c and see if the speedup is good
enough.

</F>
 
Klaas

John said:
Thanks, that's indeed faster than array(t, [v]*n) but what I had in
mind was something like an additional constructor:

array.filledarray(typecode, repeat_value, repeat_count)

which I speculate should be even faster. Looks like I'd better get a
copy of arraymodule.c and start fiddling.

Anyone who could use this? Suggestions on name? Argument order?

Functionality: same as array.array(typecode, [repeat_value]) *
repeat_count. So it would cope with array.filledarray('c', "foo", 10)

Why not just optimize array.__mul__? The difference is clearly in the
repeated memcpy() in arraymodule.c:683. Pseudo-unrolling the loop in
Python demonstrates a speedup:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c',['\0'])*100000"
100 loops, best of 3: 3.14 msec per loop
[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c',['\0','\0','\0','\0'])*25000"
1000 loops, best of 3: 732 usec per loop
[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c','\0'*20)*5000"
10000 loops, best of 3: 148 usec per loop

Which is quite close to your fromstring solution:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c').fromstring('\0'*100000)"
10000 loops, best of 3: 137 usec per loop

In fact, you can make it about 4x faster by balancing:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c','\0'*200)*500"
10000 loops, best of 3: 32.4 usec per loop

For the record:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c','\0'*100000)"
10000 loops, best of 3: 140 usec per loop

-Mike
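
(The same balancing idea can be approximated in pure Python by growing the
array in doubling chunks instead of one element-sized copy at a time; a
rough, unbenchmarked sketch with an illustrative name, assuming count >= 1:)

from array import array

def filled(typecode, value, count):
    # Build a one-element seed, then double it so each concatenation
    # copies a progressively larger chunk rather than a single item.
    a = array(typecode, [value])
    while len(a) * 2 <= count:
        a = a + a
    return a + a[:count - len(a)]    # top up the remainder

zeros = filled('h', 0, 10**6)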
 
Klaas

Klaas said:
In fact, you can make it about 4x faster by balancing:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c','\0'*200)*500"
10000 loops, best of 3: 32.4 usec per loop

This is an unclean, minimally-tested patch which achieves reasonable
performance (about 10x faster than unpatched Python):

$ ./python -m timeit -s "from array import array" "array('c', '\0')*100000"
10000 loops, best of 3: 71.6 usec per loop

You have my permission to use this code if you want to submit a patch
to sourceforge (it needs proper benchmarking, testing, and tidying).

-Mike

Index: Modules/arraymodule.c
===================================================================
--- Modules/arraymodule.c (revision 52849)
+++ Modules/arraymodule.c (working copy)
@@ -680,10 +680,29 @@
return NULL;
p = np->ob_item;
nbytes = a->ob_size * a->ob_descr->itemsize;
- for (i = 0; i < n; i++) {
- memcpy(p, a->ob_item, nbytes);
- p += nbytes;
- }
+
+ if (n) {
+ Py_ssize_t chunk_size = nbytes;
+ Py_ssize_t copied = 0;
+ char *src = np->ob_item;
+
+ /* copy first element */
+ memcpy(p, a->ob_item, nbytes);
+ copied += nbytes;
+
+ /* copy exponentially-increasing chunks */
+ while(chunk_size < (size - copied)) {
+ memcpy(p + copied, src, chunk_size);
+ copied += chunk_size;
+ if(chunk_size < size/10)
+ chunk_size *= 2;
+ }
+ /* copy remainder */
+ while (copied < size) {
+ memcpy(p + copied, src, nbytes);
+ copied += nbytes;
+ }
+ }
return (PyObject *) np;
}
 
Klaas

Klaas said:
Klaas said:
In fact, you can make it about 4x faster by balancing:

[klaas@worbo ~]$ python -m timeit -s "from array import array" "array('c','\0'*200)*500"
10000 loops, best of 3: 32.4 usec per loop

This is an unclean, minimally-tested patch which achieves reasonable
performance (about 10x faster than unpatched Python):

<snip>

Never mind, that patch is bogus. An updated patch is here:
http://sourceforge.net/tracker/index.php?func=detail&aid=1605020&group_id=5470&atid=305470

-Mike
 
