speeding up Python script

L

Luis P. Mendes

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I have a 1000 line python script that takes many hours to finish. It is
running with six inside 'for' loops.

I've searched the net for ways to speed up the proccess.

Psyco improves performance around 3% in this case which is not good enough.

How can I dramatically improve speed?

I tried to find some tool that converts Python to C automatically but
couldn't. As I don't know C, I think that weave and PyInline for
example are out of the solution.

I'm using Linux.

Luis
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCio0YHn4UHCY8rB8RAoe0AKC4rj/dxE2HGw4xtCMEmVGDgrGEEwCghO1r
+Rkgx8j1BzD0Lxq/iPey23s=
=xv4O
-----END PGP SIGNATURE-----
 
R

Rune Strand

Without seeing any code, it's hard to tell, but it's not a wild guess
that 'six inside for loops' may be replaced by more efficient ways ;-)
 
J

James Stroud

You may want to read through this case study by the BDFL.

http://www.python.org/doc/essays/list2str.html

Hi,

I have a 1000 line python script that takes many hours to finish. It is
running with six inside 'for' loops.

I've searched the net for ways to speed up the proccess.

Psyco improves performance around 3% in this case which is not good enough.

How can I dramatically improve speed?

I tried to find some tool that converts Python to C automatically but
couldn't. As I don't know C, I think that weave and PyInline for
example are out of the solution.

I'm using Linux.

Luis

--
James Stroud
UCLA-DOE Institute for Genomics and Proteomics
Box 951570
Los Angeles, CA 90095

http://www.jamesstroud.com/
 
P

Peter Dembinski

Luis P. Mendes said:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Hi,

I have a 1000 line python script that takes many hours to finish.
It is running with six inside 'for' loops.

I've searched the net for ways to speed up the proccess.

Psyco improves performance around 3% in this case which is not good
enough.

How can I dramatically improve speed?

You are looking for better algorithm, not for Python optimisations.
 
L

Luis P. Mendes

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The reason why I'm using six nested for loops is because I need to find
the best output using those six variables as input.

Here's the simplified code:

for per in range():
~ for s in range():
~ for t in range():
for v in range():
~ for n in range():
~ for l in range():
var a, \
var b, \
...
var 15 = function1(arg1, \
...
arg20)
~ var a, \
var d, \
var a, \
...
var 14 = function2(arg1, \
...
arg25)
var c, \
...
var 18 = function3(arg1, \
...
arg20)
ia = var1*var2-numPerdidos*sl
result.insert(index, [per,s,t,v,n,l,ia])
index = index + 1

thanks for the replies
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCiy2CHn4UHCY8rB8RAlafAJ4wQ8mSCGFwTTt9EH1bn54tTuI40wCfSrn8
KDSrHnsIV2cY/PKmL3ZwLdI=
=39Cm
-----END PGP SIGNATURE-----
 
R

Robert Kern

Luis said:
-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

The reason why I'm using six nested for loops is because I need to find
the best output using those six variables as input.

Here's the simplified code:

for per in range():
~ for s in range():
~ for t in range():
for v in range():
~ for n in range():
~ for l in range():

What's the size of each dimension in the parameter space? If you
absolutely need an exhaustive search, then you are going to have to live
with the long runtimes. The fact that psyco didn't help significantly
suggests that Python optimizations aren't going to help much.

You should examine your problem and determine if you really need an
exhaustive search or if you can live with some other search strategy.
Read up on "combinatorial optimzation" and "discrete optimization."

--
Robert Kern
(e-mail address removed)

"In the fields of hell where the grass grows high
Are the graves of dreams allowed to die."
-- Richard Harter
 
D

Dan Sommers

The reason why I'm using six nested for loops is because I need to find
the best output using those six variables as input.
Here's the simplified code:
for per in range():
~ for s in range():
~ for t in range():
for v in range():
~ for n in range():
~ for l in range():
var a, \
var b, \
...
var 15 = function1(arg1, \
...
arg20)
~ var a, \
var d, \
var a, \
...
var 14 = function2(arg1, \
...
arg25)
var c, \
...
var 18 = function3(arg1, \
...
arg20)
ia = var1*var2-numPerdidos*sl
result.insert(index, [per,s,t,v,n,l,ia])
index = index + 1
thanks for the replies

I don't know how large your search-space is, but if the ranges for s, t,
v, n, and/or l don't change, at the very least you can pre-compute those
ranges instead of creating a new list with range every time:

srange = range( <whatever> )
trange = range( <whatever> )
vrange = range( <whatever> )
nrange = range( <whatever> )
lrange = range( <whatever> )

for per in range( <whatever> ):
for s in srange:
for t in trange:
for v = vrange:
for n in nrange:
for l in nrange:
somevar = somefunction( per, s, t, v, n, l )
result.append( [ somevar, per, s, t, v, n, l ] )

(Appending the results is probably quicker than inserting them, but I
don't know for sure.)

But a better-than-an-exhaustive-search algorithm sounds like a good
idea, too.

Regards,
Dan
 
L

Lonnie Princehouse

Quick tip-
Try xrange instead of range. This will use dramatically less memory
if your search space is large, which will speed things up /if/ your
machine is being forced to swap.

Besides that, without seeing the code for your functions, it's hard to
offer more advice. Your algorithm is necessarily going to take a long
time because you're enumerating all of the possible values of your
functions in a six dimensional space!! Recoding the whole program in
C would get you maybe an order of magnitude increase for any given
problem size, but it will still get really slow as the size of your
search space increases.

What are function1, function2, and function3? You might really
consider looking into some numerical methods for optimization if you're
just trying to find the smallest/largest value of ia.
 
G

Grant Edwards

I have a 1000 line python script that takes many hours to
finish. It is running with six inside 'for' loops.
[...]

How can I dramatically improve speed?

In probably order of efficacy:

1) Use a better algorithm

2) Replace 'for' loops with vector operations or list
comprehensions

3) Install more RAM

4) Install a faster disk drive

5) Use a faster CPU

6) Rent DVDs to watch while program is running.

Before you do anything, you should profile your program to see
where the time is being spent.
 
J

James Carroll

It looks like your algorithm really does iterate over all values for
six variables and do lots of math.. then you can't do any better than
implementing the inner loop in C. It does look like you have some
functions that are being called that are also in python, and it would
be interesting to see if just one of those was the culprit. If so,
you might be able to replace just one of those calls with something
faster.

I'm no expert, but I have used swig once or twice, I might be able to
help. What do you think?

If you are going through all the data just to find a minimum or
something, then you might be able to get away with not visiting all of
the values of all six variables. Some sort of gradient descent
minimization is probably already out there written in C ready to be
used.

-Jim
 
L

Luis P. Mendes

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I appreciate everyone's help!

I got some ideas that I'll try to put into practice.

Regards,

Luis
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.2.4 (GNU/Linux)
Comment: Using GnuPG with Thunderbird - http://enigmail.mozdev.org

iD8DBQFCi7QQHn4UHCY8rB8RAkpWAJsGezFRCjDpVvbDiuEN8a2lPY41AgCgm6w4
3hgtkw5NkxZ6GuzC1k2dIYo=
=Epti
-----END PGP SIGNATURE-----
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,768
Messages
2,569,574
Members
45,048
Latest member
verona

Latest Threads

Top