improving a huge double-for cycle

G

Gabriel Genellina

Also I never saw a list where the threading often goes wrong like
this here - is there any special setup or is it just peoples MUA
which screws up?

Perhaps it's due to the newsgroup/list duality...
 
J

Jake Anderson

Bruno said:
Alexzive a écrit :
Hello there :) ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)



Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!! :(

Any idea to improve it?


A couple ones have been submitted. Harald gets a point about the
redundant tests (even if his solution seems to be broken, cf below) -
your inner loop should have looked like :

for j in xrange(i+1, len(IN))

Now the obvious winner is pruebono - even unoptimized, using sets seems
to be *way* faster than even the most optimized corrected version of
your algorithm.

Here's a quick bench - please everyone doublecheck it to make sure it's ok:

Results here (py2.5, gentoo linux, athlonxp1800, 512 ram):

doubles0 : 1.55667901039
doubles1 : 0.719144105911
doubles2 : 0.703393936157
doubles3 : 0.700654983521
doubles4 : 0.706257104874
doubles5 : 0.528184890747
doubles6 : 0.461633205414
doubles8 : 0.0134379863739
doubles9 : 0.0108540058136

Not surprisingly, half less iterations makes for half less time.
Aliasing, as often, proves to be a good optimization too. But obviously,
using the correct data structure / algorithm combo is the key : simpler
code, and 115 times faster (143 times with aliasing). If pruebono
solution's is correct (and it as AFAICT), your 15 hours computation
should by now take less than 10 minutes...

Ubuntu 8.04 core2 2.6(i think)
without psycho
doubles0 : 0.610555171967
doubles1 : 0.29314494133
doubles2 : 0.286273956299
doubles3 : 0.281984090805
doubles4 : 0.28240609169
doubles5 : 0.207377910614
doubles6 : 0.156388044357
doubles8 : 0.00533080101013
doubles9 : 0.00458884239197

with psycho
doubles0 : 0.127684116364
doubles1 : 0.069571018219
doubles2 : 0.064826965332
doubles3 : 0.0702300071716
doubles4 : 0.0647261142731
doubles5 : 0.0522589683533
doubles6 : 0.0437579154968
doubles8 : 0.00190806388855
doubles9 : 0.00214099884033

On this small test its a variance between ~6x to 2X still its basically
free so why not ;->
 
G

giltay

I'm not being snarky about losing priority here, but I submitted
essentially the same solution two hours earlier than pruebono.

My apologies (seriosuly). In this case, I think it might just be
haste. For what it's worth, I saw your post (on Google Groups), but I
skipped over it. You wrote two solutions, one slow and one fast (the
latter being the same as pruebono's). You put the slow one at the
top, I saw

for ...
for ...

and went straight to the next message without reading the better
solution. I knew that there was only one for loop necessary, so I
didn't bother reading on. Actually, I missed pruebono's post, too,
until after I figured it out myself (but before I posted).

That several people came up with the nigh exact same solution, modulo
variable names only, says something about the Zen of Python.

Geoff G-T
 
H

Harald Luessen

# Harald : uncomment this and run test_results. As far as I can tell, it
# doesn't yields the expected results

## IN7 = IN[:]
## def sortk7(n):
## return n.coordinates[0]

## def doubles7():
## "is ordering better ? - Nope Sir, it's broken..."
## IN7.sort(key=sortk)
## SN = []
## sn_append = SN.append
## in_len = len(IN)
## for i in xrange(in_len):
## node_i = IN
## coords_i = node_i.coordinates
## for j in xrange(i+1, in_len):
## if coords_i[0] == IN[j].coordinates[0]:
## if coords_i[1] == IN[j].coordinates[1]:
## sn_append(node_i)
## else:
## break
## return SN ....
Results here (py2.5, gentoo linux, athlonxp1800, 512 ram):
doubles0 : 1.55667901039
doubles1 : 0.719144105911
doubles2 : 0.703393936157
doubles3 : 0.700654983521
doubles4 : 0.706257104874
doubles5 : 0.528184890747
doubles6 : 0.461633205414
doubles8 : 0.0134379863739
doubles9 : 0.0108540058136


When you change my code then do it right. :)
You forgot to change the IN to IN7 at _every_ place.
And the sortk should be sortk7 in _both_ places.

I never let the code run before myself. I just wrote it
in the newsreader. But now i did and I have a second
version as bonus.


IN7 = IN[:]

def sortk7(n):
return n.coordinates[0], n.coordinates[1]

def doubles7():
IN7.sort(key=sortk7)
SN = []
sn_append = SN.append
in_len = len(IN7)
for i in xrange(in_len):
node_i = IN7
coords_i = node_i.coordinates
for j in xrange(i+1, in_len):
if coords_i[0] == IN7[j].coordinates[0]:
if coords_i[1] == IN7[j].coordinates[1]:
sn_append(node_i)
else:
break
return SN


def comp7( x, y ):
return cmp( x.coordinates, y.coordinates )

def doubles7a():
IN7.sort( comp7 )
SN = []
sn_append = SN.append
in_len = len(IN7)
for i in xrange(in_len):
node_i = IN7
for j in xrange(i+1, in_len):
node_j = IN7[j]
if comp7( node_i, node_j ) == 0:
sn_append(node_i)
else:
break
return SN


Here are the results. (py2.5, WindowsXP, Pentium4, 2.6GHz, 1.5GB):
My version is not so bad.

doubles0 : 1.03830598582
doubles1 : 0.47943719104
doubles2 : 0.487412506338
doubles3 : 0.475924733451
doubles4 : 0.466548681466
doubles5 : 0.340487967046
doubles6 : 0.278480365521
doubles7 : 0.0953190978183
doubles7a : 0.0784233750379
doubles8 : 0.010236496538
doubles9 : 0.00742803903848


Harald
 
P

pruebauno

I'm not being snarky about losing priority here, but I submitted
essentially the same solution two hours earlier than pruebono.

Are people not seeing my posts? Have I been kill-filed by everyone except
Mensator? I also asked a question about HTTPError and I haven't seen any
responses at all.

I see your post now, but I didn't notice it at the time I posted.
Could be because I am using the noticeable bad Google groups interface
that is known for being unreliable. Duplicate solutions on Usenet are
almost a given and I consider duplicate solutions a good thing, it
means that other people will be able to understand that code.

In any case I am not here for glory I am posting under a pseudonym so
nobody discovers that I am slacking off at work reading Usen[carrier
lost]
 
B

Bruno Desthuilliers

Steven D'Aprano a écrit :
I'm not being snarky about losing priority here, but I submitted
essentially the same solution two hours earlier than pruebono.

My bad... Now that you mention it, I see you effectively did. My fault,
I did a too-fast screening of submitted solutions, and hereby declare
you Steven as the winner.
Are people not seeing my posts? Have I been kill-filed by everyone except
Mensator?

I do see your posts. Whether I read them entirely is another question -
and in this concrete case, the answer is obviously 'nope'. Put the blame
on me.
I also asked a question about HTTPError and I haven't seen any
responses at all.

This is another problem - it happens that no one has a clue, or that the
one having a clue lack time (or willingness) to anwer. Once again, sorry
if me missing your correct answer drives you paranoid :(
 
B

Bruno Desthuilliers

Bruno Desthuilliers a écrit :
Alexzive a écrit : (snip)

Now the obvious winner is pruebono

Correction (my bad) : Steven D'Aprano submitted the set-based solution
first. So the winners are Steven and pruebono.
 
B

Bruno Desthuilliers

Harald Luessen a écrit :
# Harald : uncomment this and run test_results. As far as I can tell, it
# doesn't yields the expected results

## IN7 = IN[:]
## def sortk7(n):
## return n.coordinates[0]

## def doubles7():
## "is ordering better ? - Nope Sir, it's broken..."
## IN7.sort(key=sortk)
## SN = []
## sn_append = SN.append
## in_len = len(IN)
## for i in xrange(in_len):
## node_i = IN
## coords_i = node_i.coordinates
## for j in xrange(i+1, in_len):
## if coords_i[0] == IN[j].coordinates[0]:
## if coords_i[1] == IN[j].coordinates[1]:
## sn_append(node_i)
## else:
## break
## return SN ...
Results here (py2.5, gentoo linux, athlonxp1800, 512 ram):
test_results() True
test_times()
doubles0 : 1.55667901039
doubles1 : 0.719144105911
doubles2 : 0.703393936157
doubles3 : 0.700654983521
doubles4 : 0.706257104874
doubles5 : 0.528184890747
doubles6 : 0.461633205414
doubles8 : 0.0134379863739
doubles9 : 0.0108540058136


When you change my code then do it right. :)
You forgot to change the IN to IN7 at _every_ place.
And the sortk should be sortk7 in _both_ places.


doh :(

Sorry Harald, my fault, you're right...
I never let the code run before myself. I just wrote it
in the newsreader. But now i did and I have a second
version as bonus.


IN7 = IN[:]

def sortk7(n):
return n.coordinates[0], n.coordinates[1]

def doubles7():
IN7.sort(key=sortk7)
SN = []
sn_append = SN.append
in_len = len(IN7)
for i in xrange(in_len):
node_i = IN7
coords_i = node_i.coordinates
for j in xrange(i+1, in_len):
if coords_i[0] == IN7[j].coordinates[0]:
if coords_i[1] == IN7[j].coordinates[1]:
sn_append(node_i)
else:
break
return SN


def comp7( x, y ):
return cmp( x.coordinates, y.coordinates )

def doubles7a():
IN7.sort( comp7 )
SN = []
sn_append = SN.append
in_len = len(IN7)
for i in xrange(in_len):
node_i = IN7
for j in xrange(i+1, in_len):
node_j = IN7[j]
if comp7( node_i, node_j ) == 0:
sn_append(node_i)
else:
break
return SN


Here are the results. (py2.5, WindowsXP, Pentium4, 2.6GHz, 1.5GB):
My version is not so bad.

doubles0 : 1.03830598582
doubles1 : 0.47943719104
doubles2 : 0.487412506338
doubles3 : 0.475924733451
doubles4 : 0.466548681466
doubles5 : 0.340487967046
doubles6 : 0.278480365521
doubles7 : 0.0953190978183
doubles7a : 0.0784233750379
doubles8 : 0.010236496538
doubles9 : 0.00742803903848


Point taken. Now there's one thing I find questionnable with your
solution (which is probably why I didn't bother double-checking it when
it *appeared* to be broken) : you assume that it's ok to loose original
ordering, which I strongly suspect is not the case for the OP use case,
and given the OP use case list's size, working on a copy might not be an
acceptable solution.

But anyway: this is not an excuse for me having broken your code. My
apologies.
 
S

Steven D'Aprano

Once again, sorry
if me missing your correct answer drives you paranoid :)

What do you mean by that? How many other people have been talking about
me?

*wink*
 
T

Tim Chase

km said:
how abt this ?

N = len(IN)
for k in range(N):
for j in range(N):
if j >= k: # or k <= j
doSomething()

This has the root problem that the "if" statement is evaluated
N*N times, which is ugly/slow O(N^2) behavior. My solution
managed to reduce it by a constant multiplier, but several folks
proposed a more elegant O(N) solution which was leaps & bounds
faster.

-tkc
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
474,432
Messages
2,571,682
Members
48,796
Latest member
Greg L.

Latest Threads

Top