improving a huge double-for cycle

Alexzive · Sep 18, 2008

Hello there

,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!!

Any idea to improve it?

I have already tried to group the "if statements" in a single one:

Code: Select all
if i <> j and if IN.coordinates[0] == IN[j].coordinates[0] and
if IN.coordinates[1] == IN[j].coordinates[1]:

but no improvements.

Many thanks, Alex

skip · Sep 18, 2008

Alex> Unfortunately my len(IN) is about 100.000 and the running time
Alex> about 15h !!!!

Alex> Any idea to improve it?

numpy?

http://numpy.scipy.org/
http://www.scipy.org/Numpy_Example_List

More immediately, note that you are building a list of len(IN) ints every
time through the inner loop. A quick hit might be this simple change:

indexes = range(len(IN))
for i in indexes: #scan all elements of the list IN
for j in indexes:
if i != j:
if (IN.coordinates[0] == IN[j].coordinates[0] and
IN.coordinates[1] == IN[j].coordinates[1]):
SN.append(IN.label)

Skip

Peter Otten · Sep 18, 2008

Alexzive said:
Hello there ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!!

Any idea to improve it?

I have already tried to group the "if statements" in a single one:

Code: Select all
if i <> j and if IN.coordinates[0] == IN[j].coordinates[0] and
if IN.coordinates[1] == IN[j].coordinates[1]:

but no improvements.

Many thanks, Alex

Click to expand...

When you're looking for duplicates an efficient solution is likely to be
based on a set or dict object.

# untested
from collections import defaultdict

groups = defaultdict(list)
for item in IN:
c = item.coordinates
groups[c[0], c[1]].append(item.label)
SN = []
for labels in groups.itervalues():
if len(labels) > 1:
SN.extend(labels) # or labels[1:] if you want to keep one item

Peter

Tim Chase · Sep 18, 2008

Code: Select all

for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!!

Any idea to improve it? [snip]
I have already tried to group the "if statements" in a single one:

Code: Select all
if i <> j and if IN.coordinates[0] == IN[j].coordinates[0] and
if IN.coordinates[1] == IN[j].coordinates[1]:

but no improvements.

It's like rearranging deck-chairs on the Titanic Yes, it may
give a speed up, but what's 3 seconds when you're waiting 15hr

Not knowing the len(IN[x].coordinates) or their structure, if
it's a list of len==2, you should be able to just do

if i <> j and IN.coordinates == IN[j].coordinates

or

if i <> j and IN.coordinates[:2] == IN[j].coordinates[:2]

However, again, this is just polish. The big problem is that you
have an O(N^2) algorithm that's killing you.

1) use xrange instead of range to save eating memory with a huge
unneeded array.

2) unless you need to append duplicate labels, you know that when
I and J are swapped, you'll reach the same condition again, so it
might be worth writing the outer loops to eliminate this
scenario, and in the process, but just starting at i+1 rather
than i, you can forgo the check if "i<>j".

Such changes might look something like

for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN.coordinates == IN[j].coordinates:
SN.append(IN.label)

If my college algorithms memory serves me sufficiently, this
reduces your O(N^2) to O(N log N) which will garner you some
decent time savings.

-tkc

bearophileHUGS · Sep 18, 2008

Skip:

indexes = range(len(IN))
for i in indexes: #scan all elements of the list IN
for j in indexes:

Nope, use xrange in both situations, and save a list.

Tim Chase:

for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN.coordinates == IN[j].coordinates:
SN.append(IN.label)

If my college algorithms memory serves me sufficiently, this
reduces your O(N^2) to O(N log N) which will garner you some
decent time savings.

That's O(n^2) still, it's just half matrix, a triangle.

Bye,
bearophile

Steven D'Aprano · Sep 18, 2008

Hello there ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a certain
geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Here's a better version of your algorithm, one which avoids the minor
inefficiencies but keeps the huge inefficiency:

for node1 in IN:
for node2 in IN:
if node1 is not node2:
if node1.coordinates == node2.coordinates:
SN.append(node1.label)

This assumes that node.coordinates is a type where equality is defined.
If they are a tuple or list, that should work fine.

But the huge inefficiency is that you are checking each node not once,
not twice, but 100,000 times! So you have to iterate 10,000,000,000
times, which is going to be slow no matter what you do. Especially in
pure Python.

Here's a better idea: iterate over the list once only:

seen = set()
for node in IN:
coords = tuple(node.coordinates)
if coords in seen:
SN.append(node.label)
else:
seen.add(coords)

Hope this helps.

Tim Chase · Sep 18, 2008

Tim Chase:

for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN.coordinates == IN[j].coordinates:
SN.append(IN.label)

If my college algorithms memory serves me sufficiently, this
reduces your O(N^2) to O(N log N) which will garner you some
decent time savings.

Click to expand...

That's O(n^2) still, it's just half matrix, a triangle.

Ah, good catch as I'm thinking about it more, you're right...it's
O((N^2)/2) which is just O(N^2). Sigh. I'm not awake enough
yet. However, dividing the time by 2 (from 15hr to 7.5hr) is
still a significant savings in my book

However, if list-comprehensions are faster as well, you might be
able to do something like

SN = [d.label
for (i,d) in enumerate(IN)
for j in xrange(i+1, len(IN))
if d.coordinates == IN[j].coordinates
]

or the slightly more verbose (but closer to my original code) version

SN = [IN.label
for i in xrange(len(IN))
for j in xrange(i+1, len(IN))
if IN.coordinates == IN[j].coordinates
]

To the OP: As always, throw some timing code on the various
samples you get back from the list and see what works for you.

-tkc

J. Cliff Dyer · Sep 18, 2008

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Click to expand...

Such changes might look something like

for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN.coordinates == IN[j].coordinates:
SN.append(IN.label)

Click to expand...

If you aren't checking j values less than i, you might want to do both

SN.append(IN.label)
SN.append(IN[j].label)

on the same pass.

pruebauno · Sep 18, 2008

Hello there ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!!

Any idea to improve it?

I have already tried to group the "if statements" in a single one:

Code: Select all
if i <> j and if IN.coordinates[0] == IN[j].coordinates[0] and
if IN.coordinates[1] == IN[j].coordinates[1]:

but no improvements.

Many thanks, Alex

dup=set()
SN=[]
for item in IN:
c=item.coordinates[0], item.coordinates[1]
if c in dup:
SN.append(item.label)
else:
dup.add(c)

Jake Anderson · Sep 18, 2008

psyco might help a fair bit (10x-40x) here ;->
perhaps look at dumping the data into sqlite then pulling it back out.
It (or the other databases) are designed for tossing around large lumps
of data.

Alexzive said:
Hello there ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!!

Any idea to improve it?

I have already tried to group the "if statements" in a single one:

Code: Select all
if i <> j and if IN.coordinates[0] == IN[j].coordinates[0] and
if IN.coordinates[1] == IN[j].coordinates[1]:

but no improvements.

Many thanks, Alex

--

Vapour Forge

Jake Anderson

Project Manager

Mobile: 0412 897 125

Email: (e-mail address removed)

Web Page: www.vapourforge.com

Your source for custom IT services

giltay · Sep 18, 2008

dup=set()
SN=[]
for item in IN:
c=item.coordinates[0], item.coordinates[1]
if c in dup:
SN.append(item.label)
else:
dup.add(c)

+1 for O(N)

If item.coordinates is just an (x, y) pair, you can skip building c
and save a little memory:

seen_coords = set()

for node in IN:
if node.coordinates in seen_coords:
SN.append(node.label)
else:
seen_coords.add(node.coordinates)

seen_coords gets populated with references to the existing
node.coordinates objects, instead of new tuples.

Geoff G-T

Harald Luessen · Sep 18, 2008

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

I did not test the syntax, but here is an idea with sorted lists.
It should be O(NlogN).

def sk(x):
return x.coordinates[0]

IN.sort(key=sk)
for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)
else:
break

Harald

Bruno Desthuilliers · Sep 18, 2008

Harald Luessen a écrit :
(snip)

I did not test the syntax,
but here is an idea with sorted lists.
It should be O(NlogN).

def sk(x):
return x.coordinates[0]

IN.sort(key=sk)
for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)
else:
break

The syntax is ok. Not the results, or so it seems (cf my other post in
this thread). But you still get a good point for noticing the redundant
tests in the inner loop !-)

Bruno Desthuilliers · Sep 18, 2008

Alexzive a écrit :

Hello there ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!!

Any idea to improve it?

A couple ones have been submitted. Harald gets a point about the
redundant tests (even if his solution seems to be broken, cf below) -
your inner loop should have looked like :

for j in xrange(i+1, len(IN))

Now the obvious winner is pruebono - even unoptimized, using sets seems
to be *way* faster than even the most optimized corrected version of
your algorithm.

Here's a quick bench - please everyone doublecheck it to make sure it's ok:

from timeit import Timer
import random

class Node(object):
def __init__(self, x, y):
self.coordinates = (x, y)
def __repr__(self):
return repr(self.coordinates)

# how to build src:
# src = [(random.randrange(10),random.randrange(10)) for x in xrange(100)]
src = [
(4, 9), (5, 0), (6, 6), (7, 2), (3, 6), (9, 6), (0, 1), (1, 6),
(0, 5), (1, 2), (8, 9), (5, 4), (1, 6), (7, 6), (9, 1), (7, 6),
(0, 1), (7, 4), (7, 4), (8, 4), (8, 4), (3, 5), (9, 6), (6, 1),
(3, 4), (4, 5), (0, 5), (6, 3), (2, 4), (1, 6), (9, 5), (1, 2),
(5, 8), (8, 5), (3, 1), (9, 4), (9, 4), (3, 3), (4, 8), (9, 7),
(8, 4), (6, 2), (1, 5), (5, 8), (8, 6), (0, 8), (5, 2), (3, 4),
(0, 5), (4, 4), (2, 9), (7, 7), (1, 0), (4, 2), (5, 7), (0, 4),
(2, 5), (0, 8), (7, 3), (9, 1), (0, 4), (5, 0), (4, 9), (0, 6),
(3, 0), (3, 0), (3, 9), (8, 3), (7, 9), (8, 5), (7, 6), (1, 5),
(0, 6), (5, 9), (6, 8), (0, 0), (4, 1), (3, 3), (5, 4), (5, 3),
(6, 1), (5, 4), (4, 5), (5, 8), (4, 1), (3, 6), (1, 9), (0, 5),
(6, 5), (5, 5), (6, 0), (0, 9), (2, 6), (0, 7), (5, 9), (7, 3),
(7, 9), (5, 4), (4, 9), (2, 9)
]
IN = [Node(x, y) for x, y in src]

def doubles0():
SN = []
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
#print i, j
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN)
return SN

def doubles1():
"first solve an algoritmic problem"
SN = []
for i in range(len(IN)):
for j in range(i+1, len(IN)):
#print i, j
#if i != j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN)
return SN

def doubles2():
"then avoid buildin useless lists"
SN = []
for i in xrange(len(IN)):
for j in xrange(i+1, len(IN)):
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN)
return SN

def doubles3():
"then avoid uselessly calling a constant operation"
SN = []
in_len = len(IN)
for i in xrange(in_len):
for j in xrange(i+1, in_len):
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN)
return SN

def doubles4():
"then alias commonly used methods"
SN = []
sn_append = SN.append
in_len = len(IN)
for i in xrange(in_len):
for j in xrange(i+1, in_len):
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
sn_append(IN)
return SN

def doubles5():
"then alias a couple other things"
SN = []
sn_append = SN.append
in_len = len(IN)
for i in xrange(in_len):
node_i = IN
coords_i = node_i.coordinates
for j in xrange(i+1, in_len):
if coords_i[0] == IN[j].coordinates[0]:
if coords_i[1] == IN[j].coordinates[1]:
sn_append(node_i)
return SN

def doubles6():
"then simplify tests"
SN = []
sn_append = SN.append
in_len = len(IN)
for i in xrange(in_len):
node_i = IN
coords_i = node_i.coordinates
for j in xrange(i+1, in_len):
if coords_i == IN[j].coordinates:
sn_append(node_i)
return SN

# Harald : uncomment this and run test_results. As far as I can tell, it
# doesn't yields the expected results

## IN7 = IN[:]
## def sortk7(n):
## return n.coordinates[0]

## def doubles7():
## "is ordering better ? - Nope Sir, it's broken..."
## IN7.sort(key=sortk)
## SN = []
## sn_append = SN.append
## in_len = len(IN)
## for i in xrange(in_len):
## node_i = IN
## coords_i = node_i.coordinates
## for j in xrange(i+1, in_len):
## if coords_i[0] == IN[j].coordinates[0]:
## if coords_i[1] == IN[j].coordinates[1]:
## sn_append(node_i)
## else:
## break

## return SN

def doubles8():
"Using a set ?"
dup = set()
SN = []
for item in IN:
c = item.coordinates
if c in dup:
SN.append(item)
else:
dup.add(c)
return SN

def doubles9():
"Using a set is way faster - but can still be improved a bit"
dup = set()
dup_add = dup.add
SN = []
sn_append = SN.append

for item in IN:
c = item.coordinates
if c in dup:
sn_append(item)
else:
dup_add(c)
return SN

# need this for tests:
names_funcs = sorted(
(n, f) for n, f in locals().iteritems()
if n.startswith('doubles')
)

def test_results():
"""
make sure all solutions give correct results,
assuming the OP solution is at least correct !-)
"""
results = [
(n, set(o.coordinates for o in f()))
for n, f in names_funcs
]
_, correct = results[0]
ok = True
for n, r in results[1:]:
if r != correct:
print "error on %s ?\n expected %s\n, got %s" \
% (n, correct, r)
ok = False
return ok

def test_times():
" And the winner is..."
results = [
(n, Timer(
'%s()' % n,
'from __main__ import %s' % n
).timeit(100)
)
for n, _ in names_funcs
]
print "\n".join("%s : %s" % r for r in results)

Results here (py2.5, gentoo linux, athlonxp1800, 512 ram):
doubles0 : 1.55667901039
doubles1 : 0.719144105911
doubles2 : 0.703393936157
doubles3 : 0.700654983521
doubles4 : 0.706257104874
doubles5 : 0.528184890747
doubles6 : 0.461633205414
doubles8 : 0.0134379863739
doubles9 : 0.0108540058136
Not surprisingly, half less iterations makes for half less time.
Aliasing, as often, proves to be a good optimization too. But obviously,
using the correct data structure / algorithm combo is the key : simpler
code, and 115 times faster (143 times with aliasing). If pruebono
solution's is correct (and it as AFAICT), your 15 hours computation
should by now take less than 10 minutes...

Paul Hankin · Sep 18, 2008

Hello there ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Unfortunately my len(IN) is about 100.000 and the running time about
15h !!!!

Any idea to improve it?

I have already tried to group the "if statements" in a single one:

Code: Select all
if i <> j and if IN.coordinates[0] == IN[j].coordinates[0] and
if IN.coordinates[1] == IN[j].coordinates[1]:

A simple O(N) algorithm:

from collections import defaultdict

d = defaultdict(int)
for x in IN:
d[x] += 1

SN = [x for (x, c) in d.iteritems() if c > 1]

Paul Hankin · Sep 18, 2008

Hello there ,

I am a python newbie and need to run following code for a task in an
external simulation programm called "Abaqus" which makes use of python
to access the mesh (ensamble of nodes with xy coordinates) of a
certain geometrical model.

[IN is the starting input containing the nodes to be check, there are
some double nodes with the same x and y coordinates which need to be
removed. SN is the output containing such double nodes]

Code: Select all
for i in range(len(IN)): #scan all elements of the list IN
for j in range(len(IN)):
if i <> j:
if IN.coordinates[0] == IN[j].coordinates[0]:
if IN.coordinates[1] == IN[j].coordinates[1]:
SN.append(IN.label)

Using only a little extra storage to compute IN (but O(N log N)
time complexity):

import itertools

IN.sort()
SN = [k for k, v in itertools.groupby(IN) if len(list(v)) > 1]

bearophileHUGS · Sep 18, 2008

Bruno Desthuilliers:

def doubles9():
...
SN = []
sn_append = SN.append

Few more:

import psyco

def doubles10():
dup = set()
SN = []
for item in IN:
c = item.coordinates
if c in dup:
SN.append(item)
else:
dup.add(c)
return SN

psyco.bind(doubles10)

from collections import deque

def doubles11():
dup = set()
dup_add = dup.add
SN = deque()
sn_append = SN.append

for item in IN:
c = item.coordinates
if c in dup:
sn_append(item)
else:
dup_add(c)
return SN

def doubles12():
dup = set()
SN = deque()
for item in IN:
c = item.coordinates
if c in dup:
SN.append(item)
else:
dup.add(c)
return SN

psyco.bind(doubles12)

Timings:

doubles00 : 0.522365288653
doubles01 : 0.247219812198
doubles02 : 0.237889823898
doubles03 : 0.238638921389
doubles04 : 0.23821698217
doubles05 : 0.177042495425
doubles06 : 0.13166199162
doubles08 : 0.00569725197252
doubles09 : 0.00418566685667
doubles10 : 0.00192086920869
doubles11 : 0.00403324533245
doubles12 : 0.00184026840268

Hopefully that's fast enough

Bye,
bearophile

Steven D'Aprano · Sep 18, 2008

Now the obvious winner is pruebono - even unoptimized, using sets seems
to be *way* faster than even the most optimized corrected version of
your algorithm.

I'm not being snarky about losing priority here, but I submitted
essentially the same solution two hours earlier than pruebono.

Are people not seeing my posts? Have I been kill-filed by everyone except
Mensator? I also asked a question about HTTPError and I haven't seen any
responses at all.

bearophileHUGS · Sep 18, 2008

bearophile:

doubles12 : 0.00184026840268

For fun, and to have an almost baseline reference, I have done a
little benchmark in D too:

import std.stdio: writefln;

// Using Tango std lib you don't need all this
version (Win32) {
import std.c.windows.windows;
double clock() {
long t;
QueryPerformanceCounter(&t);
return cast(double)t / queryPerformanceFrequency;
}

long queryPerformanceFrequency;
static this() {
QueryPerformanceFrequency(&queryPerformanceFrequency);
}
}

union N {
struct { int x, y; }
long xy;
}

auto IN = [
N(4, 9), N(5, 0), N(6, 6), N(7, 2), N(3, 6), N(9, 6), N(0, 1), N(1,
6),
N(0, 5), N(1, 2), N(8, 9), N(5, 4), N(1, 6), N(7, 6), N(9, 1), N(7,
6),
N(0, 1), N(7, 4), N(7, 4), N(8, 4), N(8, 4), N(3, 5), N(9, 6), N(6,
1),
N(3, 4), N(4, 5), N(0, 5), N(6, 3), N(2, 4), N(1, 6), N(9, 5), N(1,
2),
N(5, 8), N(8, 5), N(3, 1), N(9, 4), N(9, 4), N(3, 3), N(4, 8), N(9,
7),
N(8, 4), N(6, 2), N(1, 5), N(5, 8), N(8, 6), N(0, 8), N(5, 2), N(3,
4),
N(0, 5), N(4, 4), N(2, 9), N(7, 7), N(1, 0), N(4, 2), N(5, 7), N(0,
4),
N(2, 5), N(0, 8), N(7, 3), N(9, 1), N(0, 4), N(5, 0), N(4, 9), N(0,
6),
N(3, 0), N(3, 0), N(3, 9), N(8, 3), N(7, 9), N(8, 5), N(7, 6), N(1,
5),
N(0, 6), N(5, 9), N(6, 8), N(0, 0), N(4, 1), N(3, 3), N(5, 4), N(5,
3),
N(6, 1), N(5, 4), N(4, 5), N(5, 8), N(4, 1), N(3, 6), N(1, 9), N(0,
5),
N(6, 5), N(5, 5), N(6, 0), N(0, 9), N(2, 6), N(0, 7), N(5, 9), N(7,
3),
N(7, 9), N(5, 4), N(4, 9), N(2, 9)
];

N[] doubles13() {
size_t[N] dup; // used as set
N[] SN;
foreach (item; IN) {
if (item in dup)
SN ~= item;
else
dup[item] = 0;
}
return SN;
}

N[] doubles0() {
N[] SN;
for (int i; i < IN.length; i++)
for (int j; j < IN.length; j++)
if (i != j)
if (IN == IN[j])
SN ~= IN;
return SN;
}

void test_results() {
size_t[N] set1, set2;
foreach (n; doubles0())
set1[n] = 0;
foreach (n; doubles13())
set2[n] = 0;
if (set1.keys.sort != set2.keys.sort)
throw new Error("ERROR");
}

void main() {
int n = 150_000;
test_results();

int count = n;
auto t0 = clock();
while (count--)
doubles13();
auto t1 = clock();

writefln("%0.10f", cast(double)(t1 - t0) / n);
}

doubles13() needs 0.000027-0.000037 seconds (*), about 60-75 times
faster than doubles12, this means about 3-4 seconds instead of 15h (on
the original computer).
Using C++ with GCC (using a <hash_map> for dup and a <vector> for SN)
you can probably go 10-40% faster still

(*) Probably there's such variability of timings because the current
DMD compiler I have used doesn't add "align" instructions in the asm
it produces as GCC does. With modern CPUs very sensitive to code
alignment this leads to even 20-30% runtime differences in small tight
loops.

Bye,
bearophile

Terry Reedy · Sep 19, 2008

Steven said:
I'm not being snarky about losing priority here, but I submitted
essentially the same solution two hours earlier than pruebono.

Are people not seeing my posts? Have I been kill-filed by everyone except
Mensator? I also asked a question about HTTPError and I haven't seen any
responses at all.

Yes, after figuring out what to do from the original post, I saw yours
and then Pruebono's and decided that since two people had submitted the
jackpot algorithm, I need not say more. I will say this: this solution
amounts to finding equivalence classes (the sets of items with a given
'key') and then finding the classes (sets) with more than one member.
Defaultdict is great for this.

tjr

S N log log N sort	6	Oct 5, 2025
Index Error during backpropagation in a multilayer neural network.	1	Jun 17, 2023
Need help with this Python code.	2	Jun 13, 2023
speed up a numpy code with huge array	5	May 25, 2010
how to improve this cycle (extracting data from structured array)?	2	Feb 11, 2010
My subroutine won't call within a for and if...is this normal?	1	Jan 25, 2022
More efficient merge sort	2	May 11, 2022
A more efficient code	1	Apr 11, 2022

improving a huge double-for cycle

Alexzive

skip

Peter Otten

Tim Chase

bearophileHUGS

Steven D'Aprano

Tim Chase

J. Cliff Dyer

pruebauno

Jake Anderson

giltay

Harald Luessen

Bruno Desthuilliers

Bruno Desthuilliers

Paul Hankin

Paul Hankin

bearophileHUGS

Steven D'Aprano

bearophileHUGS

Terry Reedy

Ask a Question

Similar Threads

Members online

Forum statistics

Latest Threads