To count the number of quadruplets with sum = 0

n00m

I have no NumPy to test it...
Without Psyco, Anton's code is the winner: ~48 sec vs ~58 sec for my code.
But with Psyco my runtime is ~28 sec; Anton's is ~30 sec (PC: 1.6 GHz, 512 MB).
Not so bad, keeping in mind that there are 256,000 billion (4000^4) quadruplets to check :)


import psyco, time
psyco.full()
t = time.clock()

def main():
    q, w, e, r, sch, h = [], [], [], [], 0, {}

    f = open("D:/m4000.txt", "rt")
    # the first line of the file holds the number of rows
    for o in range(int(f.readline())):
        row = map(int, f.readline().split())
        q.append(row[0])
        w.append(row[1])
        e.append(row[2])
        r.append(row[3])
    f.close()

    # count how many (x, y) pairs from the first two columns produce each sum
    for x in q:
        for y in w:
            if h.has_key(x+y):
                h[x+y] += 1
            else:
                h[x+y] = 1

    # for every pair from the last two columns, add the matching pair counts
    for x in e:
        for y in r:
            if h.has_key(-(x+y)):
                sch += h[-(x+y)]

    q, w, e, r, h = None, None, None, None, None
    print sch

main()
print time.clock() - t
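
The solvers in this thread all expect an input file whose first line is the row count, followed by that many rows of four integers. For anyone wanting to reproduce the timings, here is a minimal sketch of a generator for such a file; the file name, row count and value range are assumptions, not taken from the thread:

import random

def write_test_file(path="m4000.txt", n=4000, lo=-10**7, hi=10**7):
    # first line: number of rows; then n rows of four space-separated integers
    f = open(path, "w")
    f.write("%d\n" % n)
    for i in range(n):
        row = tuple(random.randint(lo, hi) for j in range(4))
        f.write("%d %d %d %d\n" % row)
    f.close()

write_test_file()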
 
bearophileHUGS

n00m:
I have no NumPy to test it...
Without Psyco, Anton's code is the winner: ~48 sec vs ~58 sec for my code.
But with Psyco my runtime is ~28 sec; Anton's is ~30 sec (PC: 1.6 GHz, 512 MB).
Not so bad, keeping in mind that there are 256,000 billion quadruplets to check :)

I have oiled it a bit; you can try the speed of this too (for dicts, "in" is always faster than has_key()).

from collections import defaultdict
import time, psyco

def main():
    sch = 0
    q, w, e, r = [], [], [], []
    h = defaultdict(int)

    datafile = file("m1000.txt")
    datafile.next()
    xrows = (map(int, line.split()) for line in datafile)
    q, w, e, r = zip(*xrows)

    for x in q:
        for y in w:
            h[x+y] += 1

    for x in e:
        for y in r:
            if -x-y in h:
                sch += h[-x-y]

    print sch

t = time.clock()
psyco.full()
main()
print round(time.clock() - t, 2), "secs"
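
As a quick check of the claim that "in" beats has_key(), a micro-benchmark along these lines could be used; the dictionary size and probe key are arbitrary assumptions, not from the thread:

import timeit

setup = "d = dict.fromkeys(xrange(1000000), 1)"
# membership test via the "in" operator (no attribute lookup or method call)
print timeit.Timer("-5 in d", setup).timeit()
# the same missing-key probe via has_key()
print timeit.Timer("d.has_key(-5)", setup).timeit()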
 
Paul Rubin

for x in e:
    for y in r:
        if -x-y in h:
            sch += h[-x-y]

I wonder whether

g = h.get
for x in e:
    for y in r:
        if -x-y in h:
            sch += g(-x-y, 0)

might be a little bit faster. Also, -x-y could be saved in a variable.
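
Putting both ideas together, the inner loop might become something like this sketch (not from the post):

g = h.get
for x in e:
    for y in r:
        s = -x - y          # save -x-y once instead of computing it twice
        if s in h:
            sch += g(s, 0)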

It's unfortunate that array.array objects don't support the .sort()
operation. It would be interesting to compare the sorting-based scheme
with the hashing-based one under psyco. O(n**2*log(n)) local memory
references might be faster than O(n**2) hash operations and random
lookups that almost always miss the cache.
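
A rough sketch of the sorting-based scheme being suggested, for comparison against the dict version (function and variable names are made up; bisect is used to count occurrences in the sorted sum list):

from bisect import bisect_left, bisect_right

def count_with_sorting(q, w, e, r):
    # all n*n sums of the first two columns, sorted once
    ab = sorted(x + y for x in q for y in w)
    sch = 0
    for x in e:
        for y in r:
            s = -x - y
            # occurrences of s in the sorted list: one binary search pair per lookup
            sch += bisect_right(ab, s) - bisect_left(ab, s)
    return sch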
 
mark.dufour

Paul said:
Yeah, I see now that we both used the same algorithm. At first glance
I thought you had done something much slower. The 10 second limit
they gave looks like they intended to do it about this way, but with a
compiled language. 68 seconds isn't so bad for 4000 entries in pure
CPython. Next thing to do I think is use psyco or pyrex.

FWIW, the original program can also be compiled with Shed Skin (http://mark.dufour.googlepages.com), an experimental (static-)Python-to-C++ compiler, resulting in a speedup of about 8 times for a single test with 500 tuples. Here's a slightly modified version that works with Shed Skin CVS at least:

import time
t = time.clock()

q, w, e, r, sch, h = [], [], [], [], 0, {}

f = open("m33.txt", "rt")

n = int(f.readline())

for o in range(n):
    row = [int(x) for x in f.readline().split()]
    q.append(row[0])
    w.append(row[1])
    e.append(row[2])
    r.append(row[3])

f.close()

for x in q:
    for y in w:
        if h.has_key(x+y):
            h[x+y] += 1
        else:
            h[x+y] = 1

for x in e:
    for y in r:
        sch += h.get(-(x+y), 0)

print sch
print time.clock() - t


Thanks,
Mark Dufour (Shed Skin author - send me bug reports!)
 
bearophileHUGS

Mark Dufour:
FWIW, the original program can also be compiled with Shed Skin (http://
mark.dufour.googlepages.com), an experimental (static-)Python-to-C++
compiler, resulting in a speedup of about 8 times for a single test
with 500 tuples.

If we want to play, then this is a literal translation to D (I am not much good in D yet, so maybe there are ways to improve this code a bit). As you can see, D syntax isn't that bad (I think D sometimes copies some Python syntax):


// Compile with: dmd -O -release solver.d
import std.stdio, std.stream, std.string;

void main() {
    size_t sch;
    size_t[size_t] h;
    size_t[] q, w, e, r;

    int nrow = -1;
    auto datafile = new File("m4000.txt", FileMode.In);
    foreach (char[] row; datafile) {
        if (nrow == -1) {
            q.length = row.atoi();
            w.length = row.atoi();
            e.length = row.atoi();
            r.length = row.atoi();
        } else {
            char[][] srow = row.split();
            q[nrow] = srow[0].atoi();
            w[nrow] = srow[1].atoi();
            e[nrow] = srow[2].atoi();
            r[nrow] = srow[3].atoi();
        }
        nrow++;
    }

    foreach (x; q)
        foreach (y; w)
            h[x+y]++;

    foreach (x; e)
        foreach (y; r) {
            size_t* pointer = -x-y in h;
            if (pointer)
                sch += *pointer;
            // simpler but slower:
            // if (-x-y in h)
            //     sch += h[-x-y];
        }

    writefln(sch);
}


On my PC, with the 1000-line file, this is about 2.2 times faster than my Psyco version and about 4.6 times faster than the same code without Psyco.

Bye,
bearophile
 
n00m

bearophileH!

I gave your "oiled" version several runs ("z in dict" instead of "dict.has_key" saved only ~0.4 sec).
The results are (and I'm completely lost in all these *optimizations* :)):

0
34.78 secs (bearophileH)
0
34.77 secs (bearophileH)
0
34.76 secs (bearophileH)
0
27.2460938471 secs (n00m)
0
27.2953596058 secs (n00m)
0
27.3709929614 secs (n00m)


In the name of exactness, these are the tested codes:

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
from collections import defaultdict
import time, psyco

def main():
    sch = 0
    q, w, e, r = [], [], [], []
    h = defaultdict(int)

    datafile = file("D:/m4000.txt")
    datafile.next()
    xrows = (map(int, line.split()) for line in datafile)
    q, w, e, r = zip(*xrows)

    for x in q:
        for y in w:
            h[x+y] += 1

    for x in e:
        for y in r:
            if -x-y in h:
                sch += h[-x-y]

    print sch

t = time.clock()
psyco.full()
main()
print round(time.clock() - t, 2), "secs (bearophileH)"
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
import psyco, time

def main():
    q, w, e, r, sch, h = [], [], [], [], 0, {}
    f = open("D:/m4000.txt", "rt")
    for o in range(int(f.readline())):
        row = map(int, f.readline().split())
        q.append(row[0])
        w.append(row[1])
        e.append(row[2])
        r.append(row[3])
    f.close()

    for x in q:
        for y in w:
            if (x+y) in h:
                h[x+y] += 1
            else:
                h[x+y] = 1

    for x in e:
        for y in r:
            if -(x+y) in h:
                sch += h[-(x+y)]

    q, w, e, r, h = None, None, None, None, None
    print sch

t = time.clock()
psyco.full()
main()
print time.clock() - t, "secs (n00m)"
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

PS: Both Python and Psyco are version 2.5.
 
n00m

================================ RESTART ===
0
30.4740708665 secs (Anton Vredegoor)
0
30.4132625795 secs (Anton Vredegoor)
0
30.4812175849 secs (Anton Vredegoor)

+++++++++++++++++++++++++++++++++++++++++++++++++++++++
import psyco, time

def freq(L):
    D = {}
    for x in L:
        D[x] = D.get(x, 0) + 1
    return D

def test():
    f = file('D:/m4000.txt')
    f.readline()
    L = []
    for line in f:
        L.append(map(int, line.split()))

    q, w, e, r = map(freq, zip(*L))
    sch, h = 0, {}
    for xk, xv in q.iteritems():
        for yk, yv in w.iteritems():
            if h.has_key(xk+yk):
                h[xk+yk] += xv*yv
            else:
                h[xk+yk] = xv*yv

    for xk, xv in e.iteritems():
        for yk, yv in r.iteritems():
            if h.has_key(-(xk+yk)):
                sch += h[-(xk+yk)]*xv*yv

    print sch

t = time.clock()
psyco.full()
test()
print time.clock() - t, "secs (Anton Vredegoor)"
+++++++++++++++++++++++++++++++++++++++++++++++++++++++

PS: The same PC: AMD Sempron 2600+ (1.6 GHz), 512 MB RAM.
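
For readers wondering why Anton's version holds up so well: it first collapses each column into a value-to-count map, so the double loops run over distinct values only, and repeated inputs contribute through the xv*yv products instead of extra iterations. A toy sketch of the same identity (data and names made up, not from the thread):

from collections import defaultdict

def freq(L):
    D = defaultdict(int)
    for x in L:
        D[x] += 1
    return D

# small columns with repeated values
q, w, e, r = [1, 1, 2], [0, 0], [-1, 3], [0, -1]

h = defaultdict(int)
for xk, xv in freq(q).items():
    for yk, yv in freq(w).items():
        h[xk + yk] += xv * yv               # xv*yv pairs share this sum

sch = 0
for xk, xv in freq(e).items():
    for yk, yv in freq(r).items():
        sch += h.get(-(xk + yk), 0) * xv * yv

print sch    # 6, the same count the brute-force double loops would give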
 
mark.dufour

FWIW, the original program can also be compiled with Shed Skin (http://mark.dufour.googlepages.com), an experimental (static-)Python-to-C++ compiler, resulting in a speedup of about 8 times for a single test with 500 tuples. Here's a slightly modified version that works with Shed Skin CVS at least:

After optimizing dicts a bit for Shedskin 0.0.21 (http://mark.dufour.googlepages.com), the speedup for this program is now about 16.5 times for the same test.


Thanks,
Mark Dufour (Shed Skin author - send me bug reports!)
 
