Jack said:
Jack said:
On Wed, Feb 08, 2006 at 12:50:19PM -0800, Swroteb wrote:
Hi there,
I've got a reasonably sized list of objects that I'd like to pull out
all combinations of five elements from. Right now I have a way to do
this that's quite slow, but manageable. I know there must be a better
way to do this, but I'm not sure what it is. Here's what I've got so
far:
import probstat #
http://probstat.sourceforge.net
for (a, b, c) in probstat.Combination(range(9), 3): # 0..9, pick 3
print a, b, c
It is a C extension that does permutations & combinations and is
about 10x faster than doing it in pure python [I'm the author].
It is also the 3rd result for "python combination" and 5th for
"python permutaiton" but every month someone posts to c.l.py asking
for something similar. Go figure.
Did you ever figure that some people use Windows?
I don't have a windows dev box so I can't vouch for this binary but
a user sent me a windows .pyd yesterday.
http://jackdied.com/static/probstat.pyd
-jackdied
Hey, thanks!
I have a pure Python permutation generator that generates
a Cartesian product of a string with itself with filters
to emulate
permutation w/ replacemment
combination w/ replacemment
permutation w/o replacemment
combination w/o replacemment
so I put together a little test program to see how probstat
compares. Correspondence to my emulations is (as far as I can tell)
permutation w/ replacemment: equivalent to probstat.Cartesian
combination w/ replacemment: no equivalent
permutation w/o replacemment: equivalent to probstat.Permutation
combination w/o replacemment: equivalent to probstat.Combination
Here's the test program:
-----------------------------------------------------
import probstat
import time
def cxxx(m):
return '(' + ' and '.join(['(i%s!=i%s)' % (m,i) for i in range(m)])
+ ')'
def ooloop5(a, n, perm=True, repl=True):
if (not repl) and (n>len(a)): return
p = []
loop = '\n'.join([(' ' * i) + 'for i%s in a:' % i for i in range(n)])
+ '\n'
indt = ' ' * n
sub1 = indt + 'if perm and repl:\n'
sub2 = indt + 'if (not perm) and repl:\n'
ccc2 = ' and '.join(['(i%s>=i%s)' % (i,i-1) for i in range(1,n)])
con2 = indt + ' if ' + ccc2 + ':\n'
sub3 = indt + 'if perm and (not repl):\n'
cccx = ' and '.join([cxxx(m) for m in range(1,n)])
con3 = indt + ' if ' + cccx + ':\n'
sub4 = indt + 'if (not perm) and (not repl):\n'
ccc4 = ' and '.join(['(i%s>i%s)' % (i,i-1) for i in range(1,n)])
con4 = indt + ' if ' + ccc4 + ':\n'
bod1 = indt + ' s = ' + '+'.join(['i%s' % i for i in range(n)]) + '\n'
bod2 = indt + ' p.append(s)\n'
bod3 = indt + ' s = ' + '+'.join(['i%s' % i for i in range(n)]) +
'\n'
bod4 = indt + ' p.append(s)\n'
e = loop + sub1 + bod1 + bod2 + sub2 + con2 + bod3 + bod4 + sub3 +
con3 + bod3 + bod4 + sub4 + con4 + bod3 + bod4
exec e
return p
def p_cart(a,n):
A = list(a)
c = probstat.Cartesian([A for i in range(n)])
p = []
for i in c:
p.append(''.join(i))
return p
def p_perm(a,n):
A = list(a)
t0 = time.time()
if len(A)<n:
c = probstat.Permutation(A,n)
else:
c = probstat.Permutation(A)
print time.time()-t0
p = []
for i in c:
p.append(''.join(i))
return p
def p_comb(a,n):
A = list(a)
c = probstat.Combination(A,n)
p = []
for i in c:
p.append(''.join(i))
return p
print 'permutation w/ replacemment'
p = ooloop5("abc", 3, True, True)
print p
print
print 'combination w/ replacemment'
p = ooloop5("abc", 3, False, True)
print p
print
print 'permutation w/o replacemment'
p = ooloop5("abc", 3, True, False)
print p
print
print 'combination w/o replacemment'
p = ooloop5("abc", 3, False, False)
print p
print
print 'probstat.Cartesian'
p = p_cart('abc',3)
print p
print
print 'probstat.Permutation'
p = p_perm('abc',3)
print p
print
print 'probstat.Combination'
p = p_comb('abc',3)
print p
print
print 'permutation w/ replacemment "abcdefghijklmnopqrstuvwxyz":4'
t0 = time.time()
p = ooloop5("abcdefghijklmnopqrstuvwxyz", 4, True, True)
t1 = time.time()
print len(p),'4-letter words',t1-t0,'seconds'
print 'combination w/ replacemment "abcdefghijklmnopqrstuvwxyz":4'
t0 = time.time()
p = ooloop5("abcdefghijklmnopqrstuvwxyz", 4, False, True)
t1 = time.time()
print len(p),'4-letter words',t1-t0,'seconds'
print 'permutation w/o replacemment "abcdefghijklmnopqrstuvwxyz":4'
t0 = time.time()
p = ooloop5("abcdefghijklmnopqrstuvwxyz", 4, True, False)
t1 = time.time()
print len(p),'4-letter words',t1-t0,'seconds'
print 'combination w/o replacemment "abcdefghijklmnopqrstuvwxyz":4'
t0 = time.time()
p = ooloop5("abcdefghijklmnopqrstuvwxyz", 4, False, False)
t1 = time.time()
print len(p),'4-letter words',t1-t0,'seconds'
print
print 'probstat.Cartesian "abcdefghijklmnopqrstuvwxyz":4'
t0 = time.time()
p = p_cart('abcdefghijklmnopqrstuvwxyz',4)
t1 = time.time()
print len(p),'4-letter words',t1-t0,'seconds'
print 'probstat.Permutation "abcdefghijklmnopqrstuvwxyz":4'
t0 = time.time()
p = p_perm('abcdefghijklmnopqrstuvwxyz',4)
t1 = time.time()
print len(p),'4-letter words',t1-t0,'seconds'
print 'probstat.Combination "abcdefghijklmnopqrstuvwxyz":4'
t0 = time.time()
p = p_comb('abcdefghijklmnopqrstuvwxyz',4)
t1 = time.time()
print len(p),'4-letter words',t1-t0,'seconds'
-----------------------------------------------------
The short tests worked out fine
permutation w/ replacemment
['aaa', 'aab', 'aac', 'aba', 'abb', 'abc', 'aca', 'acb', 'acc', 'baa',
'bab', 'bac', 'bba', 'bbb', 'bbc', 'bca', 'bcb', 'bcc', 'caa', 'cab',
'cac', 'cba', 'cbb', 'cbc', 'cca', 'ccb', 'ccc']
combination w/ replacemment
['aaa', 'aab', 'aac', 'abb', 'abc', 'acc', 'bbb', 'bbc', 'bcc', 'ccc']
permutation w/o replacemment
['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
combination w/o replacemment
['abc']
probstat.Cartesian
['aaa', 'baa', 'caa', 'aba', 'bba', 'cba', 'aca', 'bca', 'cca', 'aab',
'bab', 'cab', 'abb', 'bbb', 'cbb', 'acb', 'bcb', 'ccb', 'aac', 'bac',
'cac', 'abc', 'bbc', 'cbc', 'acc', 'bcc', 'ccc']
probstat.Permutation
['abc', 'acb', 'bac', 'bca', 'cab', 'cba']
probstat.Combination
['abc']
-----------------------------------------------------
Unfortunately, the long test died (out of virtual memory) executing
the probstat.Permution test.
But here's the strange thing. When executed from the Idle prompt,
it works fine.
import probstat
A = list('abcdefghijklmnopqrstuvwxyz')
c = probstat.Permutation(A,4)
print len(c) 358800
for i in c[:10]
rint i
['a', 'b', 'c', 'd']
['a', 'b', 'd', 'c']
['a', 'c', 'b', 'd']
['a', 'c', 'd', 'b']
['a', 'd', 'b', 'c']
['a', 'd', 'c', 'b']
['b', 'a', 'c', 'd']
['b', 'a', 'd', 'c']
['b', 'c', 'a', 'd']
['b', 'c', 'd', 'a']
p = []
for i in c: p.append(''.join(i))
for i in p[:10]
rint i
abcd
abdc
acbd
acdb
adbc
adcb
bacd
badc
bcad
bcda
-----------------------------------------------------
And if I comment out the Permutation test, everything else
works ok.
permutation w/ replacemment "abcdefghijklmnopqrstuvwxyz":4
456976 4-letter words 1.23500013351 seconds
combination w/ replacemment "abcdefghijklmnopqrstuvwxyz":4
23751 4-letter words 0.68799996376 seconds
permutation w/o replacemment "abcdefghijklmnopqrstuvwxyz":4
358800 4-letter words 1.67199993134 seconds
combination w/o replacemment "abcdefghijklmnopqrstuvwxyz":4
14950 4-letter words 0.59299993515 seconds
probstat.Cartesian "abcdefghijklmnopqrstuvwxyz":4
456976 4-letter words 1.42200016975 seconds
probstat.Combination "abcdefghijklmnopqrstuvwxyz":4
14950 4-letter words 0.0780000686646 seconds
-----------------------------------------------------
But it fails if I call it from inside a function.
A = list(a)
if len(A)<n:
c = probstat.Permutation(A,n)
else:
c = probstat.Permutation(A)
p = []
for i in c:
p.append(''.join(i))
return p
Crash! Out of virtual memory.
-----------------------------------------------------
I noticed that the probstat call takes virtually no time, most of it
taken by the p.append loop.
A = list(a)
if len(A)<n:
c = probstat.Permutation(A,n)
else:
c = probstat.Permutation(A)
print 'probstat finished'
p = []
for i in c:
p.append(''.join(i))
return p
probstat finished
Aha! Must be dying during the p.append loop.
-----------------------------------------------------
So we'll skip the append for now.
def p_perm(a,n):
A = list(a)
t0 = time.time()
if len(A)<n:
q = probstat.Permutation(A,n)
else:
q = probstat.Permutation(A)
print time.time()-t0
#p = []
#for i in c:
# p.append(''.join(i))
#return p
print len(q)
for i in q[:10]: print i
return q
probstat.Permutation "abcdefghijklmnopqrstuvwxyz":4
0.0
-1853882368
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'y', 'z']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'x', 'z', 'y']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'y', 'x', 'z']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'y', 'z', 'x']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'z', 'x', 'y']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'w', 'z', 'y', 'x']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'x', 'w', 'y', 'z']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'x', 'w', 'z', 'y']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'x', 'y', 'w', 'z']
['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j', 'k', 'l', 'm', 'n',
'o', 'p', 'q', 'r', 's', 't', 'u', 'v', 'x', 'y', 'z', 'w']
-1853882368 4-letter words 0.25 seconds
HUH??? No wonder the append loop has convulsions!
and why are the lists length 26? s/b len 4.
And yet...
q = probstat.Permutation(list('abcdefghijklmnopqrstuvwxyz'),4)
print len(q) 358800
for i in q[:10]
rint i
['a', 'b', 'c', 'd']
['a', 'b', 'd', 'c']
['a', 'c', 'b', 'd']
['a', 'c', 'd', 'b']
['a', 'd', 'b', 'c']
['a', 'd', 'c', 'b']
['b', 'a', 'c', 'd']
['b', 'a', 'd', 'c']
['b', 'c', 'a', 'd']
['b', 'c', 'd', 'a']
-----------------------------------------------------
That's one of the strangest problems I've ever seen.
I don't know if this has anything to do with the pyd file,
but I figured you would like to know.