how to - quickly make permutations?

M

Max Williams

can anyone provide an elegant implementation for this method?

#gives all distinct combinations of numbers up to n, with maximum size
max_size
def permutations(n,max_size)

so, eg,

permutations(4,2)
=> [[1],[2],[3],[4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]]

permutations(4,3)
=> above + [[1,2,3],[1,2,4],[2,3,4]]

i'm guessing something recursive is the key but i can't quite work out
the best way.

thanks
max
 
A

Axel Etzold

-------- Original-Nachricht --------
Datum: Tue, 1 Jul 2008 02:58:51 +0900
Von: Max Williams <[email protected]>
An: (e-mail address removed)
Betreff: how to - quickly make permutations?
can anyone provide an elegant implementation for this method?

#gives all distinct combinations of numbers up to n, with maximum size
max_size
def permutations(n,max_size)

so, eg,

permutations(4,2)
=> [[1],[2],[3],[4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]]

permutations(4,3)
=> above + [[1,2,3],[1,2,4],[2,3,4]]

i'm guessing something recursive is the key but i can't quite work out
the best way.

thanks
max

Max,

what you are asking for in your examples is not a permutation, but rather the collection of
subsets of up to k elements.
Naming the elements as 0,...,n-1 , consider this example code

n=5
numbers=(0...2**n)
set_size=3
sets=[]
for number in numbers
res =("%b" % number).split(//) # get a binary representation of number, as an Array of '0' and '1'
res=[0]*(n-res.length)+res # fill '0' in at the beginning, as many as necessary
el_number=res.dup.grep(/1/).length # search for those elements that have set_size times '1' in them
temp=[]
if el_number<=set_size
res.each_with_index{|x,i|
if x=='1'; temp<<i; end
}
sets<<temp
end
end
p sets

Somebody else might make this more concise, but as it is after 11pm here, not me..

Best regards,

Axel
 
F

Frederick Cheung

can anyone provide an elegant implementation for this method?

Quick bash at it:

def subsets(n, max_k)
results = []
1.upto(max_k) {|k| results << permutations(n,k, results.last)}
results
end

def permutations(n,k,previous_iteration=nil)
return (1..n).collect {|x| [x]} if k == 1
previous_iteration ||= permutations(n,k-1)
(1..n).inject([]) do |result, to_add|
result.concat( previous_iteration.inject([]) do |memo, perm|
memo << (perm + [to_add]).sort unless perm.include?(to_add)
memo
end)
end.uniq
end

This is recursive with a shortcut: since we are anyway accumulating
the previous results, there is no point calculating them over and over
again (if not p(n,1) is calculated max_k times, p(n,2) is calculated
max_k - 1 times etc...


Fred
#gives all distinct combinations of numbers up to n, with maximum size
max_size
def permutations(n,max_size)

so, eg,

permutations(4,2)
=> [[1],[2],[3],[4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]]

permutations(4,3)
=> above + [[1,2,3],[1,2,4],[2,3,4]]

i'm guessing something recursive is the key but i can't quite work out
the best way.

thanks
max
 
J

jim finucane

[Note: parts of this message were removed to make it a legal post.]

each new element tries to double the size of the list

def permutations(n,max_size)
result =[[]]
Array.new(n).fill{|k|k+1}.each{|i|result +=result.collect{|x|
x.length<max_size ? x += : [] } }
return result.uniq - [[]]
end

or

def permutations (n,max_size)
result = [[]]
for i in 1..n
result +=result.collect{|x| x.length<max_size ? x += : [] }
end
return result.uniq - [[]]
end



On Mon, Jun 30, 2008 at 6:13 PM, Frederick Cheung <
On 30 Jun 2008, at 18:58, Max Williams wrote:

can anyone provide an elegant implementation for this method?Quick bash at it:

def subsets(n, max_k)
results = []
1.upto(max_k) {|k| results << permutations(n,k, results.last)}
results
end

def permutations(n,k,previous_iteration=nil)
return (1..n).collect {|x| [x]} if k == 1
previous_iteration ||= permutations(n,k-1)
(1..n).inject([]) do |result, to_add|
result.concat( previous_iteration.inject([]) do |memo, perm|
memo << (perm + [to_add]).sort unless perm.include?(to_add)
memo
end)
end.uniq
end

This is recursive with a shortcut: since we are anyway accumulating the
previous results, there is no point calculating them over and over again (if
not p(n,1) is calculated max_k times, p(n,2) is calculated max_k - 1 times
etc...


Fred


#gives all distinct combinations of numbers up to n, with maximum size
max_size
def permutations(n,max_size)

so, eg,

permutations(4,2)
=> [[1],[2],[3],[4],[1,2],[1,3],[1,4],[2,3],[2,4],[3,4]]

permutations(4,3)
=> above + [[1,2,3],[1,2,4],[2,3,4]]

i'm guessing something recursive is the key but i can't quite work out
the best way.

thanks
max
 
M

Max Williams

Frederick said:
This is recursive with a shortcut: since we are anyway accumulating
the previous results, there is no point calculating them over and over
again (if not p(n,1) is calculated max_k times, p(n,2) is calculated
max_k - 1 times etc...


Fred

Thanks everyone - Fred, this is perfect, having the different sized
groups seperated into their own arrays is actually even better for me
than what i specified. Thanks.

Axel - you're right, i did mean subsets. oops.


thanks again,
max
 
M

Max Williams

jim said:
def permutations (n,max_size)
result = [[]]
for i in 1..n
result +=result.collect{|x| x.length<max_size ? x += : [] }
end
return result.uniq - [[]]
end


Jim, this wins the elegance prize i think! It's almost magical...i
can't quite get my head round it. :)
 
R

Robert Dober

jim finucane wrote:
Jim, this wins the elegance prize i think! It's almost magical...i
can't quite get my head round it. :)
It runs into biiig performance issues for n >> size, and worse it does
not use inject ;)
look at these benchmarks
==================================
require 'benchmark'
def robert n, size
(1..n).inject([[]]){
|perms, ele|
perms.concat( perms.map{|perm| perm+[ele]} ).
uniq.delete_if{|p| p.size > size}
}
end

def jim n, size
result = [[]]
for i in 1..n
result +=result.collect{|x| x.length<size ? x += : [] }
end
result.uniq - [[]]
end

N,S,R = ARGV.map{|x| x.to_i}
Benchmark::bmbm do | report |
report.report("robert") do
R.times do
robert N, S
end
end

report.report("jim") do
R.times do
jim N, S
end
end
end
=============================

528/28 > ruby perms.rb 4 2 10000
Rehearsal ------------------------------------------
robert 1.328000 0.032000 1.360000 ( 1.359000)
jim 1.000000 0.015000 1.015000 ( 1.016000)
--------------------------------- total: 2.375000sec

user system total real
robert 1.344000 0.032000 1.376000 ( 1.375000)
jim 1.016000 0.031000 1.047000 ( 1.047000)
=======================================
529/29 > ruby perms.rb 10 2 100
Rehearsal ------------------------------------------
robert 0.141000 0.000000 0.141000 ( 0.141000)
jim 0.484000 0.016000 0.500000 ( 0.500000)
--------------------------------- total: 0.641000sec

user system total real
robert 0.141000 0.000000 0.141000 ( 0.141000)
jim 0.453000 0.000000 0.453000 ( 0.453000)
========================================

530/30 > ruby perms.rb 20 2 10
Rehearsal ------------------------------------------
robert 0.125000 0.000000 0.125000 ( 0.125000)
jim 53.922000 0.656000 54.578000 ( 54.610000)
-------------------------------- total: 54.703000sec

user system total real
robert 0.140000 0.000000 0.140000 ( 0.141000)
jim 57.750000 0.531000 58.281000 ( 58.297000)
========================================

of course if n and size are large there is not much that can be done
on the exponential runtime
as O(n!) is just O(e**n) :(

531/31 > ruby perms.rb 20 15 1
Rehearsal ------------------------------------------
robert 95.469000 0.312000 95.781000 ( 95.890000)
jim 95.703000 0.266000 95.969000 ( 96.219000)
------------------------------- total: 191.750000sec

user system total real
robert 95.422000 0.281000 95.703000 ( 96.141000)
jim 91.843000 0.172000 92.015000 ( 92.141000)

HTH
Robert
 
M

Max Williams

Robert said:
It runs into biiig performance issues for n >> size, and worse it does
not use inject ;)

That's good to know actually, my real numbers are likely to be n =
50ish, max_size = 10ish. Right in the pain spot. So maybe elegant
wasn't quite the right word...elegant in design but not operation?
 
A

Axel Etzold

-------- Original-Nachricht --------
Datum: Tue, 1 Jul 2008 22:50:05 +0900
Von: Max Williams <[email protected]>
An: (e-mail address removed)
Betreff: Re: how to - quickly make permutations?
Max,


That's good to know actually, my real numbers are likely to be n =
50ish, max_size = 10ish. Right in the pain spot. So maybe elegant
wasn't quite the right word...elegant in design but not operation?

what are you trying to do ? The number of subsets is enormous... maybe there's
an easier way to achieve your goal.

Best regards,

Axel
 
R

Robert Dober

That's good to know actually, my real numbers are likely to be n =
50ish, max_size = 10ish. Right in the pain spot. So maybe elegant
Oh that will be tough in Ruby, if I have some time I will try to
optimize this, inject is known to be slow, but one cannot expect a
speedup more than factor 2 or 3 by using each instead. Thus looking at
the performance right now,
I do not have much hope :(

I stopped the program after 15 minutes :(.


But there was a binding to a c framework for doing these things fast,
does anybody remember?
Cheers
Robert
 
M

Max Williams

Axel said:
what are you trying to do ? The number of subsets is enormous... maybe
there's
an easier way to achieve your goal.

Best regards,

Axel

Maybe there is...it's a little program to help my wife out with a bit of
data processing. I have a bunch of columns (50ish) that each list a
group of people. The people overlap between columns and we want to look
at what happens to the total degree of overlap when groups of columns
are removed. So, i was going to try removing each column and calculate
the total overlap for each case. Then, remove every possible pair of
columns, recalculate for each case. Then, every possible trio of
columns, etc. And keep going until the number of subsets to test starts
to get impractical. Which might happen quite quickly as you suggest.
 
A

ara.t.howard

Maybe there is...it's a little program to help my wife out with a
bit of
data processing. I have a bunch of columns (50ish) that each list a
group of people. The people overlap between columns and we want to
look
at what happens to the total degree of overlap when groups of columns
are removed. So, i was going to try removing each column and
calculate
the total overlap for each case. Then, remove every possible pair of
columns, recalculate for each case. Then, every possible trio of
columns, etc. And keep going until the number of subsets to test
starts
to get impractical. Which might happen quite quickly as you suggest.
--


gem install permutation


it's *super* fast.

a @ http://codeforpeople.com/
 
A

Axel Etzold

-------- Original-Nachricht --------
Datum: Tue, 1 Jul 2008 23:15:21 +0900
Von: Max Williams <[email protected]>
An: (e-mail address removed)
Betreff: Re: how to - quickly make permutations?
Max,





Maybe there is...it's a little program to help my wife out with a bit of
data processing. I have a bunch of columns (50ish) that each list a
group of people. The people overlap between columns and we want to look
at what happens to the total degree of overlap when groups of columns
are removed. So, i was going to try removing each column and calculate
the total overlap for each case. Then, remove every possible pair of
columns, recalculate for each case. Then, every possible trio of
columns, etc. And keep going until the number of subsets to test starts
to get impractical. Which might happen quite quickly as you suggest.

an alternative approach is simulated annealing ... it makes use of "jumping" in a
space of parameters, where the jumps are initially quite big, but get smaller as your solution gets better.
In this way, in nature, liquids that get cooled to crystals achieve very low energy configurations if cooled
slowly...
There's an implementation for Ruby using GSL

http://rb-gsl.rubyforge.org/siman.html

In your case, you could use a vector of 50 entries for the columns, each entry determining whether that column
is in your subset at hand or not.
In addition to that, you need a function that measures the quality of your solution, i.e. calculates the overlap.
As the value of the vector entries is changed by the Iterator function implemented in Ruby-gsl when it jumps in the
configuration space, you could say, "column i is in my subset, if the entry is positive, and otherwise not",
then calculate the overlap resulting from that collection of subsets using the Set class,say, and then decide whether a lot of overlap is good or bad for your solution.

The number of solutions to check is certainly far smaller than iterating through all subsets...

Best regards,

Axel
 
M

Max Williams

ara.t.howard said:
gem install permutation


it's *super* fast.

thanks - i looked at that but couldn't see how to only get distinct
subsets, ie to not get [1,2,3], [1,3,2], [3,1,2] etc in the same result
set. I was probably being a bit dumb though...i had just got back from
glastonbury when i was having a go :)
 
M

Max Williams

Axel said:
an alternative approach is simulated annealing ... it makes use of
"jumping" in a
space of parameters, where the jumps are initially quite big, but get
smaller as your solution gets better. ..
There's an implementation for Ruby using GSL

http://rb-gsl.rubyforge.org/siman.html

Thanks axel - i was thinking that some sort of 'homing in' approach
might be better but couldn't work out how to go about it. I'll have a
look at this.
 
K

krusty.ar

Thanks axel - i was thinking that some sort of 'homing in' approach
might be better but couldn't work out how to go about it.  I'll have a
look at this.

If you find it hard to define your "steps", you should try a genetic
algorithm aproach, using an array of 50 elements (representing if each
column is on the solution or not), and a function that measures the
fitness (how good is that combination).

Initially you create 100 random boolean arrays, select the best 20,
mix them in some way, for instance you could use the first half of one
array and the last half of another, or swaping a number of pairs of
columns in a single array (mutation), to create the remaining 80, and
repeat the process until your best candidate is good enough.

Of course the numbers can be tuned and will affect the efficience of
the program.

Lucas.
 
A

Adam Shelly

Oh that will be tough in Ruby, if I have some time I will try to
optimize this,
I stopped the program after 15 minutes :(.

Here's an alternate implementation using the "bankers order" algorithm from
http://applied-math.org/subset.pdf, and some timings using Robert's
program - for small samples they are similar, but this algorithm
scales better. (Unfortunately, it still brings my machine to its
knees on 50/10)
==========================
def adam( n, size)
r=[]
a = Array.new(n+1){0}
size.times{|sz|
while (a[sz+1]<1)
r<< (a-[0]) #.sort
a[i=0]+=1
a[i+=1]+=1 while (a > n-i )
(a[i-1]=a+1; i-=1)while (i>0)
end
}
r
end
==========================

C:\code\quiz>subsetsRubyTalk.rb 10 5 100
Rehearsal ------------------------------------------
robert 1.563000 0.000000 1.563000 ( 1.641000)
adam 1.516000 0.047000 1.563000 ( 1.562000)
--------------------------------- total: 3.126000sec

user system total real
robert 1.562000 0.016000 1.578000 ( 1.578000)
adam 1.515000 0.015000 1.530000 ( 1.531000)

C:\code\quiz>subsetsRubyTalk.rb 20 5 10
Rehearsal ------------------------------------------
robert 49.984000 0.109000 50.093000 ( 50.858000)
adam 7.282000 0.093000 7.375000 ( 7.375000)
------------------------------- total: 232.078000sec

user system total real
robert 46.344000 0.156000 46.500000 ( 46.608000)
adam 3.234000 0.047000 3.281000 ( 3.281000)


-Adam
 
J

jim finucane

[Note: parts of this message were removed to make it a legal post.]

1. I believe 50/10 is over ten billion long.
2. robert and all: thanks

3. A tolerably inefficient approach is to just take the column with the
most remaining folks each time
until you have covered everyone:


uncovered_list =[everyone]
while ( uncovered-list.length > 0 )
a= the column containing the longest list of folks not yet
covered;
uncovered_list -= a
columns.map!{|e| e-a}
end


Oh that will be tough in Ruby, if I have some time I will try to
optimize this,
I stopped the program after 15 minutes :(.

Here's an alternate implementation using the "bankers order" algorithm from
http://applied-math.org/subset.pdf, and some timings using Robert's
program - for small samples they are similar, but this algorithm
scales better. (Unfortunately, it still brings my machine to its
knees on 50/10)
==========================
def adam( n, size)
r=[]
a = Array.new(n+1){0}
size.times{|sz|
while (a[sz+1]<1)
r<< (a-[0]) #.sort
a[i=0]+=1
a[i+=1]+=1 while (a > n-i )
(a[i-1]=a+1; i-=1)while (i>0)
end
}
r
end
==========================

C:\code\quiz>subsetsRubyTalk.rb 10 5 100
Rehearsal ------------------------------------------
robert 1.563000 0.000000 1.563000 ( 1.641000)
adam 1.516000 0.047000 1.563000 ( 1.562000)
--------------------------------- total: 3.126000sec

user system total real
robert 1.562000 0.016000 1.578000 ( 1.578000)
adam 1.515000 0.015000 1.530000 ( 1.531000)

C:\code\quiz>subsetsRubyTalk.rb 20 5 10
Rehearsal ------------------------------------------
robert 49.984000 0.109000 50.093000 ( 50.858000)
adam 7.282000 0.093000 7.375000 ( 7.375000)
------------------------------- total: 232.078000sec

user system total real
robert 46.344000 0.156000 46.500000 ( 46.608000)
adam 3.234000 0.047000 3.281000 ( 3.281000)


-Adam
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,772
Messages
2,569,591
Members
45,103
Latest member
VinaykumarnNevatia
Top