A
Andrew
Hi, folks
a while ago i obtained a script that computes all unique subsets of a
set of N elements, which I am trying to apply to a new task.
The script works fine when N<14, but past that value my server gives up
with an "Out of memory!" message over CGI (server has 2GB RAM).
This is a one-time computation I need to do (not a regularly accessed
CGI script), so I can tolerate long waits, although anything beyond 30
minutes is worrisome (I may need to do last-minute update before a
deadline). A little about the application: I need to find out the most
balanced distribution (split) of a group of clusters, into two groups
of clusters (clusters themselves cannot be broken up). Clusters have
varying sizes, and it is the "size" quantity which I aggregate over the
two groups. The closer the two aggregate size quantities, the more
balanced the distribution. (There is a second-level sort to "break a
tie" but that's not essential to this post)
I realize that the underlying algorithm here is not Perl-specific, but
I am expressing it in Perl, and the "out of memory" phenomenon probably
has something to do with the Perl's memory management and/or with the
options of memory management available to script authors.
My current need is to compute combinations from a set of 32 elements.
At least process 32C15, 32C16, or thereabouts. Sadly, 32C16=601080390
.. The script below gives up at 32C6 (after processing only 242824
combinations). Is there nothing to be done besides seeking better
hardware? I'm afraid, though, that, given the exponential escalation of
memory usage, even some supercomputing, processor-clustering behemoth
at MIT couldn't handle this...(although that sounds unlikely).
Is there any way to optimize the code below, (clearing memory at
runtime, etc.)? Some completely different code? Could I discard
unwanted combinations (combinations with blatantly below-expected
balance) even while the recursion is in progress? (Can't quite figure
out how).
Here is the code:
sub combos {
my( $levels_left, $selection, @items ) = @_;
return () if $levels_left > @items;
return $selection unless $levels_left;
my @sets;
while( @items ) {
my $new_selection = $selection . ':'. shift @items;
push @sets,
combos( $levels_left-1, $new_selection, @items );
}
return @sets;
}
@combinations=combos( 16, '', (0..31));
-----------
returns an array of strings, each string containing colon-joined
members of array (0..31), a unique combination.
TIA
andrew
a while ago i obtained a script that computes all unique subsets of a
set of N elements, which I am trying to apply to a new task.
The script works fine when N<14, but past that value my server gives up
with an "Out of memory!" message over CGI (server has 2GB RAM).
This is a one-time computation I need to do (not a regularly accessed
CGI script), so I can tolerate long waits, although anything beyond 30
minutes is worrisome (I may need to do last-minute update before a
deadline). A little about the application: I need to find out the most
balanced distribution (split) of a group of clusters, into two groups
of clusters (clusters themselves cannot be broken up). Clusters have
varying sizes, and it is the "size" quantity which I aggregate over the
two groups. The closer the two aggregate size quantities, the more
balanced the distribution. (There is a second-level sort to "break a
tie" but that's not essential to this post)
I realize that the underlying algorithm here is not Perl-specific, but
I am expressing it in Perl, and the "out of memory" phenomenon probably
has something to do with the Perl's memory management and/or with the
options of memory management available to script authors.
My current need is to compute combinations from a set of 32 elements.
At least process 32C15, 32C16, or thereabouts. Sadly, 32C16=601080390
.. The script below gives up at 32C6 (after processing only 242824
combinations). Is there nothing to be done besides seeking better
hardware? I'm afraid, though, that, given the exponential escalation of
memory usage, even some supercomputing, processor-clustering behemoth
at MIT couldn't handle this...(although that sounds unlikely).
Is there any way to optimize the code below, (clearing memory at
runtime, etc.)? Some completely different code? Could I discard
unwanted combinations (combinations with blatantly below-expected
balance) even while the recursion is in progress? (Can't quite figure
out how).
Here is the code:
sub combos {
my( $levels_left, $selection, @items ) = @_;
return () if $levels_left > @items;
return $selection unless $levels_left;
my @sets;
while( @items ) {
my $new_selection = $selection . ':'. shift @items;
push @sets,
combos( $levels_left-1, $new_selection, @items );
}
return @sets;
}
@combinations=combos( 16, '', (0..31));
-----------
returns an array of strings, each string containing colon-joined
members of array (0..31), a unique combination.
TIA
andrew