Also sprach A. Sinan Unur:
(e-mail address removed)-state.edu (Michael T. Davis) wrote in
Say I want to match "gremlin" or the letters that compose the
word "gremlin", but in any order. Note that once "g" is consumed,
the set of available letters no longer includes "g". (Also, "g" isn't
necessarily going to be the first letter.) I would anticipate that a
proper solution for a word of <N> letters would approach a complexity
(or "big O") of N! (read "N factorial").
You are too pessimistic
Is there a solution which could be implemented as a single match,
or would this require some extra code around a match?
I don't see any mention of regexes in your post. I am not sure if that
is what you are after. There is a simple solution to this that falls
directly from your explanation of the problem:
use strict;
use warnings;
use Data:

umper;
sub check {
my ($orig, $target) = @_;
my %c;
use integer;
my @l = split //, $orig;
++$c{$_} for @l;
@l = split //, $target;
for (@l) {
if(exists($c{$_}) and $c{$_}) {
--$c{$_};
}
}
@l = grep { $_ > 0 } values %c;
scalar @l ? 0 : 1;
}
my %check = (
sinan => [ 'nasin', 'nasina', 'lasin' ],
gremlin => [ 'mergnil', 'mgrelanl', 'gremlin' ],
);
for my $k (keys %check) {
for my $t (@{ $check{$k} }) {
print "$k matches $t?: ";
if(check($k, $t)) {
print "Yes\n";
} else {
print "No\n";
}
}
}
A faster solution appears to involve sort(): split both strings, sort
them and compare for equality. According to a benchmark:
use strict;
use Benchmark qw/cmpthese/;
sub check {
my ($orig, $target) = @_;
my %c;
use integer;
my @l = split //, $orig;
++$c{$_} for @l;
@l = split //, $target;
for (@l) {
if(exists($c{$_}) and $c{$_}) {
--$c{$_};
}
}
@l = grep { $_ > 0 } values %c;
scalar @l ? 0 : 1;
}
sub check_sort {
my ($orig, $target) = @_;
my $o = join '', sort split //, $orig;
my $t = join '', sort split //, $target;
return $o eq $t;
}
my %check = (
sinan => [ 'nasin', 'nasina', 'lasin', ],
gremlin => [ 'mergnil', 'mgrelanl', 'gremlin' ],
);
cmpthese(-2, {
histo => sub {
for my $k (keys %check) {
for my $t (@{ $check{$k} }) {
check($k, $t);
}
}
},
sort => sub {
for my $k (keys %check) {
for my $t (@{ $check{$k} }) {
check_sort($k, $t);
}
}
},
});
__END__
Rate histo sort
histo 2280/s -- -43%
sort 3992/s 75% --
This might however be due to a denser implementation of check_sort()
avoiding temporary variables etc.
Also, check_sort() is more correct as it wont falsely report 'sinan' and
'nasina' as matching, which check() does. ;-) I'd write check() thusly:
sub check {
my ($orig, $target) = @_;
my %c;
++$c{$_} for split //, $orig;
--$c{$_} for split //, $target;
return ! grep $_, values %c;
}
This is still slower by roughly 25% than using sort. The 'use integer'
appears to have no effect on the benchmark.
Tassilo