M
Michele Dondi
This is a question of mine I asked in PerlMonks
(<http://perlmonks.org>) that turned into a very interesting
discussion, so interestin IMHO that I feel like reporting it here:
http://perlmonks.org/?node_id=625977
I would like to reproduce the whole thread, but I direct anyone who
could be interested there. For completeness I will adapt to pure text
the original post and paste it hereafter.
---------------------------------------------------------------------
In a recent discussion (<in ctt,
PerlTeX was mentioned (and I'm also mentioning it here because it is
not that widely known and IMHO it would deserve being) in relation to
shuffling, at which point in reply to a posted piece of code I wrote
about the
perldoc -q shuffle
FAQ entry, talking in particular about both Fisher-Yates and
List::Util's shuffle(). Now, I happened to check the source code for
the latter module, and since it has pure Perl code for the given
function, I noticed that in particular it is as follows:
sub shuffle (@) {
my @a=\(@_);
my $n;
my $i=@_;
map {
$n = rand($i--);
(${$a[$n]}, $a[$n] = $a[$i])[0];
} @_;
}
Now, at first sight it is a nice piece of Obfu, ain't it? Well, then
if you look at it, it's easy to see how it works and in particular
that it is still Fisher-Yates. But of course you have to think about
it for a while... and I wondered why it is like that...
To be fair, I think that taking references in the first place is to
avoid duplicating data in memory that could be heavy in memory
usage... but then I also thought that all that playing with references
would impose a performance penalty and -for once- I suppose that
performance does matter. So I tried the following Benchmark, comparing
a naive and fairly readable implementation of the algorithm with
List::Util's:
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw/:all :hireswallclock/;
sub naive (@) {
my @l=@_;
for (reverse 1..$#l) {
my $r=int rand($_+1);
@l[$_,$r]=@l[$r,$_];
}
@l;
}
sub listutil (@) {
my @a=\(@_);
my $n;
my $i=@_;
map {
$n = rand($i--);
(${$a[$n]}, $a[$n] = $a[$i])[0];
} @_;
}
cmpthese -60, { map { $_ => "$_ 1..1000" } qw/naive listutil/ };
__END__
The results are as follows:
C:\temp>perl lus.pl
Rate naive listutil
naive 588/s -- -14%
listutil 684/s 16% --
So that pretty much may answer my question as to why the sub is
implemented like that... but then this raises the question as to how
could one come up with such an idea, because I'm sure I wouldn't
have...
Any comments?
(<http://perlmonks.org>) that turned into a very interesting
discussion, so interestin IMHO that I feel like reporting it here:
http://perlmonks.org/?node_id=625977
I would like to reproduce the whole thread, but I direct anyone who
could be interested there. For completeness I will adapt to pure text
the original post and paste it hereafter.
---------------------------------------------------------------------
In a recent discussion (<in ctt,
PerlTeX was mentioned (and I'm also mentioning it here because it is
not that widely known and IMHO it would deserve being) in relation to
shuffling, at which point in reply to a posted piece of code I wrote
about the
perldoc -q shuffle
FAQ entry, talking in particular about both Fisher-Yates and
List::Util's shuffle(). Now, I happened to check the source code for
the latter module, and since it has pure Perl code for the given
function, I noticed that in particular it is as follows:
sub shuffle (@) {
my @a=\(@_);
my $n;
my $i=@_;
map {
$n = rand($i--);
(${$a[$n]}, $a[$n] = $a[$i])[0];
} @_;
}
Now, at first sight it is a nice piece of Obfu, ain't it? Well, then
if you look at it, it's easy to see how it works and in particular
that it is still Fisher-Yates. But of course you have to think about
it for a while... and I wondered why it is like that...
To be fair, I think that taking references in the first place is to
avoid duplicating data in memory that could be heavy in memory
usage... but then I also thought that all that playing with references
would impose a performance penalty and -for once- I suppose that
performance does matter. So I tried the following Benchmark, comparing
a naive and fairly readable implementation of the algorithm with
List::Util's:
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw/:all :hireswallclock/;
sub naive (@) {
my @l=@_;
for (reverse 1..$#l) {
my $r=int rand($_+1);
@l[$_,$r]=@l[$r,$_];
}
@l;
}
sub listutil (@) {
my @a=\(@_);
my $n;
my $i=@_;
map {
$n = rand($i--);
(${$a[$n]}, $a[$n] = $a[$i])[0];
} @_;
}
cmpthese -60, { map { $_ => "$_ 1..1000" } qw/naive listutil/ };
__END__
The results are as follows:
C:\temp>perl lus.pl
Rate naive listutil
naive 588/s -- -14%
listutil 684/s 16% --
So that pretty much may answer my question as to why the sub is
implemented like that... but then this raises the question as to how
could one come up with such an idea, because I'm sure I wouldn't
have...
Any comments?