Performance implications of using the Switch module

GreenLight · Apr 30, 2004

I hate looking at rows and rows of elsif statements as much as the
next person, but using the Switch module just doesn't cut it. I wrote:

use strict;
use warnings;
use Benchmark;
use Switch;

sub use_if() {
    my ($tag, $value);
    my @parsed;
    my @tags = (
        "TAG001^VALUE001", "TAG002^VALUE002", "TAG003^VALUE003",
        "TAG004^VALUE004", "TAG005^VALUE005",
        "TAG006^VALUE006", "TAG007^VALUE007", "TAG008^VALUE008",
        "TAG009^VALUE009", "TAG010^VALUE010"
    );
    foreach my $next (@tags) {
        ($tag, $value) = ($next =~ /^(.*?)\^(.*?)$/);

        if    ($tag eq 'TAG001') { push @parsed, $value }
        elsif ($tag eq 'TAG002') { push @parsed, $value }
        elsif ($tag eq 'TAG003') { push @parsed, $value }
        elsif ($tag eq 'TAG004') { push @parsed, $value }
        elsif ($tag eq 'TAG005') { push @parsed, $value }
        elsif ($tag eq 'TAG006') { push @parsed, $value }
        elsif ($tag eq 'TAG007') { push @parsed, $value }
        elsif ($tag eq 'TAG008') { push @parsed, $value }
        elsif ($tag eq 'TAG009') { push @parsed, $value }
        elsif ($tag eq 'TAG010') { push @parsed, $value }
        else                     { die "Bad tag!" }
    }
}

sub use_switch() {
    my ($tag, $value);
    my @parsed;
    my @tags = (
        "TAG001^VALUE001", "TAG002^VALUE002", "TAG003^VALUE003",
        "TAG004^VALUE004", "TAG005^VALUE005",
        "TAG006^VALUE006", "TAG007^VALUE007", "TAG008^VALUE008",
        "TAG009^VALUE009", "TAG010^VALUE010"
    );
    foreach my $next (@tags) {
        ($tag, $value) = ($next =~ /^(.*?)\^(.*?)$/);

        switch ($tag) {
            case "TAG001" { push @parsed, $value }
            case "TAG002" { push @parsed, $value }
            case "TAG003" { push @parsed, $value }
            case "TAG004" { push @parsed, $value }
            case "TAG005" { push @parsed, $value }
            case "TAG006" { push @parsed, $value }
            case "TAG007" { push @parsed, $value }
            case "TAG008" { push @parsed, $value }
            case "TAG009" { push @parsed, $value }
            case "TAG010" { push @parsed, $value }
            else { die "Bad tag!" }
        }
    }
}

timethese (100000, {
    "Using 'if'"     => \&use_if,
    "Using 'switch'" => \&use_switch
});

__END__

These subroutines adequately represent tasks performed thousands of
times per day at my client's site.
And the results:

Benchmark: timing 100000 iterations of Using 'if', Using 'switch'...
Using 'if': 17 wallclock secs (16.48 usr + 0.00 sys = 16.48 CPU) @ 6066.49/s (n=100000)
Using 'switch': 153 wallclock secs (150.68 usr + 0.00 sys = 150.68 CPU) @ 663.68/s (n=100000)

Using "switch" was nearly an order of magnitude slower.

Now my real question: Does anyone know if the "forthcoming" Perl6
version (given/when, as described in "perldoc switch") will offer
better performance (that is, is anyone actually using any early
release and can comment upon the performance)?

Steven Kuo · Apr 30, 2004

I'd suggest that you use a hash instead. To me it's easier to read
and maintain:

{
    # %good is visible only inside this block; use_hash closes over it

    my %good = map +('TAG' . sprintf("%03d", $_) => 1 ), ( 1 .. 10 );

    sub use_hash {
        my @parsed;
        my @tags = (
            "TAG001^VALUE001",
            "TAG002^VALUE002",
            "TAG003^VALUE003",
            "TAG004^VALUE004",
            "TAG005^VALUE005",
            "TAG006^VALUE006",
            "TAG007^VALUE007",
            "TAG008^VALUE008",
            "TAG009^VALUE009",
            "TAG010^VALUE010",
        );
        foreach my $next (@tags) {
            my ($tag, $value) = ($next =~ /^(.*?)\^(.*?)$/);
            if ($good{$tag}) {
                push @parsed, $value;
                print "DEBUG: pushed $value into array \@parsed\n";
            } else {
                die "Bad tag ($tag)!";
            }
        }
    }
}

use_hash();

I did not benchmark but it may be faster than what you've already
tried.
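
For anyone who wants to measure it, here is a minimal sketch of how the hash lookup could be compared against the if/elsif chain. It is not from the original posts: the sub names and the shared @tags list are assumptions, the DEBUG print is left out so that it does not dominate the timing, and Switch itself is left out so the script runs without that module installed. No timings are claimed here; run it on your own machine.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same ten records as in the original post, generated instead of typed out.
my @tags = map { sprintf('TAG%03d^VALUE%03d', $_, $_) } 1 .. 10;

# Lookup table of the tags we accept.
my %good = map { ( sprintf('TAG%03d', $_) => 1 ) } 1 .. 10;

sub use_if {
    my @parsed;
    for my $next (@tags) {
        my ($tag, $value) = $next =~ /^(.*?)\^(.*?)$/;
        if    ($tag eq 'TAG001') { push @parsed, $value }
        elsif ($tag eq 'TAG002') { push @parsed, $value }
        elsif ($tag eq 'TAG003') { push @parsed, $value }
        elsif ($tag eq 'TAG004') { push @parsed, $value }
        elsif ($tag eq 'TAG005') { push @parsed, $value }
        elsif ($tag eq 'TAG006') { push @parsed, $value }
        elsif ($tag eq 'TAG007') { push @parsed, $value }
        elsif ($tag eq 'TAG008') { push @parsed, $value }
        elsif ($tag eq 'TAG009') { push @parsed, $value }
        elsif ($tag eq 'TAG010') { push @parsed, $value }
        else                     { die "Bad tag ($tag)!" }
    }
    return @parsed;
}

sub use_hash {
    my @parsed;
    for my $next (@tags) {
        my ($tag, $value) = $next =~ /^(.*?)\^(.*?)$/;
        die "Bad tag ($tag)!" unless $good{$tag};
        push @parsed, $value;
    }
    return @parsed;
}

# A negative count asks Benchmark to run each sub for at least
# that many CPU seconds, then prints a comparison table.
cmpthese(-1, {
    'if/elsif' => sub { my @p = use_if() },
    'hash'     => sub { my @p = use_hash() },
});
```

Both subs return the parsed values, so it is easy to check that they agree before trusting the numbers.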

Tassilo v. Parseval · May 1, 2004

Also sprach GreenLight:

Now my real question: Does anyone know if the "forthcoming" Perl6
version (given/when, as described in "perldoc switch") will offer
better performance (that is, is anyone actually using any early
release and can comment upon the performance)?

Perl6 will have switch/case built into the language and it'll be part of
Perl6's grammar. Perl5's switch, however, is done through cheating.
A source filter makes a preliminary run through your code at compile
time and translates it into code Perl5 can handle. Apparently, the
generated code is not very efficient. You can have a look at it by using
B::Deparse:

ethan@ethan:~$ perl -MSwitch -MO=Deparse
switch($var) {
    case "bla" { print "bla\n" }
    case "blu" { print "blu\n" }
}
^D
S_W_I_T_C_H: while (1) {
    local $_S_W_I_T_C_H;
    &Switch::switch($var);
    if (&Switch::case('bla')) {
        while (1) {
            print "bla\n";
            last S_W_I_T_C_H;
        }
        continue {
            goto C_A_S_E_1;
        }
        last S_W_I_T_C_H;
        C_A_S_E_1: ;
    }
    if (&Switch::case('blu')) {
        while (1) {
            print "blu\n";
            last S_W_I_T_C_H;
        }
        continue {
            goto C_A_S_E_2;
        }
        last S_W_I_T_C_H;
        C_A_S_E_2: ;
    }
}
continue {
    last;
}

This is much slower than simple if/elsif/else chains because every
condition is handled through function calls (which themselves are rather
slow in Perl). Switch::case() then calls a code reference that does the
actual comparison, based upon the type of the 'case' condition. For a
simple string, as in your case, the code reference called from case()
looks like this:

$::_S_W_I_T_C_H =
    sub { my $c_val = $_[0];
          my $c_ref = ref $c_val;
          return $s_val eq $c_val       if $c_ref eq "";
          return in([$s_val],$c_val)    if $c_ref eq 'ARRAY';
          return $c_val->($s_val)       if $c_ref eq 'CODE';
          return $c_val->call($s_val)   if $c_ref eq 'Switch';
          return scalar $s_val=~/$c_val/
                                        if $c_ref eq 'Regexp';
          return scalar $c_val->{$s_val}
                                        if $c_ref eq 'HASH';
          return;
        };

It stands to reason that this has to be slow.

Tassilo
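
To make the cost described above more concrete, here is a simplified emulation of the mechanism. It is only a sketch under stated assumptions: my_switch, my_case and classify are made-up names, not part of the Switch module, and only three of the case types are handled. The point is the shape of the work: one call to set up the closure, then for every case tested one call to my_case plus one call to the closure, which checks the type of its argument before it compares anything.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for Switch::switch and Switch::case.
my $matcher;

sub my_switch {
    my ($s_val) = @_;
    # Build a closure over the switch value, in the style of the
    # code reference shown in the excerpt above.
    $matcher = sub {
        my ($c_val) = @_;
        my $c_ref = ref $c_val;
        return $s_val eq $c_val              if $c_ref eq '';
        return ($s_val =~ /$c_val/ ? 1 : 0)  if $c_ref eq 'Regexp';
        return $c_val->($s_val)              if $c_ref eq 'CODE';
        return;
    };
    return 1;
}

# Every case test costs a sub call plus a closure call.
sub my_case { return $matcher->(@_) }

sub classify {
    my ($tag) = @_;
    my_switch($tag);
    if (my_case('TAG001'))     { return 'first' }
    if (my_case(qr/^TAG\d+$/)) { return 'other tag' }
    return 'bad';
}

print classify($_), "\n" for qw(TAG001 TAG007 nonsense);
```

Compare that with a plain "$tag eq 'TAG001'", which is a single string comparison with no sub call at all.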

Anno Siegel · May 1, 2004

GreenLight said:
These subroutines adequately represent tasks performed thousands of
times per day at my client's site.

These only perform a single push as "payload" of the decision. Is that
a realistic representation of the program?

And the results:

Benchmark: timing 100000 iterations of Using 'if', Using 'switch'...
Using 'if': 17 wallclock secs (16.48 usr + 0.00 sys = 16.48 CPU) @ 6066.49/s (n=100000)
Using 'switch': 153 wallclock secs (150.68 usr + 0.00 sys = 150.68 CPU) @ 663.68/s (n=100000)

Using "switch" was nearly an order of magnitude slower.

Appalling, isn't it? And yet, the result is misleading.

You say that these routines are performed thousands of times per day.
Let's be generous and say they're called 10_000 times. Your benchmark
has called them 100_000 times and lost about 150 seconds through the
use of "switch" (which over-estimates the loss, counting *all* of the
consumed time as overhead). So on 10_000 calls per day you waste a
total of 15 cpu-seconds, out of 24*60*60 = 86400 that are available.
Reason for concern? I think not.
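
The estimate above can be checked with a few lines of Perl. This is just the arithmetic from the benchmark figures quoted in this thread; the 10_000 calls per day is the generous assumption made above, and the variable and sub names are made up for the illustration.

```perl
use strict;
use warnings;

my $bench_calls   = 100_000;        # iterations in the benchmark
my $switch_cpu    = 150.68;         # CPU seconds, 'switch' version
my $if_cpu        = 16.48;          # CPU seconds, 'if' version
my $calls_per_day = 10_000;         # assumed daily workload
my $secs_per_day  = 24 * 60 * 60;   # 86400

# Scale a benchmark total down to the assumed daily workload.
sub daily_cost {
    my ($cpu) = @_;
    return $cpu / $bench_calls * $calls_per_day;
}

my $upper = daily_cost($switch_cpu);            # counts all time as overhead
my $extra = daily_cost($switch_cpu - $if_cpu);  # only the difference to 'if'

printf "upper bound: %.2f CPU seconds per day\n", $upper;
printf "extra cost:  %.2f CPU seconds per day\n", $extra;
printf "share of a day: %.4f%%\n", 100 * $upper / $secs_per_day;
```

Either way the daily cost stays around 15 CPU seconds or less, a tiny fraction of the 86400 seconds in a day.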

Also, if you put a more realistic payload in the benchmarks, the result
is less dramatic. Assume the equivalent of "1 for 1 .. 10_000" instead
of a single "push". (My machine can run that 300 times per second,
so a few thousand times per day would still amount to very little.)
According to my benchmarks, this reduces the advantage of "if" over
"switch" to 7%. Again, no reason for much concern.

I, too, would like to see a light-weight switch statement, comparable
in execution speed to "if", but in many cases the Switch module will
still be adequate. If it isn't, there are a lot of ad-hoc alternatives
in Perl. Some of them, like dispatch tables, are blindingly fast where
applicable. Your branching on fixed strings looks like an invitation
to use a dispatch table.

Anno
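
For readers who have not met the term, a dispatch table is a hash that maps each key to a code reference, so the branch is chosen by one hash lookup instead of a chain of comparisons. The following is a minimal sketch, not code from any of the posters: the handler bodies, %handler_for and parse_records are hypothetical, and in the original benchmark every branch did the same push, in which case the plain lookup hash suggested earlier in the thread is enough.

```perl
use strict;
use warnings;

# One handler per tag. The handlers are invented for the example;
# real ones would do whatever each tag requires.
my %handler_for = (
    TAG001 => sub { my ($v) = @_; return "name=$v" },
    TAG002 => sub { my ($v) = @_; return "size=$v" },
    TAG003 => sub { my ($v) = @_; return lc $v },
);

sub parse_records {
    my @parsed;
    for my $record (@_) {
        my ($tag, $value) = $record =~ /^(.*?)\^(.*?)$/
            or die "Malformed record ($record)!";
        # A single hash lookup replaces the whole if/elsif chain.
        my $handler = $handler_for{$tag}
            or die "Bad tag ($tag)!";
        push @parsed, $handler->($value);
    }
    return @parsed;
}

print "$_\n" for parse_records('TAG001^VALUE001', 'TAG003^VALUE003');
```

Adding a new tag means adding one line to the hash, and the lookup cost does not grow with the number of tags the way a chain of elsif tests does.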
