Performance implications of using the Switch module

Discussion in 'Perl Misc' started by GreenLight, Apr 30, 2004.

  1. GreenLight

    GreenLight Guest

    I hate looking at rows and rows of elsif statements as much as the
    next person, but using the Switch module just doesn't cut it. I wrote:

    use strict;
    use warnings;
    use Benchmark;
    use Switch;

    sub use_if() {
        my ($tag, $value);
        my @parsed;
        my @tags = (
            "TAG001^VALUE001", "TAG002^VALUE002", "TAG003^VALUE003",
            "TAG004^VALUE004", "TAG005^VALUE005",
            "TAG006^VALUE006", "TAG007^VALUE007", "TAG008^VALUE008",
            "TAG009^VALUE009", "TAG010^VALUE010"
        );
        foreach my $next (@tags) {
            ($tag, $value) = ($next =~ /^(.*?)\^(.*?)$/);

            if    ($tag eq 'TAG001') { push @parsed, $value }
            elsif ($tag eq 'TAG002') { push @parsed, $value }
            elsif ($tag eq 'TAG003') { push @parsed, $value }
            elsif ($tag eq 'TAG004') { push @parsed, $value }
            elsif ($tag eq 'TAG005') { push @parsed, $value }
            elsif ($tag eq 'TAG006') { push @parsed, $value }
            elsif ($tag eq 'TAG007') { push @parsed, $value }
            elsif ($tag eq 'TAG008') { push @parsed, $value }
            elsif ($tag eq 'TAG009') { push @parsed, $value }
            elsif ($tag eq 'TAG010') { push @parsed, $value }
            else                     { die "Bad tag!" }
        }
    }

    sub use_switch() {
        my ($tag, $value);
        my @parsed;
        my @tags = (
            "TAG001^VALUE001", "TAG002^VALUE002", "TAG003^VALUE003",
            "TAG004^VALUE004", "TAG005^VALUE005",
            "TAG006^VALUE006", "TAG007^VALUE007", "TAG008^VALUE008",
            "TAG009^VALUE009", "TAG010^VALUE010"
        );
        foreach my $next (@tags) {
            ($tag, $value) = ($next =~ /^(.*?)\^(.*?)$/);

            switch ($tag) {
                case "TAG001" { push @parsed, $value }
                case "TAG002" { push @parsed, $value }
                case "TAG003" { push @parsed, $value }
                case "TAG004" { push @parsed, $value }
                case "TAG005" { push @parsed, $value }
                case "TAG006" { push @parsed, $value }
                case "TAG007" { push @parsed, $value }
                case "TAG008" { push @parsed, $value }
                case "TAG009" { push @parsed, $value }
                case "TAG010" { push @parsed, $value }
                else { die "Bad tag!" }
            }
        }
    }

    timethese (100000, {
        "Using 'if'"     => \&use_if,
        "Using 'switch'" => \&use_switch
    });

    __END__

    These subroutines adequately represent tasks performed thousands of
    times per day at my client's site.
    And the results:

    Benchmark: timing 100000 iterations of Using 'if', Using 'switch'...
    Using 'if': 17 wallclock secs (16.48 usr + 0.00 sys = 16.48 CPU) @
    6066.49/s (n=100000)
    Using 'switch': 153 wallclock secs (150.68 usr + 0.00 sys = 150.68
    CPU) @ 663.68/s (n=100000)

    Using "switch" was nearly an order of magnitude slower.

    Now my real question: Does anyone know if the "forthcoming" Perl6
    version (given/when, as described in "perldoc Switch") will offer
    better performance (that is, is anyone actually using any early
    release and can comment upon the performance)?
     
    GreenLight, Apr 30, 2004
    #1

  2. GreenLight

    Steven Kuo Guest

    On 30 Apr 2004, GreenLight wrote:

    > I hate looking at rows and rows of elsif statements as much as the
    > next person, but using the Switch module just doesn't cut it. I wrote:


    (snipped) ...

    > if ($tag eq 'TAG001') { push @parsed, $value }
    > elsif ($tag eq 'TAG002') { push @parsed, $value }
    > elsif ($tag eq 'TAG003') { push @parsed, $value }
    > elsif ($tag eq 'TAG004') { push @parsed, $value }
    > elsif ($tag eq 'TAG005') { push @parsed, $value }
    > elsif ($tag eq 'TAG006') { push @parsed, $value }
    > elsif ($tag eq 'TAG007') { push @parsed, $value }
    > elsif ($tag eq 'TAG008') { push @parsed, $value }
    > elsif ($tag eq 'TAG009') { push @parsed, $value }
    > elsif ($tag eq 'TAG010') { push @parsed, $value }
    > else { die "Bad tag!" }



    (snipped) ...

    > switch ($tag) {
    > case "TAG001" { push @parsed, $value }
    > case "TAG002" { push @parsed, $value }
    > case "TAG003" { push @parsed, $value }
    > case "TAG004" { push @parsed, $value }
    > case "TAG005" { push @parsed, $value }
    > case "TAG006" { push @parsed, $value }
    > case "TAG007" { push @parsed, $value }
    > case "TAG008" { push @parsed, $value }
    > case "TAG009" { push @parsed, $value }
    > case "TAG010" { push @parsed, $value }
    > else { die "Bad tag!" }
    > }


    ....
    > timethese (100000, {
    > "Using 'if'" => \&use_if,
    > "Using 'switch'" => \&use_switch
    > });
    >
    > __END__
    >
    > These subroutines adequately represent tasks performed thousands of
    > times per day at my client's site.
    > And the results:
    >
    > Benchmark: timing 100000 iterations of Using 'if', Using 'switch'...
    > Using 'if': 17 wallclock secs (16.48 usr + 0.00 sys = 16.48 CPU) @
    > 6066.49/s (n=100000)
    > Using 'switch': 153 wallclock secs (150.68 usr + 0.00 sys = 150.68
    > CPU) @ 663.68/s (n=100000)
    >
    > Using "switch" was nearly an order of magnitude slower.




    I'd suggest that you use a hash instead. To me it's easier to read
    and maintain:

    {
        # closure defined in this scope

        my %good = map +('TAG' . sprintf("%03d", $_) => 1 ), ( 1 .. 10 );

        sub use_hash {
            my @parsed;
            my @tags = (
                "TAG001^VALUE001",
                "TAG002^VALUE002",
                "TAG003^VALUE003",
                "TAG004^VALUE004",
                "TAG005^VALUE005",
                "TAG006^VALUE006",
                "TAG007^VALUE007",
                "TAG008^VALUE008",
                "TAG009^VALUE009",
                "TAG010^VALUE010",
            );
            foreach my $next (@tags) {
                my ($tag, $value) = ($next =~ /^(.*?)\^(.*?)$/);
                if ($good{$tag}) {
                    push @parsed, $value;
                    print "DEBUG: pushed $value into array \@parsed\n";
                } else {
                    die "Bad tag ($tag)!";
                }
            }
        }
    }

    use_hash();

    I did not benchmark but it may be faster than what you've already
    tried.
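
    A rough sketch of how that comparison might be measured, assuming the
    same ten tags as in the original post. The sub names (parse_with_if,
    parse_with_hash) and the one-second run time are made up for
    illustration, and no timings are claimed here:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Same sample records as in the original post, generated instead of typed out.
my @tags = map { sprintf 'TAG%03d^VALUE%03d', $_, $_ } 1 .. 10;

# Lookup table of valid tags, built once outside the timed code.
my %good;
$good{ sprintf 'TAG%03d', $_ } = 1 for 1 .. 10;

# Hypothetical name: the if/elsif chain from the original benchmark.
sub parse_with_if {
    my @parsed;
    for my $next (@tags) {
        my ($tag, $value) = $next =~ /^(.*?)\^(.*?)$/;
        if    ($tag eq 'TAG001') { push @parsed, $value }
        elsif ($tag eq 'TAG002') { push @parsed, $value }
        elsif ($tag eq 'TAG003') { push @parsed, $value }
        elsif ($tag eq 'TAG004') { push @parsed, $value }
        elsif ($tag eq 'TAG005') { push @parsed, $value }
        elsif ($tag eq 'TAG006') { push @parsed, $value }
        elsif ($tag eq 'TAG007') { push @parsed, $value }
        elsif ($tag eq 'TAG008') { push @parsed, $value }
        elsif ($tag eq 'TAG009') { push @parsed, $value }
        elsif ($tag eq 'TAG010') { push @parsed, $value }
        else                     { die "Bad tag ($tag)!" }
    }
    return @parsed;
}

# Hypothetical name: the hash lookup, without the DEBUG print so that
# I/O does not dominate the measurement.
sub parse_with_hash {
    my @parsed;
    for my $next (@tags) {
        my ($tag, $value) = $next =~ /^(.*?)\^(.*?)$/;
        die "Bad tag ($tag)!" unless $good{$tag};
        push @parsed, $value;
    }
    return @parsed;
}

# A negative count asks Benchmark to run each sub for at least that
# many CPU seconds and then prints a comparison table.
cmpthese(-1, {
    'if chain'    => \&parse_with_if,
    'hash lookup' => \&parse_with_hash,
});
```

    Both subs still spend most of their time in the regex match, so the
    difference between them may well be small for only ten tags.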

    --
    Hope this helps,
    Steven
     
    Steven Kuo, Apr 30, 2004
    #2

  3. Also sprach GreenLight:

    > I hate looking at rows and rows of elsif statements as much as the
    > next person, but using the Switch module just doesn't cut it. I wrote:
    >
    > use strict;
    > use warnings;
    > use Benchmark;
    > use Switch;
    >
    > sub use_if() {

    [...]
    > if ($tag eq 'TAG001') { push @parsed, $value }
    > elsif ($tag eq 'TAG002') { push @parsed, $value }
    > elsif ($tag eq 'TAG003') { push @parsed, $value }
    > elsif ($tag eq 'TAG004') { push @parsed, $value }
    > elsif ($tag eq 'TAG005') { push @parsed, $value }
    > elsif ($tag eq 'TAG006') { push @parsed, $value }
    > elsif ($tag eq 'TAG007') { push @parsed, $value }
    > elsif ($tag eq 'TAG008') { push @parsed, $value }
    > elsif ($tag eq 'TAG009') { push @parsed, $value }
    > elsif ($tag eq 'TAG010') { push @parsed, $value }
    > else { die "Bad tag!" }
    > }
    > }
    >
    > sub use_switch() {

    [...]
    > switch ($tag) {
    > case "TAG001" { push @parsed, $value }
    > case "TAG002" { push @parsed, $value }
    > case "TAG003" { push @parsed, $value }
    > case "TAG004" { push @parsed, $value }
    > case "TAG005" { push @parsed, $value }
    > case "TAG006" { push @parsed, $value }
    > case "TAG007" { push @parsed, $value }
    > case "TAG008" { push @parsed, $value }
    > case "TAG009" { push @parsed, $value }
    > case "TAG010" { push @parsed, $value }
    > else { die "Bad tag!" }
    > }
    > }
    > }
    >
    > timethese (100000, {
    > "Using 'if'" => \&use_if,
    > "Using 'switch'" => \&use_switch
    > });
    >
    > __END__
    >
    > These subroutines adequately represent tasks performed thousands of
    > times per day at my client's site.
    > And the results:
    >
    > Benchmark: timing 100000 iterations of Using 'if', Using 'switch'...
    > Using 'if': 17 wallclock secs (16.48 usr + 0.00 sys = 16.48 CPU) @
    > 6066.49/s (n=100000)
    > Using 'switch': 153 wallclock secs (150.68 usr + 0.00 sys = 150.68
    > CPU) @ 663.68/s (n=100000)
    >
    > Using "switch" was nearly an order of magnitude slower.
    >
    > Now my real question: Does anyone know if the "forthcoming" Perl6
    > version (given/when, as described in "perldoc switch") will offer
    > better performance (that is, is anyone actually using any early
    > release and can comment upon the performance)?


    Perl6 will have switch/case built into the language, as part of
    Perl6's grammar. Perl5's Switch, however, is done through cheating:
    a source filter makes a preliminary pass over your code at compile
    time and translates it into code Perl5 can handle. Apparently, the
    generated code is not very efficient. You can have a look at it by using
    B::Deparse:

    ethan@ethan:~$ perl -MSwitch -MO=Deparse
    switch($var) {
        case "bla" { print "bla\n" }
        case "blu" { print "blu\n" }
    }
    ^D
    S_W_I_T_C_H: while (1) {
        local $_S_W_I_T_C_H;
        &Switch::switch($var);
        if (&Switch::case('bla')) {
            while (1) {
                print "bla\n";
                last S_W_I_T_C_H;
            }
            continue {
                goto C_A_S_E_1;
            }
            last S_W_I_T_C_H;
            C_A_S_E_1: ;
        }
        if (&Switch::case('blu')) {
            while (1) {
                print "blu\n";
                last S_W_I_T_C_H;
            }
            continue {
                goto C_A_S_E_2;
            }
            last S_W_I_T_C_H;
            C_A_S_E_2: ;
        }
    }
    continue {
        last;
    }

    This is much slower than a simple if/elsif/else chain because every
    condition is handled through a function call (and function calls are
    themselves rather slow in perl). Switch::case() then calls a code
    reference that does the actual comparison, based upon the type of the
    'case' condition. For a simple string, as in your case, the code
    reference called from case() looks like this:

    $::_S_W_I_T_C_H =
        sub { my $c_val = $_[0];
              my $c_ref = ref $c_val;
              return $s_val eq $c_val         if $c_ref eq "";
              return in([$s_val],$c_val)      if $c_ref eq 'ARRAY';
              return $c_val->($s_val)         if $c_ref eq 'CODE';
              return $c_val->call($s_val)     if $c_ref eq 'Switch';
              return scalar $s_val=~/$c_val/  if $c_ref eq 'Regexp';
              return scalar $c_val->{$s_val}  if $c_ref eq 'HASH';
              return;
        };

    It stands to reason that this has to be slow.
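
    To get a feel for the cost of that indirection alone, here is a
    minimal sketch that compares a plain 'eq' against the same test routed
    through a named sub and a code reference. The names case_like,
    $matcher, find_inline and find_via_calls are hypothetical stand-ins,
    not Switch's real internals, and the real module does more work per
    case than this:

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my $s_val      = 'TAG010';                                # value being switched on
my @candidates = map { sprintf 'TAG%03d', $_ } 1 .. 10;   # the "case" strings

# Hypothetical stand-in for the comparison closure shown above.
my $matcher = sub { my $c_val = $_[0]; return $s_val eq $c_val };

# Hypothetical stand-in for Switch::case(): one named sub call that
# in turn calls the code reference.
sub case_like { return $matcher->($_[0]) }

# Plain comparisons, as an if/elsif chain would do them.
sub find_inline {
    for my $c (@candidates) { return $c if $s_val eq $c }
    return;
}

# The same comparisons, each paid for with two sub calls.
sub find_via_calls {
    for my $c (@candidates) { return $c if case_like($c) }
    return;
}

# Run each variant for at least one CPU second and print a comparison table.
cmpthese(-1, {
    inline    => \&find_inline,
    via_calls => \&find_via_calls,
});
```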

    Tassilo
    --
    $_=q#",}])!JAPH!qq(tsuJ[{@"tnirp}3..0}_$;//::niam/s~=)]3[))_$-3(rellac(=_$({
    pam{rekcahbus})(rekcah{lrePbus})(lreP{rehtonabus})!JAPH!qq(rehtona{tsuJbus#;
    $_=reverse,s+(?<=sub).+q#q!'"qq.\t$&."'!#+sexisexiixesixeseg;y~\n~~dddd;eval
     
    Tassilo v. Parseval, May 1, 2004
    #3
  4. GreenLight

    Anno Siegel Guest

    GreenLight wrote in comp.lang.perl.misc:
    > I hate looking at rows and rows of elsif statements as much as the
    > next person, but using the Switch module just doesn't cut it. I wrote:
    >
    > use strict;
    > use warnings;
    > use Benchmark;
    > use Switch;
    >
    > sub use_if() {
    > my ($tag, $value);
    > my @parsed;
    > my @tags = (
    > "TAG001^VALUE001", "TAG002^VALUE002", "TAG003^VALUE003",
    > "TAG004^VALUE004", "TAG005^VALUE005",
    > "TAG006^VALUE006", "TAG007^VALUE007", "TAG008^VALUE008",
    > "TAG009^VALUE009", "TAG010^VALUE010"
    > );
    > foreach my $next (@tags) {
    > ($tag, $value) = ($next =~ /^(.*?)\^(.*?)$/);
    >
    > if ($tag eq 'TAG001') { push @parsed, $value }
    > elsif ($tag eq 'TAG002') { push @parsed, $value }
    > elsif ($tag eq 'TAG003') { push @parsed, $value }
    > elsif ($tag eq 'TAG004') { push @parsed, $value }
    > elsif ($tag eq 'TAG005') { push @parsed, $value }
    > elsif ($tag eq 'TAG006') { push @parsed, $value }
    > elsif ($tag eq 'TAG007') { push @parsed, $value }
    > elsif ($tag eq 'TAG008') { push @parsed, $value }
    > elsif ($tag eq 'TAG009') { push @parsed, $value }
    > elsif ($tag eq 'TAG010') { push @parsed, $value }
    > else { die "Bad tag!" }
    > }
    > }
    >
    > sub use_switch() {
    > my ($tag, $value);
    > my @parsed;
    > my @tags = (
    > "TAG001^VALUE001", "TAG002^VALUE002", "TAG003^VALUE003",
    > "TAG004^VALUE004", "TAG005^VALUE005",
    > "TAG006^VALUE006", "TAG007^VALUE007", "TAG008^VALUE008",
    > "TAG009^VALUE009", "TAG010^VALUE010"
    > );
    > foreach my $next (@tags) {
    > ($tag, $value) = ($next =~ /^(.*?)\^(.*?)$/);
    >
    > switch ($tag) {
    > case "TAG001" { push @parsed, $value }
    > case "TAG002" { push @parsed, $value }
    > case "TAG003" { push @parsed, $value }
    > case "TAG004" { push @parsed, $value }
    > case "TAG005" { push @parsed, $value }
    > case "TAG006" { push @parsed, $value }
    > case "TAG007" { push @parsed, $value }
    > case "TAG008" { push @parsed, $value }
    > case "TAG009" { push @parsed, $value }
    > case "TAG010" { push @parsed, $value }
    > else { die "Bad tag!" }
    > }
    > }
    > }
    >
    > timethese (100000, {
    > "Using 'if'" => \&use_if,
    > "Using 'switch'" => \&use_switch
    > });
    >
    > __END__
    >
    > These subroutines adequately represent tasks performed thousands of
    > times per day at my client's site.


    These only perform a single push as "payload" of the decision. Is that
    a realistic representation of the program?

    > And the results:
    >
    > Benchmark: timing 100000 iterations of Using 'if', Using 'switch'...
    > Using 'if': 17 wallclock secs (16.48 usr + 0.00 sys = 16.48 CPU) @
    > 6066.49/s (n=100000)
    > Using 'switch': 153 wallclock secs (150.68 usr + 0.00 sys = 150.68
    > CPU) @ 663.68/s (n=100000)
    >
    > Using "switch" was nearly an order of magnitude slower.


    Appalling, isn't it? And yet, the result is misleading.

    You say that these routines are performed thousands of times per day.
    Let's be generous and say they're called 10_000 times. Your benchmark
    has called them 100_000 times and lost about 150 seconds through the
    use of "switch" (which over-estimates the loss, counting *all* of the
    consumed time as overhead). So on 10_000 calls per day you waste a
    total of 15 cpu-seconds, out of 24*60*60 = 86400 that are available.
    Reason for concern? I think not.
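
    The estimate can be written out as a few lines of Perl. The CPU
    figures come from the benchmark output quoted above; the 10_000 calls
    per day is the generous assumption made here, and the helper name
    daily_cpu is made up for this sketch:

```perl
use strict;
use warnings;

# Figures from the quoted benchmark output and the estimate above.
my $bench_calls     = 100_000;        # iterations in the benchmark
my $switch_cpu      = 150.68;         # CPU seconds, 'switch' version
my $if_cpu          = 16.48;          # CPU seconds, 'if' version
my $calls_per_day   = 10_000;         # assumed daily call volume
my $seconds_per_day = 24 * 60 * 60;   # 86400

# Scale a benchmark total down to one day's worth of calls.
sub daily_cpu {
    my ($total_cpu) = @_;
    return $total_cpu / $bench_calls * $calls_per_day;
}

my $upper_bound = daily_cpu($switch_cpu);             # all time counted as overhead
my $net_cost    = daily_cpu($switch_cpu - $if_cpu);   # only the extra time

printf "upper bound:  %.2f CPU seconds per day\n", $upper_bound;
printf "net cost:     %.2f CPU seconds per day\n", $net_cost;
printf "share of day: %.4f%%\n", 100 * $upper_bound / $seconds_per_day;
```

    That gives about 15 CPU seconds per day as the upper bound, and
    roughly 13.4 once the time the "if" version needs anyway is
    subtracted.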

    Also, if you put a more realistic payload in the benchmarks, the result
    is less dramatic. Assume the equivalent of "1 for 1 .. 10_000" instead
    of a single "push". (My machine can run that 300 times per second,
    so a few thousand times per day would still amount to very little.)
    According to my benchmarks, this reduces the advantage of "if" over
    "switch" to 7%. Again, no reason for much concern.

    I, too, would like to see a light-weight switch statement, comparable
    in execution speed to "if", but in many cases the Switch module will
    still be adequate. If it isn't, there are a lot of ad-hoc alternatives
    in Perl. Some of them, like dispatch tables, are blindingly fast where
    applicable. Your branching on fixed strings looks like an invitation
    to use a dispatch table.
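
    A minimal sketch of such a dispatch table for the tags in question.
    All handlers do the same push here because that is all the benchmark
    does; in a real program each tag would presumably get its own code.
    The names %dispatch and handle_record are invented for this example:

```perl
use strict;
use warnings;

my @parsed;

# One code reference per tag. A hash lookup replaces the whole
# if/elsif chain, whatever the number of tags.
my %dispatch;
for my $n (1 .. 10) {
    $dispatch{ sprintf 'TAG%03d', $n } = sub { push @parsed, $_[0] };
}

# Split one "TAG^VALUE" record and hand the value to the tag's handler.
sub handle_record {
    my ($record) = @_;
    my ($tag, $value) = $record =~ /^(.*?)\^(.*?)$/
        or die "Malformed record ($record)!";
    my $handler = $dispatch{$tag}
        or die "Bad tag ($tag)!";
    $handler->($value);
    return;
}

handle_record($_) for 'TAG001^VALUE001', 'TAG010^VALUE010';
print "@parsed\n";   # prints "VALUE001 VALUE010"
```

    An unknown tag dies at the hash lookup, which takes the place of the
    final "else" branch in the original code.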

    Anno
     
    Anno Siegel, May 1, 2004
    #4