Cost of qr// vs m//

Discussion in 'Perl Misc' started by Adrien BARREAU, Mar 28, 2014.

  1. Hello.


    Here is a piece of code:

    ---
    #!/usr/bin/perl

    use strict;
    use warnings;

    use Time::HiRes;
    use Benchmark;

    my @strings = map { sprintf "%08X\n", rand(0xffffffff); } 1 .. 100;

    my $r = qr/some/;

    sub compiled
    {
        /$r/ for (@strings)
    }

    sub live
    {
        /some/ for (@strings)
    }


    my $results = Benchmark::timethese(100000, {
        'compiled' => \&compiled,
        'live' => \&live,
    });

    Benchmark::cmpthese($results);
    ---

    Running it gives me:

    Benchmark: timing 100000 iterations of compiled, live...
      compiled: 2 wallclock secs ( 2.67 usr + 0.00 sys = 2.67 CPU) @ 37453.18/s (n=100000)
          live: 1 wallclock secs ( 0.90 usr + 0.00 sys = 0.90 CPU) @ 111111.11/s (n=100000)

                  Rate compiled     live
    compiled   37453/s       --     -66%
    live      111111/s     197%       --

    On: This is perl 5, version 14, subversion 2 (v5.14.2) built for
    x86_64-linux-gnu-thread-multi



    I don't really understand these results: qr// seems to cost more, but I
    can't find anything in the perldoc about that.

    Am I missing an error in this benchmark?
    Does anybody have any information about the overhead I see?

    If I had to guess, I would suspect the cost of dereferencing a Regexp
    ref. Could that be right?


    Adrien.
     
    Adrien BARREAU, Mar 28, 2014
    #1

  2. Adrien BARREAU wrote on 28.03.2014 13:32:
    > [...]
    > my $r = qr/some/;
    >
    > sub compiled
    > {
    > /$r/ for (@strings)
    > [...]


    That's done twice ...
    Try
    $r for (@strings)
    and you will see a speed advantage for the compiled version.

    Regards, Horst
    --
    <remove S P A M 2x from my email address to get the real one>
     
    Horst-W. Radners, Mar 28, 2014
    #2

  3. Adrien BARREAU <> writes:
    > Here is a piece of code:
    >
    > ---
    > #!/usr/bin/perl
    >
    > use strict;
    > use warnings;
    >
    > use Time::HiRes;
    > use Benchmark;
    >
    > my @strings = map { sprintf "%08X\n", rand(0xffffffff); } 1 .. 100;
    >
    > my $r = qr/some/;
    >
    > sub compiled
    > {
    > /$r/ for (@strings)
    > }
    >
    > sub live
    > {
    > /some/ for (@strings)
    > }
    >
    >
    > my $results = Benchmark::timethese(100000, {
    > 'compiled' => \&compiled,
    > 'live' => \&live,
    > });
    >
    > Benchmark::cmpthese($results);
    > ---
    >
    > Running it gives me:
    >
    > Benchmark: timing 100000 iterations of compiled, live...
    >   compiled: 2 wallclock secs ( 2.67 usr + 0.00 sys = 2.67 CPU) @ 37453.18/s (n=100000)
    >       live: 1 wallclock secs ( 0.90 usr + 0.00 sys = 0.90 CPU) @ 111111.11/s (n=100000)
    >
    >               Rate compiled     live
    > compiled   37453/s       --     -66%
    > live      111111/s     197%       --


    [...]

    > I don't really understand these results: qr// seems to cost more, but
    > I don't find anything in the perldoc about that.


    This mystery is easily explained by looking at the decompiled/
    disassembled internal representation (I've omitted everything except the
    actual loop). 'live' becomes

    [rw@sable]/tmp#perl -MO=Concise,live b.pl
    main::live:
    -  <1> null K/1 ->b
    a     <|> and(other->7) K/1 ->b
    9        <0> iter s ->a
    -        <@> lineseq sK ->-
    7           </> match(/"some.*3"/) v/RTIME ->8
    8           <0> unstack s ->9

    In contrast to that, 'compiled' is

    -  <1> null K/1 ->i
    h     <|> and(other->b) K/1 ->i
    g        <0> iter s ->h
    -        <@> lineseq sK ->-
    e           </> match() vK/RTIME ->f
    d              <|> regcomp(other->e) sK/1 ->e
    b                 <1> regcreset sK/1 ->c
    c                    <0> padsv[$r:601,602] s ->d
    f           <0> unstack s ->g

    For the qr'ed case, it actually calls into the top-level regexp compiler
    routine (pp_regcomp) on each iteration. When the passed argument
    contains a (reference to an) already compiled regexp, that routine takes
    the compiled regexp out of the argument instead of calling the 'real'
    regexp compiler. Judging from the (5.10.1) C code, the compiled regexp
    is also copied to 'a temporary object' for each match.
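The difference between the two op trees can also be checked from inside a script. The following is only a sketch: it assumes B::Concise's documented `walk_output` and `compile` functions, and the helper name `optree_text` is made up for illustration. It renders each sub's op tree into a string and reports whether a `regcomp` op appears in it. The exact op names and flags may differ between Perl versions.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use B::Concise ();

my $r = qr/some/;

sub compiled { /$r/   for @_ }
sub live     { /some/ for @_ }

# Hypothetical helper: render a sub's op tree into a string
# instead of letting B::Concise print it to STDOUT.
sub optree_text {
    my ($code) = @_;
    my $buf = '';
    B::Concise::walk_output(\$buf);              # redirect output into $buf
    B::Concise::compile('-basic', $code)->();    # walk the op tree of $code
    return $buf;
}

my %subs = (compiled => \&compiled, live => \&live);
for my $name (sort keys %subs) {
    my $has_regcomp = optree_text($subs{$name}) =~ /\bregcomp\b/ ? 'yes' : 'no';
    print "$name: regcomp op present: $has_regcomp\n";
}
```

This only shows which ops are executed per iteration, not how expensive they are; the timing still has to come from the benchmark.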

    A more interesting result: Adding a

    my $other = 'some';

    sub interpolated
    {
    /$other/ for @strings;
    }

    shows that this is also faster than the qr// version (at least for me).
    Presumably, this happens because the 'last regex compiled for this op'
    is cached 'in the op' and is re-used without recompilation if the
    'source pattern' didn't really change. In this case, no 'temporary copy'
    is made.
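If the per-op cache works as presumed above, it should stop helping as soon as the source pattern changes between two executions of the same match op. The sketch below is one way to measure that case. The patterns '00' and 'FF' and the helper `count_matches` are assumptions made up for this example, not part of the original benchmark. It alternates between two patterns at a single match op, once as plain strings and once as qr// objects, and leaves the verdict to `cmpthese`.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

my @strings  = map { sprintf "%08X\n", rand(0xffffffff) } 1 .. 100;
my @sources  = ('00', 'FF');              # hypothetical patterns
my @compiled = map { qr/$_/ } @sources;   # the same patterns, precompiled

# Hypothetical helper: the single match op below sees a different
# pattern on every execution, so a 'last pattern' cache cannot hit.
sub count_matches {
    my ($patterns) = @_;
    my $hits = 0;
    for my $str (@strings) {
        for my $pat (@$patterns) {
            $hits++ if $str =~ /$pat/;
        }
    }
    return $hits;
}

# Both variants must agree on the result before timing them.
count_matches(\@sources) == count_matches(\@compiled)
    or die "string and qr// variants disagree\n";

cmpthese(-1, {
    strings => sub { count_matches(\@sources) },
    qr_objs => sub { count_matches(\@compiled) },
});
```

Which variant wins, and by how much, may depend on the Perl version, so the numbers are worth checking on the perl actually in use.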
     
    Rainer Weikusat, Mar 28, 2014
    #3
  4. "Horst-W. Radners" <> writes:
    > Adrien BARREAU wrote on 28.03.2014 13:32:
    >> [...]
    >> my $r = qr/some/;
    >>
    >> sub compiled
    >> {
    >> /$r/ for (@strings)
    >> [...]

    >
    > That's done twice ...
    > Try
    > $r for (@strings)
    > and you will see a speed advantage for the compiled version.


    That's hardly surprising, given that this code doesn't do a regexp match
    at all :).
     
    Rainer Weikusat, Mar 28, 2014
    #4
  5. Rainer Weikusat wrote on 28.03.2014 14:57:
    > "Horst-W. Radners" <> writes:
    >> Adrien BARREAU wrote on 28.03.2014 13:32:
    >>> [...]
    >>> my $r = qr/some/;
    >>>
    >>> sub compiled
    >>> {
    >>> /$r/ for (@strings)
    >>> [...]

    >>
    >> That's done twice ...
    >> Try
    >> $r for (@strings)
    >> and you will see a speed advantage for the compiled version.

    >
    > That's hardly surprising, given that this code doesn't do a regexp match
    > at all :).
    >

    Ah, sorry, I misunderstood the 'used standalone' in the qr section of
    perldoc perlop.
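For reference, here is a small sketch of what using a qr// object 'standalone' looks like in practice. The sample strings are made up for illustration. The object still has to be bound to a string with =~ (or interpolated into m//); a bare `$r` only evaluates the reference, which is always true and never looks at the string.

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @strings = ("some text\n", "nothing here\n", "awesome\n");   # made-up sample data
my $r = qr/some/;

# Bound with =~, the qr// object is matched against the string on the left.
my @bound = grep { $_ =~ $r } @strings;

# Interpolated as the only thing inside m//, it matches $_ the same way.
my @interp = grep { /$r/ } @strings;

# A bare $r tests the reference itself: true for every element,
# regardless of the string's content, because no match is attempted.
my $bare_true = 0;
for (@strings) { $bare_true++ if $r }

printf "bound: %d, interpolated: %d, bare: %d of %d\n",
    scalar @bound, scalar @interp, $bare_true, scalar @strings;
```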

    Regards, Horst
    --
    <remove S P A M 2x from my email address to get the real one>
     
    Horst-W. Radners, Mar 28, 2014
    #5
  6. Dr.Ruud writes:

    On 2014-03-28 14:51, Rainer Weikusat wrote:

    > A more interesting result: Adding a
    >
    > my $other = 'some';
    >
    > sub interpolated
    > {
    > /$other/ for @strings;
    > }
    >
    > shows that this is faster (at least for me) as well. Presumably, this
    > happens because the 'last regex compiled for this op' is cached 'in the
    > op' and it will be re-used without recompilation if the 'source pattern'
    > didn't really change. In this case, no 'temporary copy' is made.


    With a recent Perl:

    perl -Mstrict -wE'
    use Benchmark ":hireswallclock";

    say "\nPerl $]\n";

    my @strings = map { sprintf "%08X\n", rand(0xffffffff); } 1 .. 100;

    my $qr = qr/some/;
    my $some = "some";

    my $results = Benchmark::timethese( -3, {
    compiled => sub { /$qr/ for @strings },
    literal => sub { /some/ for @strings },
    interpol => sub { /$some/ for @strings },
    });

    say "";
    Benchmark::cmpthese($results);
    '

    Perl 5.019006

    Benchmark: running compiled, interpol, literal for at least 3 CPU seconds...
      compiled: 3.14617 wallclock secs ( 3.13 usr + 0.00 sys = 3.13 CPU) @ 19945.69/s (n=62430)
      interpol: 3.03571 wallclock secs ( 3.01 usr + 0.00 sys = 3.01 CPU) @ 59009.30/s (n=177618)
       literal: 3.09564 wallclock secs ( 3.09 usr + 0.00 sys = 3.09 CPU) @ 106284.79/s (n=328420)

                  Rate compiled interpol  literal
    compiled   19946/s       --     -66%     -81%
    interpol   59009/s     196%       --     -44%
    literal   106285/s     433%      80%       --
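As a reading aid for the chart: each cell appears to be the row's rate relative to the column's rate, expressed as a percentage difference. The sketch below recomputes the cells from the rounded rates shown above; the helper name `pct` is made up, and the formula is my understanding of what `cmpthese` prints rather than a quote from its source.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Rounded rates (iterations per second) taken from the chart above.
my %rate = (compiled => 19946, interpol => 59009, literal => 106285);

# Hypothetical helper: how much faster (positive) or slower (negative)
# the row entry is compared to the column entry, in percent.
sub pct {
    my ($row, $col) = @_;
    return sprintf '%.0f%%', 100 * $rate{$row} / $rate{$col} - 100;
}

for my $row (sort keys %rate) {
    for my $col (sort keys %rate) {
        next if $row eq $col;
        printf "%-8s vs %-8s %6s\n", $row, $col, pct($row, $col);
    }
}
```

So '433%' in the literal row means the literal match ran a bit more than five times as often per second as the qr// version, not four times.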

    --
    Ruud
     
    Dr.Ruud, Mar 28, 2014
    #6