Regexp-engine: ruby vs. perl

Discussion in 'Ruby' started by Axel Schmalowsky, Jul 6, 2009.

  1. Hello list,

    I've got a question about ruby's regexp-engine: I'm wondering why ruby's
    regexp-engine is so much slower than perl's.

    My test file looks like this (status.dat from nagios):

    <status-type> {
    key=value
    ... ~20 further key=value pairs
    }

    This file's size is about 100MB.

    [perl -- v5.8.8]
    time perl -wnl -00 -e 'print if /host_name=monslave\d+/ and
    /service_description=load/ and /servicestatus\s+{[^}]+}/m'
    /tmp/status.dat >/dev/null
    perl -wnl -00 -e /tmp/status.dat > /dev/null 0.90s user 0.11s system
    51% cpu 1.946 total

    [ruby19 -- ruby 1.9.1p129 (2009-05-12 revision 23412) [i686-linux]]
    time ruby19 -wnl -00 -e 'print if /host_name=monslave\d+/ and
    /service_description=load/ and /servicestatus\s+{[^}]+}/m'
    /tmp/status.dat >/dev/null
    ruby19 -wnl -00 -e /tmp/status.dat > /dev/null 5.13s user 0.15s system
    50% cpu 10.449 total


    [ruby18 -- ruby 1.8.7p5000 (2009-02-19) [i686-linux]]
    time ruby18 -wnl -00 -e 'print if /host_name=monslave\d+/ and
    /service_description=load/ and /servicestatus\s+\{[^}]+\}/m'
    /tmp/status.dat >/dev/null
    ruby18 -wnl -00 -e /tmp/status.dat > /dev/null 3.93s user 0.05s system
    48% cpu 8.153 total

    So, both versions of ruby are slower than perl and I'm wondering why.

    I'd like to integrate ruby into my daily work (it's actually a
    wonderful/beautiful language), but that's hard to justify when something
    like the trivial regexp above is about a factor of 4-5 slower than in perl.
    And writing/using regexps is part of my daily work.


    Thanks

    --
    Freundliche Grüße / Kind regards

    Axel Schmalowsky
    Platform Engineer
     
    Axel Schmalowsky, Jul 6, 2009
    #1
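
For readers who want to reproduce the measurement outside a one-liner, the filter from the post can be sketched as a small script. This is only a sketch: the method names `matching_blocks` and `timed_filter` are made up for illustration, the braces are escaped as in the ruby18 variant, and it assumes the blocks in the file are separated by blank lines (which is what `-00` paragraph mode relies on).

```ruby
# Return every blank-line-separated block that matches all three
# patterns from the one-liner. Passing "" as the separator puts
# each_line into paragraph mode, like -00 on the command line.
def matching_blocks(io)
  io.each_line("").select do |block|
    block =~ /host_name=monslave\d+/ &&
      block =~ /service_description=load/ &&
      block =~ /servicestatus\s+\{[^}]+\}/m
  end
end

# Run the filter and report wall-clock seconds alongside the matches.
def timed_filter(io)
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  hits = matching_blocks(io)
  [hits, Process.clock_gettime(Process::CLOCK_MONOTONIC) - started]
end
```

It could be used as `hits, secs = File.open("/tmp/status.dat") { |f| timed_filter(f) }`. Unlike `time` on the command line, this measures only the read-and-match loop, not interpreter startup.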

  2. On 7/6/09, Axel Schmalowsky wrote:
    > I've got a question about ruby's regexp-engine: I'm wondering why ruby's
    > regexp-engine is so much slower than perl's.


    Just guessing here, but usually when regexes are slow it's because of
    backtracking. Since it looks like you don't need any backtracking in
    this little script, you might try wrapping your repetitions in atomic
    groups, (?> ). (And yes, perl doesn't require this hack to be fast.
    Perl's probably applying it for you automatically... perl's regex
    engine is smarter than ruby's; what can I say?) HTH
     
    Caleb Clausen, Jul 6, 2009
    #2
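
As a rough illustration of the suggestion in this reply, the patterns from the original post could be rewritten with an atomic group around each repetition. This is a sketch only: the method name `load_block_atomic?` is invented here, and nothing in the thread measures whether the change actually closes the gap to perl.

```ruby
# The three patterns from the original post, with every repetition
# wrapped in an atomic group (?>...). Once an atomic group has matched,
# the engine does not backtrack into it to try a shorter repetition.
# For these particular patterns the set of matching blocks is unchanged,
# because the character after each repetition can never be matched by
# the repetition itself.
def load_block_atomic?(block)
  !!(block =~ /host_name=monslave(?>\d+)/ &&
     block =~ /service_description=load/ &&
     block =~ /servicestatus(?>\s+)\{(?>[^}]+)\}/m)
end
```

To see what "atomic" means in isolation: `/a(?:bc|b)c/ =~ "abc"` matches, because the engine can back out of `bc` and retry with `b`, while `/a(?>bc|b)c/ =~ "abc"` returns nil, because the group keeps `bc` and the trailing `c` has nothing left to match.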
