laziest / fastest way to match last characters of a string

Discussion in 'Perl Misc' started by hofer, Sep 11, 2008.

  1. hofer

    hofer Guest

    Hi,
    Let's look at the following example:

    $text = "Today is a nice day";
    $end = "day";

    print "text ends with $end" if $text =~ /$end$/;

    Would the regular expression be efficient for long strings?

    The alternative is a little more awkward to type:

    print "text ends with $end" if substr($text,-length($end)) eq $end;
    # I didn't try this line, but I think it should work

    Is there any core module containing something like
    print "text ends with $end" if endswith($text,$end);


    thanks and bye


    H
    hofer, Sep 11, 2008
    #1

  2. hofer

    J. Gleixner Guest

    hofer wrote:
    > Hi,
    > Let's look at following example:
    >
    > $text = "Today is a nice day";
    > $end = "day";
    >
    > print "text ends with $end" if $text =~ /$end$/;
    >
    > Would the regular expression be efficient for long strings?


    Why not benchmark some different alternatives to see? Your 'long
    strings' might not be all that long.
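
    A benchmark along these lines could be set up with the core Benchmark
    module. This is only a sketch: the test string, its length, and the
    names in %candidates are assumptions made for illustration, and the
    timings will depend on the perl version and the data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: a long string that ends with the suffix.
my $end  = "day";
my $text = ("Today is a nice " x 100_000) . $end;

# The two approaches from the original question, wrapped as code refs.
my %candidates = (
    regex  => sub { $text =~ /\Q$end\E\z/ ? 1 : 0 },
    substr => sub { substr($text, -length($end)) eq $end ? 1 : 0 },
);

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, \%candidates);
```

    Running it on strings of a realistic size shows whether the difference
    matters at all for the data in question.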

    >
    > The alternative is a little more awkward to type
    >
    > print "text ends with $end" if substr($text,-length($end)) eq $end;
    > # I didn't try this line, but I think it should work
    >
    > Is there any core module containing something like
    > print "text ends with $end" if endswith($text,$end);


    I don't know if it'll be faster, but using length and index would be
    one alternative; another would be substr.

    perldoc -f index
    perldoc -f length
    perldoc -f substr
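
    As a rough sketch of those alternatives, here are two helpers built on
    substr and on rindex (the reverse-searching sibling of index). The
    names ends_with and ends_with_rindex are hypothetical; no such
    functions exist in core Perl.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text ends with the literal string $end.
sub ends_with {
    my ($text, $end) = @_;
    return 1 if $end eq '';                      # substr($text, -0) would be the whole string
    return 0 if length($end) > length($text);    # suffix longer than the text
    return substr($text, -length($end)) eq $end ? 1 : 0;
}

# Same check with rindex: the last occurrence of $end must start
# exactly length($end) characters before the end of $text.
sub ends_with_rindex {
    my ($text, $end) = @_;
    my $pos = length($text) - length($end);
    return 0 if $pos < 0;
    return rindex($text, $end) == $pos ? 1 : 0;
}

print "text ends with day\n" if ends_with("Today is a nice day", "day");
```

    Note that when the suffix is absent, rindex keeps searching backwards
    through the rest of the string, so the substr version does less work
    in the failing case.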
    J. Gleixner, Sep 11, 2008
    #2

  3. hofer

    Ben Morrow Guest

    Quoth hofer <>:
    >
    > $text = "Today is a nice day";
    > $end = "day";
    >
    > print "text ends with $end" if $text =~ /$end$/;
    >
    > Would the regular expression be efficient for long strings?


    ~% perl -Mre=debug -e'$end="day"; "Today is a nice day" =~ /$end$/'
    Freeing REx: `","'
    Compiling REx `day$'
    size 4 Got 36 bytes for offset annotations.
    first at 1
    1: EXACT <day>(3)
    3: EOL(4)
    4: END(0)
    anchored "day"$ at 0 (checking anchored isall) minlen 3
    Offsets: [4]
    1[3] 0[0] 4[1] 5[0]
    Guessing start of match, REx "day$" against "Today is a nice day"...
    Found anchored substr "day"$ at offset 16...
    Starting position does not contradict /^/m...
    Guessed: match at offset 16
    Freeing REx: `"day$"'

    The first thing it tries is a direct match against the last three
    characters, which is as fast as it gets.

    Ben

    --
    Outside of a dog, a book is a man's best friend.
    Inside of a dog, it's too dark to read.
    Groucho Marx
    Ben Morrow, Sep 11, 2008
    #3
  4. hofer <> wrote:
    >$text = "Today is a nice day";
    >$end = "day";
    >print "text ends with $end" if $text =~ /$end$/;
    >
    >Would the regular expression be efficient for long strings?
    >
    >The alternative is a little more awkward to type
    >
    >print "text ends with $end" if substr($text,-length($end)) eq $end;
    ># I didn't try this line, but I think it should work


    These two versions do very different things. If you need REs, then the
    second version won't do you any good.
    If you want textual comparison without RE-behaviour then the first
    version is wrong unless you have a very limited set of possible data.

    Use the one that matches your needs. Usually correct is more important
    than fast.
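
    A small illustration of the difference, using made-up data: when $end
    contains a regex metacharacter, the uninterpolated-literal assumption
    breaks and the two versions disagree.

```perl
use strict;
use warnings;

my $end = "a.b";    # "." is a regex metacharacter: it matches any character

my %result;
for my $text ("xxa.b", "xxaXb") {
    my $as_regex   = $text =~ /$end$/                     ? 1 : 0;
    my $as_literal = substr($text, -length($end)) eq $end ? 1 : 0;
    $result{$text} = { regex => $as_regex, literal => $as_literal };
    print "${text}: regex=$as_regex literal=$as_literal\n";
}
```

    For "xxaXb" the regex version reports a match and the substr version
    does not, because the "." in $end is treated as a wildcard.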

    jue
    Jürgen Exner, Sep 11, 2008
    #4
  5. hofer

    hofer Guest

    On Sep 11, 8:51 pm, Jürgen Exner <> wrote:

    > >print "text ends with $end" if $text =~ /$end$/;

    >
    > >print "text ends with $end" if substr($text,-length($end)) eq $end;

    >
    > These two versions do very different things. If you need REs, then the
    > second version won't do you any good.
    > If you want textual comparison without RE-behaviour then the first
    > version is wrong unless you have a very limited set of possible data.
    >
    > Use the one that matches your needs. Usually correct is more important
    > than fast.
    >

    Hi Juergen,

    In fact I don't need REs, and the ending strings won't contain
    backslashes, dots or other characters that could be taken as RE
    metacharacters.

    So in my special case both are interchangeable.

    For me the RE is visually more intuitive than the substr with the
    -length(), and with substr the string to be searched for has to be
    entered twice if it is a constant and not a variable.

    I just wondered if perl has a built-in string_ends_with() function or
    whether REs would be much slower.

    As Ben pointed out, the first thing the RE search does is check
    the end of the string, so I guess I'll stick with REs.


    bye


    N
    hofer, Sep 11, 2008
    #5
  6. hofer

    Ben Morrow Guest

    Quoth hofer <>:
    > On Sep 11, 8:51 pm, Jürgen Exner <> wrote:
    >
    > > >print "text ends with $end" if $text =~ /$end$/;

    > >
    > > >print "text ends with $end" if substr($text,-length($end)) eq $end;

    > >
    > > These two versions do very different things. If you need REs, then the
    > > second version won't do you any good.
    > > If you want textual comparison without RE-behaviour then the first
    > > version is wrong unless you have a very limited set of possible data.
    > >
    > > Use the one that matches your needs. Usually correct is more important
    > > than fast.
    > >

    > Hi Juergen,
    >
    > In fact I don't need REs, and the ending strings won't contain
    > backslashes, dots or other characters that could be taken as RE
    > metacharacters.
    >
    > So in my special case both are interchangeable.


    Be aware that /$/ has rather odd semantics: it will match before a
    newline at the end of the string, in a somewhat misguided attempt to
    handle reading from a filehandle without chomping. If this is an issue
    (if your string might contain newlines, and you *don't* want to match
    them like this), use /\z/ instead.

    Also, it's always worth interpolating a variable that's meant to be
    taken literally like this:

    /\Q$end\E$/

    just in case.
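
    A short sketch of both points, with illustrative variable names and
    data that are not from the thread: an unchomped line shows the
    difference between $ and \z, and a dotted suffix shows what \Q...\E
    buys you.

```perl
use strict;
use warnings;

my $end  = "day";
my $line = "Today is a nice day\n";    # unchomped, as if read from a filehandle

my $dollar = $line =~ /\Q$end\E$/  ? 1 : 0;    # $ also matches before a trailing newline
my $slashz = $line =~ /\Q$end\E\z/ ? 1 : 0;    # \z matches only at the very end

print "\$ matches: $dollar, \\z matches: $slashz\n";

# \Q...\E quotes metacharacters, so the "." below only matches a literal dot.
my $dotted   = "a.b";
my $quoted   = "xxaXb" =~ /\Q$dotted\E\z/ ? 1 : 0;
my $unquoted = "xxaXb" =~ /$dotted\z/     ? 1 : 0;

print "quoted: $quoted, unquoted: $unquoted\n";
```

    Here $dollar is 1 and $slashz is 0, since the string really ends in a
    newline; $quoted is 0 and $unquoted is 1, since only the unquoted
    pattern lets "." match the "X".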

    > For me the RE is visually more intuitive than the substr with the
    > -length(), and with substr the string to be searched for has to be
    > entered twice if it is a constant and not a variable.


    The second is a nonissue. Allowing you to type things only once is what
    variables are *for* :).

    > I just wondered if perl has a built-in string_ends_with() function or
    > whether REs would be much slower.


    Well, yes; it's called a regex.

    Ben

    --
    All persons, living or dead, are entirely coincidental.
    Kurt Vonnegut
    Ben Morrow, Sep 12, 2008
    #6
