laziest / fastest way to match last characters of a string


hofer

Hi,
Let's look at the following example:

$text = "Today is a nice day";
$end = "day";

print "text ends with $end" if $text =~ /$end$/;

Would the regular expression be efficient for long strings?

The alternative is a little more awkward to type

# I didn't try this line, but it should work I think
print "text ends with $end" if substr($text,-length($end)) eq $end;

Is there any core module containing something like
print "text ends with $end" if endswith($text,$end);


thanks and bye


H
 

J. Gleixner

hofer said:
Hi,
Let's look at the following example:

$text = "Today is a nice day";
$end = "day";

print "text ends with $end" if $text =~ /$end$/;

Would the regular expression be efficient for long strings?

Why not benchmark some different alternatives to see? Your 'long
strings' might not be all that long.
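
A rough way to set up such a benchmark is the core Benchmark module. This is only a sketch: the sub names, the 100,000-character test string and the choice of candidates are assumptions for illustration, and the numbers will depend on the perl version and the data.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Hypothetical test data: a long string that ends with the suffix.
my $end  = "day";
my $text = ("x" x 100_000) . $end;

# Three candidate checks; each returns 1 if $text ends with $end, else 0.
sub by_regex  { return $text =~ /$end$/ ? 1 : 0 }
sub by_substr { return substr($text, -length($end)) eq $end ? 1 : 0 }
sub by_index  {
    my $pos = length($text) - length($end);
    return ($pos >= 0 && index($text, $end, $pos) == $pos) ? 1 : 0;
}

# Run each candidate for at least 1 CPU second and print a comparison table.
cmpthese(-1, {
    regex  => \&by_regex,
    substr => \&by_substr,
    index  => \&by_index,
});
```

It is probably worth rerunning this with strings of the length you actually expect before drawing conclusions.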

The alternative is a little more awkward to type

# I didn't try this line, but it should work I think
print "text ends with $end" if substr($text,-length($end)) eq $end;

Is there any core module containing something like
print "text ends with $end" if endswith($text,$end);

Don't know if it'll be faster, but using length and index would be an
alternative, another would be substr.

perldoc -f index
perldoc -f length
perldoc -f substr
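
The length/index idea could look roughly like the following. The helper name ends_with is hypothetical (it is not a Perl built-in), and the sketch assumes a plain literal comparison is what is wanted.

```perl
use strict;
use warnings;

# Hypothetical helper: true if $text ends with the literal string $end.
sub ends_with {
    my ($text, $end) = @_;
    my $pos = length($text) - length($end);   # where $end would have to start
    return 0 if $pos < 0;                     # $end is longer than $text
    # index() starts searching at $pos, so only that position can match.
    return index($text, $end, $pos) == $pos ? 1 : 0;
}

print "text ends with day\n" if ends_with("Today is a nice day", "day");
```

The explicit check for a negative position matters: without it, a suffix longer than the text would make index() search from the start of the string.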
 

Ben Morrow

Quoth hofer said:
$text = "Today is a nice day";
$end = "day";

print "text ends with $end" if $text =~ /$end$/;

Would the regular expression be efficient for long strings?

~% perl -Mre=debug -e'$end="day"; "Today is a nice day" =~ /$end$/'
Freeing REx: `","'
Compiling REx `day$'
size 4 Got 36 bytes for offset annotations.
first at 1
1: EXACT <day>(3)
3: EOL(4)
4: END(0)
anchored "day"$ at 0 (checking anchored isall) minlen 3
Offsets: [4]
1[3] 0[0] 4[1] 5[0]
Guessing start of match, REx "day$" against "Today is a nice day"...
Found anchored substr "day"$ at offset 16...
Starting position does not contradict /^/m...
Guessed: match at offset 16
Freeing REx: `"day$"'

The first thing it tries is a direct match against the last three
characters, which is as fast as it gets.

Ben
 

Jürgen Exner

hofer said:
$text = "Today is a nice day";
$end = "day";
print "text ends with $end" if $text =~ /$end$/;

Would the regular expression be efficient for long strings?

The alternative is a little more awkward to type

# I didn't try this line, but it should work I think
print "text ends with $end" if substr($text,-length($end)) eq $end;

These two versions do very different things. If you need REs, then the
second version won't do you any good.
If you want textual comparison without RE-behaviour then the first
version is wrong unless you have a very limited set of possible data.

Use the one that matches your needs. Usually correct is more important
than fast.
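
To illustrate the difference, here is a small sketch. The suffix "d.y" and the two helper names are made up for the demonstration.

```perl
use strict;
use warnings;

# Treats $end as a pattern: metacharacters in it keep their RE meaning.
sub ends_with_re {
    my ($text, $end) = @_;
    return $text =~ /$end$/ ? 1 : 0;
}

# Treats $end as plain text: a character-for-character comparison.
sub ends_with_eq {
    my ($text, $end) = @_;
    return substr($text, -length($end)) eq $end ? 1 : 0;
}

my $text = "Today is a nice day";
my $end  = "d.y";   # hypothetical suffix containing a metacharacter

# As a pattern, "." matches any character, so "d.y" matches "day";
# as plain text, "day" is not equal to "d.y".
printf "regex: %d, substr/eq: %d\n",
    ends_with_re($text, $end), ends_with_eq($text, $end);
```

This prints "regex: 1, substr/eq: 0", so the two versions already disagree on a single dot.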

jue
 

hofer

Jürgen Exner said:
These two versions do very different things. If you need REs, then the
second version won't do you any good.
If you want textual comparison without RE-behaviour then the first
version is wrong unless you have a very limited set of possible data.

Use the one that matches your needs. Usually correct is more important
than fast.

Hi Juergen,

In fact I don't need REs, and the ending strings won't contain
backslashes, dots or other characters that could be taken as RE
metacharacters.

So in my special case both are interchangeable.

For me the RE is visually more intuitive than the substr with
-length(). With substr, the string to be searched for also has to be
entered twice if it is a constant and not a variable.

I just wondered if perl has a built-in string_ends_with() function or
whether REs would be much slower.

As Ben pointed out, the first thing the RE search does is check
the end of the string, so I guess I'll stick with REs.


bye


N
 

Ben Morrow

Quoth hofer said:
Hi Juergen,

In fact I don't need REs, and the ending strings won't contain
backslashes, dots or other characters that could be taken as RE
metacharacters.

So in my special case both are interchangeable.

Be aware that /$/ has rather odd semantics: it will match before a
newline at the end of the string, in a somewhat misguided attempt to
handle reading from a filehandle without chomping. If this is an issue
(if your string might contain newlines, and you *don't* want to match
them like this), use /\z/ instead.
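
A quick sketch of the difference, using a made-up unchomped line as input (the helper names are hypothetical):

```perl
use strict;
use warnings;

# /$/ matches at the end of the string or just before a trailing newline.
sub matches_dollar { return $_[0] =~ /day$/  ? 1 : 0 }

# /\z/ matches only at the very end of the string.
sub matches_z      { return $_[0] =~ /day\z/ ? 1 : 0 }

my $line = "Today is a nice day\n";   # e.g. read from a file, not chomped

printf "with \$: %d, with \\z: %d\n", matches_dollar($line), matches_z($line);
```

For this input the first test succeeds and the second fails, because the string really ends in a newline and not in "day".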

Also, it's always worth interpolating a variable that's meant to be
taken literally like this:

/\Q$end\E$/

just in case.
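
Wrapped in a sub, that might look like the sketch below. The name ends_with is hypothetical, and \z is used in place of $ here on the assumption that a trailing newline should not be skipped.

```perl
use strict;
use warnings;

# Hypothetical helper: literal suffix test done with a regex.
sub ends_with {
    my ($text, $end) = @_;
    # \Q...\E escapes any metacharacters in $end;
    # \z anchors the match at the true end of the string.
    return $text =~ /\Q$end\E\z/ ? 1 : 0;
}

print ends_with("file.txt", ".txt"), "\n";   # literal ".txt" suffix
print ends_with("filestxt", ".txt"), "\n";   # "." does not match "s" here
```

Without the \Q...\E, the second call would also succeed, since an unescaped dot matches any character.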

For me the RE is visually more intuitive than the substr with
-length(). With substr, the string to be searched for also has to be
entered twice if it is a constant and not a variable.

The second is a nonissue. Allowing you to type things only once is what
variables are *for* :).

I just wondered if perl has a built-in string_ends_with() function or
whether REs would be much slower.

Well, yes; it's called a regex.

Ben
 
