Regexp: look ahead and match

jm

I was using the following script to find two identical letters in a row,
and now would like to find two letters in a row that are in alphabetic
order, e.g. ab, or mn, or yz, etc. Haven't had any luck changing the
script to do that. I'm thinking that I should be able to look ahead to the
next letter. Is there a way to increment a backreference to do that?
Thanks in advance.


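use strict;
use warnings;

while (my $str = <>) {
    chomp($str);
    # The lookahead (?=\1) leaves the second letter unconsumed,
    # so "aaa" reports two overlapping pairs.
    while ($str =~ /([a-z])(?=\1)/cgi) {
        print "Match\n";
    }
}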
 
Matt Garrish

I don't think it can be done from within a regex (at least I can't think of
a way that won't result in an eval error). Something simple like the
following should work, though:

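while (my $str = <>) {
    my $lval = 0;    # character code of the previous character (0 = none yet)
    foreach my $char ($str =~ /(.)/g) {
        my $ordval = ord($char);
        # Both characters must be letters; otherwise "@A" or "`a" would match too.
        if ($char =~ /[A-Za-z]/ && chr($lval) =~ /[A-Za-z]/) {
            if ($ordval == ($lval + 1)) {
                print chr($lval) . "$char\n";
            }
        }
        $lval = $ordval;
    }
}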

Matt
 
Jay Tilton

: Is there a way to increment a backreference to do that?

/([[:alpha:]])(??{ chr(ord($1)+1) })/

That assumes that alphabetical order and character code order are the same
thing, which isn't necessarily true.
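
For anyone who wants to try that pattern, here is a minimal sketch wrapped in a helper. The sub name `sequential_pairs` is made up for illustration, and the `quotemeta` call, the lookahead, and the letter check are additions to the one-liner above: the string returned by `(??{ })` is compiled as a pattern, so the `[` that follows `Z` in character-code order would otherwise be a syntax error, and the lookahead lets overlapping pairs be reported.

```perl
use strict;
use warnings;

# Hypothetical helper: returns every letter pair in $str whose second
# letter has the character code right after the first.
sub sequential_pairs {
    my ($str) = @_;
    my @pairs;
    # Group 1 consumes one letter. The lookahead captures the character
    # that must follow it without consuming it, so "abc" yields ab and bc.
    while ($str =~ /([[:alpha:]])(?=((??{ quotemeta(chr(ord($1) + 1)) })))/g) {
        # Copy the captures before running another match, which would reset them.
        my ($m, $n) = ($1, $2);
        # Skip "Z[" and "z{": the successor of Z or z is not a letter.
        push @pairs, "$m$n" if $n =~ /[[:alpha:]]/;
    }
    return @pairs;
}

print join(',', sequential_pairs('abc xyz AZ[')), "\n";   # ab,bc,xy,yz
```

Like the original pattern, this still treats character-code order as alphabetical order.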
 
Matt Garrish

Jay Tilton said:

/([[:alpha:]])(??{ chr(ord($1)+1) })/

Brain not function good today. I was using (?{ }) and it just wouldn't work.
I should've gone back to perlre...

Is anyone aware of just how "experimental" these extended regexes are? I
know code can always be rewritten, but I don't like the thought of my
scripts breaking just because they're being run under a newer version of
perl (hence I would generally only use something like the above in a
throw-away script).

Matt
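
One way to sidestep the experimental constructs altogether is to spell out the allowed pairs as an ordinary alternation. The following is only a sketch under that assumption; the names `$pairs`, `$pair_re` and `alphabetic_pairs` are invented for the example. Because the alphabet is listed explicitly, it does not rely on character codes matching alphabetical order beyond the `'a' .. 'y'` range used to build the list.

```perl
use strict;
use warnings;

# Build "ab|bc|...|yz" once: each letter a..y paired with its successor.
my $pairs   = join '|', map { $_ . chr(ord($_) + 1) } 'a' .. 'y';
# A pure lookahead matches zero characters, so /g finds overlapping pairs.
my $pair_re = qr/(?=($pairs))/i;

# Hypothetical helper: returns each in-order pair found in $str.
sub alphabetic_pairs {
    my ($str) = @_;
    my @found;
    push @found, $1 while $str =~ /$pair_re/g;
    return @found;
}

print join(',', alphabetic_pairs('abc xyz AZ[')), "\n";   # ab,bc,xy,yz
```

With the /i flag, mixed-case pairs such as "aB" also match; drop the flag and add the uppercase pairs to the list if that is not wanted.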
 
Jay Tilton

: while (my $str = <>) {
: my $lval = 0;
: foreach my $char ($str =~ /(.)/g) {
: my $ordval = ord($char);
: if ($char =~ /[A-Za-z]/) {
: if ($ordval == ($lval + 1)) {
: print chr($lval) . "$char\n";
: }
: }
: $lval = $ordval;
: }
: }

I'm hip on being cautious around any regex patterns labelled as
"experimental" (re: other branch of this thread). That technique is
comparatively bulletproof. The \G regex meta can tighten it up without
corrupting the spirit of the algorithm.

while( $str =~ /([[:alpha:]])/g ) {
    my $m = $1;
    my $n = chr( ord($m)+1 );
    # \Q...\E stops '[' (after 'Z') and '{' (after 'z') being read as regex syntax.
    print "$m$n\n" if $str =~ /\G\Q$n\E/;
}
 
Matt Garrish

I would only comment that you would pay a price in speed by running the
second regex every time through the loop. Otherwise, as you say, it is much
more compact.

Matt
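
If the second regex is a concern, the comparison can be done on character codes instead, with a single pattern that captures the next letter in a lookahead. This is a sketch rather than a measured improvement; the sub name `pairs_single_pass` is made up, and whether it is actually faster would have to be checked with something like the Benchmark module.

```perl
use strict;
use warnings;

# Hypothetical helper: one regex per step, with the comparison in plain Perl.
sub pairs_single_pass {
    my ($str) = @_;
    my @found;
    # Group 1 consumes a letter; the lookahead captures the following
    # letter without consuming it, so overlapping pairs are still seen.
    while ($str =~ /([[:alpha:]])(?=([[:alpha:]]))/g) {
        push @found, "$1$2" if ord($2) == ord($1) + 1;
    }
    return @found;
}

print join(',', pairs_single_pass('abc xyz AZ[')), "\n";   # ab,bc,xy,yz
```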
 
