A couple of questions regarding runtime generation of REGEXP's

sln

I'm probably going to use some wrong terms here but I
hope to give enough detail that I can get a definitive
resolution to this, once and for all.

Basically I'm writing a sub that wants to take a regular
expression as a parameter. It then blindly operates on data,
matching, and possibly substituting.

Apparently qr// will only function on the matching side, something like this:

# works
$rx = qr/\Q$sometext\E/s;
$data =~ /$rx/;
# or: $data =~ $rx;

But this:

# does not work, no way no how
$rx = qr{s/\Q$sometext\E/junk/g};
$data =~ $rx;

Even though qr{s/\Q$sometext\E/junk/g} passes without warnings or errors,
and even though the substitution is constant (i.e., no runtime $1, $2, etc.),
it never matches.

I mean, I could see a failure scenario if using $1.. on the substitution side,
because those would still be undefined at that point, but if it's given a constant it should work IMO.
And if it does compile, like the above does, it should work.

The fall back is to use an eval "" where something like this is possible:

$rx = "s/\\Q$sometext(.*?)\\E/junk\$1/g";
$expression = "\$res = \$data =~ $rx";
eval $expression;
if ($res) {
...
}

But eval is 2 to 4 times slower.
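For anyone who wants to check the speed claim on their own machine, here is a minimal sketch using the core Benchmark module. The sample text, the sample data, and the sub names with_eval and with_qr are made up for illustration, and the iteration count is arbitrary; the ratio will likely vary with Perl version and input.

```perl
use strict;
use warnings;
use Benchmark qw(timethese);

# Hypothetical sample inputs, not taken from the original post.
my $sometext = 'a.b';
my $sample   = 'xx a.b yy a.b zz axb';

# String eval: the s/// is parsed and compiled again on every call.
sub with_eval {
    my ($data, $text) = @_;
    eval q{ $data =~ s/\Q$text\E/junk/g };
    die $@ if $@;
    return $data;
}

# Precompiled pattern: only the regex is quoted with qr//,
# and the s/// operator itself is written out in the code.
sub with_qr {
    my ($data, $rx) = @_;
    $data =~ s/$rx/junk/g;
    return $data;
}

my $rx = qr/\Q$sometext\E/;

# Prints one timing line per approach.
timethese(20_000, {
    eval_string => sub { with_eval($sample, $sometext) },
    qr_pattern  => sub { with_qr($sample, $rx) },
});
```

Both subs work on a copy of the data, so every iteration starts from the same string.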

The only thing "dynamic" about the regular expression above is the
substitution of $1.. Surely this could be taken into account when, say, using
the qr// construct, couldn't it? Is it really breaking the rules, or would it
factor down to an eval anyway in that case? But the constant substitution,
I don't see why that can't work.

Is there any way possible the substitution side will work?

TIA,
sln
 
Martien Verbruggen

On Mon, 03 Nov 2008 00:24:30 GMT,
I'm probably going to use some wrong terms here but I
hope to give enough detail that I can get a definitive
resolution to this, once and for all.

Basically I'm writing a sub that wants to take a regular
expression as a parameter. It then blindly operates on data,
matching, and possibly substituting.

Apparently qr// will only function on the matching side, something like this:

# works
$rx = qr/\Q$sometext\E/s;
$data =~ /$rx/;
# or: $data =~ $rx;

The matching is done by the // operator, not because you happened to use
qr// a bit earlier.

But this:

# does not work, no way no how
$rx = qr{s/\Q$sometext\E/junk/g};
$data =~ $rx;

A bare regex is simply not going to work on the right hand side of a =~
operator. It's the operator on the right hand side that does the
matching, not the =~ operator itself. That only binds an expression
instead of $_ to that matching operator.

More detail:

From perlop:

Binary "=~" binds a scalar expression to a pattern match. Certain
operations search or modify the string $_ by default. This operator
makes that kind of operation work on some other string. The right
argument is a search pattern, substitution, or transliteration.

Note that 'pattern' or 'regular expression' are not part of the allowed
right arguments.

Further down in the same document, under "Quote and Quote-like
Operators":

    Customary  Generic        Meaning        Interpolates
        ''       q{}          Literal             no
        ""      qq{}          Literal             yes
        ``      qx{}          Command             yes*
                qw{}         Word list            no
        //       m{}       Pattern match          yes*
                qr{}          Pattern             yes*
                 s{}{}      Substitution          yes*
                tr{}{}    Transliteration         no (but see below)
        <<EOF                 here-doc            yes*

And a little further down again:

Regexp Quote-Like Operators

Here are the quote-like operators that apply to pattern matching and
related activities.
[snip]

Martien
 
Tad J McClellan

Basically I'm writing a sub that wants to take a regular
expression as a parameter. It then blindly operates on data,
matching, and possibly substituting.

Apparently qr// will only function on the matching side, something like this:


"qr" stands for "quote regular expression" and the so called
"matching side" of s/// is the part that is a regular expression.

qr will work fine there.

(the other "side" is the "replacement string", ie. it is not
a regular expression at all.)

# does not work, no way no how


Of course not. You are trying to quote something that is not
a regular expression.

$rx = qr{s/\Q$sometext\E/junk/g};

That regular expression will match if the string contains:
an "s" character followed by
a "/" character followed by
the literal contents of $sometext followed by
a "/" character followed by
a "j" character followed by
a "u" character followed by
...

So that will match if:

my $data = "s/$sometext/junk/g";

$data =~ $rx;

my $rx = qr/\Q$sometext\E/; # quote only the regex part
$data =~ s/$rx/junk/g; # works fine
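To make the difference concrete, here is a small sketch that runs both forms side by side. The value of $sometext and the test strings are made up for illustration.

```perl
use strict;
use warnings;

my $sometext = 'a+b';   # hypothetical text containing a regex metacharacter

# Quoting the whole "s/.../junk/g" yields a pattern that matches
# those literal characters; it is not a stored substitution.
my $literal = qr{s/\Q$sometext\E/junk/g};
my $matches_source = ("s/a+b/junk/g" =~ $literal) ? 1 : 0;
my $matches_data   = ("1 a+b 2"      =~ $literal) ? 1 : 0;

# Quoting only the pattern and spelling out s///g performs the substitution.
my $rx    = qr/\Q$sometext\E/;
my $data  = "1 a+b 2 a+b 3";
my $count = ($data =~ s/$rx/junk/g);   # s///g returns the number of replacements

print "literal matches its own source: $matches_source\n";
print "literal matches ordinary data:  $matches_data\n";
print "replaced $count occurrence(s): $data\n";
```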

And if it does compile, like the above does, it should work.


It does work (but only if $data actually contains the characters listed above).

Is there any way possible the substitution side will work?


Yes. See above.
 
Tim Greer

# does not work, no way no how
$rx = qr{s/\Q$sometext\E/junk/g};
$data =~ $rx;

Looks like you're unintentionally trying to run a regex within the
regex, where the regex within is actually just trying to match a string
(not a functional regex).
 
Michele Dondi

Looks like you're unintentionally trying to run a regex within the
regex, where the regex within is actually just trying to match a string
(not a functional regex).

(S)he's just trying to "save" a substitution as a first-order object,
and (s)he blindly tried some "random" syntax that, of course, is not
going to work.


Michele
 
Michele Dondi

I'm probably going to use some wrong terms here but I
hope to give enough detail that I can get a definitive
resolution to this, once and for all.

Basically I'm writing a sub that wants to take a regular
expression as a parameter. It then blindly operates on data,
matching, and possibly substituting. [cut]
# does not work, no way no how
$rx = qr{s/\Q$sometext\E/junk/g};

Actually, this comes up oh so often! Others duly explained to you
what's going on. Bottom line is, you *can't* "save" a substitution as
a first-order object of the language. The replacement part of a
substitution, though, is "simply" a string: well, either that or code
- if the /e modifier is supplied. In both cases you can *think* of it,
possibly at the expense of a tiny wrapper layer, as a sub. Thus a
solution to your problem, albeit not quite as "slim" as you may have
hoped for, may be given in terms of a pair consisting of a regex and
a sub. Sounds reasonable?
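A minimal sketch of that idea, assuming nothing beyond core Perl: the "stored substitution" is a pair of a compiled pattern and a sub that returns the replacement. The helper name apply_pair and the two sample pairs are hypothetical.

```perl
use strict;
use warnings;

# Each pair holds a compiled pattern and a sub returning the replacement text.
my @const   = (qr/\bfoo\b/,     sub { 'bar' });      # constant replacement
my @capture = (qr/(\d+)-(\d+)/, sub { "$2-$1" });    # replacement built from $1 and $2

# Hypothetical helper: apply a (regex, sub) pair globally to a copy of the string.
# The sub is called from the replacement side, so $1, $2, ... refer to the
# match currently being replaced.
sub apply_pair {
    my ($string, $regex, $code) = @_;
    $string =~ s/$regex/$code->()/ge;
    return $string;
}

print apply_pair('foo food foo', @const), "\n";
print apply_pair('10-20 and 3-4', @capture), "\n";
```

The constant case just wraps the string in a sub, which is the "tiny wrapper layer"; no string eval is involved in either case.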


Michele
 
Michele Dondi

A bare regex is simply not going to work on the right hand side of a =~
operator. It's the operator on the right hand side that does the
matching, not the =~ operator itself. That only binds an expression
instead of $_ to that matching operator.

This is simply not true:

$ perl -E '$r=qr/\w+\s(\w+)\s\w+/;
"foo bar baz" =~ $r and say $1'
bar

In fact...

More detail:

From perlop:

Binary "=~" binds a scalar expression to a pattern match. Certain
operations search or modify the string $_ by default. This operator
makes that kind of operation work on some other string. The right
argument is a search pattern, substitution, or transliteration.
              ^^^^^^^^^^^^^^

It's simply *ad hoc* in Perl 5.
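For reference, the same section of perlop also says that when the right argument is an expression rather than a search pattern, substitution, or transliteration, it is interpreted as a search pattern at run time. A small sketch of what that means in practice (the variable names and sample data are made up):

```perl
use strict;
use warnings;

my $data = "foo bar baz";

my $rx  = qr/(\w+)\s(\w+)/;   # compiled pattern object
my $str = 'ba[rz]';           # plain string holding pattern text

# Neither right operand below is an m//, s/// or tr/// operator;
# each expression is treated as a search pattern at run time.
my $second_word = $data =~ $rx  ? $2 : undef;
my $str_matched = $data =~ $str ? 1  : 0;

# A substitution still needs the s/// operator written out;
# the qr// object only supplies the pattern side.
(my $copy = $data) =~ s/$rx/$2 $1/;

print "second word: $second_word\n";
print "string used as pattern matched: $str_matched\n";
print "swapped: $copy\n";
```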


Michele
 
sln

"qr" stands for "quote regular expression" and the so called
"matching side" of s/// is the part that is a regular expression.

qr will work fine there.

(the other "side" is the "replacement string", ie. it is not
a regular expression at all.)




Of course not. You are trying to quote something that is not
a regular expression.



That regular expression will match if the string contains:
an "s" character followed by
a "/" character followed by
the literal contents of $sometext followed by
a "/" character followed by
a "j" character followed by
a "u" character followed by
...

So that will match if:

my $data = "s/$sometext/junk/g";



my $rx = qr/\Q$sometext\E/; # quote only the regex part
$data =~ s/$rx/junk/g; # works fine




It does work (but only if $data actually contains the characters listed above).




Yes. See above.

That's clear, no surprises then.
Thanks!

sln
 
sln

I'm probably going to use some wrong terms here but I
hope to give enough detail that I can get a definitive
resolution to this, once and for all.

Basically I'm writing a sub that wants to take a regular
expression as a parameter. It then blindly operates on data,
matching, and possibly substituting. [cut]
# does not work, no way no how
$rx = qr{s/\Q$sometext\E/junk/g};

Actually, this comes up oh so often! Others duly explained to you
what's going on. Bottom line is, you *can't* "save" a substitution as
a first-order object of the language. The replacement part of a
substitution, though, is "simply" a string: well, either that or code
- if the /e modifier is supplied. In both cases you can *think* of it,
possibly at the expense of a tiny wrapper layer, as a sub. Thus a
solution to your problem, albeit not quite as "slim" as you may have
hoped for, may be given in terms of a pair consisting of a regex and
a sub. Sounds reasonable?


Michele

No matter how I look at it, the replacement is still a string,
constructed in the scope of the block that invokes the regexp engine.

So s/.../$somereplacement$1$2$3/ can be valid.
Or s/.../somesub($1,$2,$3)/e can be valid.

And only the regular expression itself, i.e. qr//, can be compiled ahead of the =~ if it is constant.
In that case the s prefix and the /g modifier have no meaning, nor does /e I take it,
because they are not part of the regular expression, whereas a modifier like /i is,
because it acts on the regular expression.

To me, then, it is a misnomer to call 's/$regx/$txt/g' a regular expression, since
it can't be known before the scope block that invokes it, but qr// can be.

In my opinion, s///g should be allowed by qr{} using the scoping block it was created
in, and later correctly used (s///g) within the context of a block that invokes the engine.

This may violate 'first-order object' of the language. But then why are code extensions allowed?
qr/(?{ code })/ and what is the scoping for them? To me this looks like a parsing issue, and
if allowed it would internally result in a dynamic code issue like eval.
I don't know that this 'code' extension isn't treated as a literal anyway.

I don't know if invoking a 'sub' (/e) is going to be any better than having to
parse through a passed-in argument list for the proper form. In all cases, it looks
like the replacement text cannot include special vars unless an eval is used
at runtime.

Can you give an example of your regex and a sub solution?

Thanks.

sln
 
Michele Dondi

In my opinion, s///g should be allowed by qr{} using the scoping block it was created
in, and later correctly used (s///g) within the context of a block that invokes the engine.

This may violate 'first-order object' of the language. But then why are code extensions allowed?
qr/(?{ code })/ and what is the scoping for them? To me this looks like a parsing issue, and
if allowed it would internally result in a dynamic code issue like eval.
I don't know that this 'code' extension isn't treated as a literal anyway.

Do not misunderstand me, I'm all with you: would you write a Perl
extension that allows to treat substitutions as first order objects of
the language? I would cherish that... Unfortunately I *for one*
haven't the slightest idea of where one could begin!

In the meanwhile we must be happy with a clumsier solution, like...

I don't know if invoking a 'sub' (/e) is going to be any better than having to
parse through a passed-in argument list for the proper form. In all cases, it looks
like the replacement text cannot include special vars unless an eval is used
at runtime.

Can you give an example of your regex and a sub solution?

... sure:

my %subst = ( regex => qr/.../, code => sub { ... } );

And then you use that to perform the substitution. You may even make
that the core data of a class, thus allowing objects like $subst with
a suitable ->apply($string) method.
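One way such an object might look, as a rough sketch only: the class name Subst, its fields, and the global flag are hypothetical and not from any CPAN module.

```perl
use strict;
use warnings;

# Hypothetical minimal class wrapping a regex/code pair.
package Subst;

sub new {
    my ($class, %args) = @_;
    my $self = {
        regex  => $args{regex},
        code   => $args{code} || sub { '' },   # default: replace with nothing
        global => $args{global} ? 1 : 0,
    };
    return bless $self, $class;
}

# Modifies the caller's string in place (through the @_ alias) and
# returns what s/// returns: the number of replacements, or false.
sub apply {
    my $self = shift;
    my ($regex, $code) = @{$self}{qw(regex code)};
    return $self->{global}
        ? ($_[0] =~ s/$regex/$code->()/ge)
        : ($_[0] =~ s/$regex/$code->()/e);
}

package main;

my $subst = Subst->new(
    regex  => qr/(\whis)/i,
    code   => sub { "<$1>" },   # $1 is the current match when this runs
    global => 1,
);

my $string = "This and this";
my $n = $subst->apply($string);
print "$n replacement(s): $string\n";
```

Passing the string through @_ rather than copying it keeps the in-place behaviour of a plain s///; a variant that returns a modified copy would be just as easy to write.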


Michele
 
sln

Do not misunderstand me, I'm all with you: would you write a Perl
extension that allows to treat substitutions as first order objects of
the language? I would cherish that... Unfortunately I *for one*
haven't the slightest idea of where one could begin!

In the meanwhile we must be happy with a clumsier solution, like...


... sure:

my %subst = ( regex => qr/.../, code => sub { ... } );

And then you use that to perform the substitution. You may even make
that the core data of a class, thus allowing objects like $subst with
a suitable ->apply($string) method.


Michele

I'm in your debt. There is virtually no overhead in calling that
sub for the substitution, and it executes in context. There is no
comparison with eval; this is the way to go for me.

I have already resigned myself to the fact that it's the caller's responsibility
to ensure proper regexp usage, so I am just providing the rope.

In my circumstances, it's all about performance. Any added indirection,
calls, assignments, etc. will be a hazard in my usage. I won't get into
the gory details unless you want to know.

Below is raw, isolated test code; method 2 has no error checking.
I already have an object function that an array of regex/code subs could be passed to,
where it then operates on data highly bound to the object.

Introducing a new object, RegxProc in the simple case below, would alleviate parsing
and internal processing, but an unknown object type might not be accessible.
I could internalize the RegxProc in the existing class, providing a wrapper method I guess,
but the caller could not specify search/replace/replace global without additional parameter
parsing.

This is a relief for me though. Thanks a lot...

sln

-----------------

use strict;
use warnings;

# method 1
# ------------
# my $data = "This is some data, this gets substituted";
# my $subst = {
# 'regex' => qr/(\whis)/i,
# 'code' => sub { print "$1\n"; return 'That'; }
# };
# $data =~ s/$subst->{'regex'}/ &{$subst->{'code'}}/ge;
# print "$data\n";


# method 2
# -------------
my $data = "This(1) is some data, this(2) gets substituted,
and so does this(3).";

print "\nData = $data\n\n";

my $rxp = new RegxProc (
    'regex' => qr/(\whis\(\d\))/si,
    'code'  => sub { print "\ncode: \$1 = $1\n"; return 'That'; }
);
if ($rxp->search ($data)) {
    print "search worked\n";
}
if ($rxp->replace ($data)) {
    print "replace worked, data = $data\n";
}
if ($rxp->replace_g ($data)) {
    print "global replace worked, data = $data\n";
}

package RegxProc;
use vars qw(@ISA);
@ISA = qw();

sub new
{
    my ($class, @args) = @_;
    my $self = {};
    while (my ($name, $val) = splice (@args, 0, 2)) {
        if ('regex' eq lc $name) {
            $self->{regex} = $val;
        }
        elsif ('code' eq lc $name) {
            $self->{code} = $val;
        }
    }
    return bless ($self, $class);
}
sub search
{
    my $self = shift;
    return 0 unless (defined $_[0]);
    return $_[0] =~ /$self->{regex}/;
}
sub replace
{
    my $self = shift;
    return 0 unless (defined $_[0]);
    return $_[0] =~ s/$self->{regex}/&{$self->{code}}/e;
}
sub replace_g
{
    my $self = shift;
    return 0 unless (defined $_[0]);
    return $_[0] =~ s/$self->{regex}/&{$self->{code}}/ge;
}

__END__

Data = This(1) is some data, this(2) gets substituted,
and so does this(3).

search worked

code: $1 = This(1)
replace worked, data = That is some data, this(2) gets substituted,
and so does this(3).

code: $1 = this(2)

code: $1 = this(3)
global replace worked, data = That is some data, That gets substituted,
and so does That.
 
sln

Do not misunderstand me, I'm all with you: would you write a Perl
extension that allows to treat substitutions as first order objects of
the language? I would cherish that... Unfortunately I *for one*
haven't the slightest idea of where one could begin!

In the meanwhile we must be happy with a clumsier solution, like...


... sure:

my %subst = ( regex => qr/.../, code => sub { ... } );

And then you use that to perform the substitution. You may even make
that the core data of a class, thus allowing objects like $subst with
a suitable ->apply($string) method.


Michele
[snip]

This is a relief for me though. Thanks a lot...
[snip]

I settled on this lightweight class that handles the substitution with some
variable types. Still, it has minimal error checking to reduce overhead.
I added a few methods to generalize access, and it benchmarks pretty well.

See any potential problems or performance issues ?

sln

----------------------
use strict;
use warnings;

my $data = "This(1) is some data, this(2) gets substituted,
and so does this(3).";
my $tempdata = $data;

my $rxp = RxP->new (
    'regex' => qr/(\whis\(\d\))/si,
    'code'  => sub { print "code: \$1 = $1\n"; return 'That'; },
    'type'  => 'r'
);

# test apply, set/get_type methods
if (1)
{
    print "\n","-"x20,"\nData = $data\n\n";

    $rxp->set_type('s');
    if ($rxp->apply ($data)) {
        print "Apply '".$rxp->get_type."' worked, data = $data\n\n";
    }
    $rxp->set_type('r');
    if ($rxp->apply ($data)) {
        print "Apply '".$rxp->get_type."' worked, data = $data\n\n";
    }
    $rxp->set_type('g');
    if ($rxp->apply ($data)) {
        print "Apply '".$rxp->get_type."' worked, data = $data\n\n";
    }
}

# test direct call and search, replace, replace_g methods
if (1)
{
    $rxp->set_type('r');
    $data = $tempdata;
    print "\n","-"x20,"\nData = $data\n\n";

    if ($rxp->{'dflt_sub'}($rxp, $data)) {
        print "Direct {dflt_sub} worked, data = $data\n\n";
    }
    if ($rxp->search ($data)) {
        print "Search worked, data = $data\n\n";
    }
    if ($rxp->replace ($data)) {
        print "Replace worked, data = $data\n\n";
    }
    if ($rxp->replace_g ($data)) {
        print "Global replace worked, data = $data\n\n";
    }
}


package RxP;
use vars qw(@ISA);
@ISA = qw();

sub new
{
    my ($class, @args) = @_;
    my $self = {
        'dflt_sub' => \&search,
        'type'     => 's'
    };
    while (my ($name, $val) = splice (@args, 0, 2)) {
        if ('regex' eq lc $name) {
            $self->{'regex'} = $val;
        }
        elsif ('code' eq lc $name) {
            $self->{'code'} = $val;
        }
        elsif ('type' eq lc $name && $val =~ /(s|r|g)/i) {
            set_type ($self, $1);
        }
    }
    return bless ($self, $class);
}
sub get_type
{
    return $_[0]->{'type'};
}
sub set_type
{
    return 0 unless (defined $_[1]);
    if ($_[1] =~ /(s|r|g)/i) {
        $_[0]->{'dflt_sub'} = {
            's' => \&search,
            'r' => \&replace,
            'g' => \&replace_g
        }->{$1};
        $_[0]->{'type'} = $1;
        return 1;
    }
    return 0;
}
sub apply
{
    return 0 unless (defined $_[1]);
    return &{$_[0]->{'dflt_sub'}};
}
sub search
{
    return 0 unless (defined $_[1]);
    return $_[1] =~ /$_[0]->{'regex'}/;
}
sub replace
{
    return 0 unless (defined $_[1]);
    return $_[1] =~ s/$_[0]->{'regex'}/&{$_[0]->{'code'}}/e;
}
sub replace_g
{
    return 0 unless (defined $_[1]);
    return $_[1] =~ s/$_[0]->{'regex'}/&{$_[0]->{'code'}}/ge;
}

__END__

--------------------
Data = This(1) is some data, this(2) gets substituted,
and so does this(3).

Apply 's' worked, data = This(1) is some data, this(2) gets substituted,
and so does this(3).

code: $1 = This(1)
Apply 'r' worked, data = That is some data, this(2) gets substituted,
and so does this(3).

code: $1 = this(2)
code: $1 = this(3)
Apply 'g' worked, data = That is some data, That gets substituted,
and so does That.


--------------------
Data = This(1) is some data, this(2) gets substituted,
and so does this(3).

code: $1 = This(1)
Direct {dflt_sub} worked, data = That is some data, this(2) gets substituted,
and so does this(3).

Search worked, data = That is some data, this(2) gets substituted,
and so does this(3).

code: $1 = this(2)
Replace worked, data = That is some data, That gets substituted,
and so does this(3).

code: $1 = this(3)
Global replace worked, data = That is some data, That gets substituted,
and so does That.
 
sln

On Mon, 03 Nov 2008 23:01:35 GMT, (e-mail address removed) wrote:

[snip]
It's better to have the regexp fail for some other reason than
undefinedness.

Performance benchmarks are very good. Thx...


sub new
{
    my ($class, @args) = @_;
    my $self = {
        'regex'    => '',
        'code'     => '',
        'type'     => 's',
        'dflt_sub' => \&search,
    };
    while (my ($name, $val) = splice (@args, 0, 2)) {
        next if (!defined $val);
        if ('regex' eq lc $name) {
            $self->{'regex'} = $val;
        }
        elsif ('code' eq lc $name) {
            $self->{'code'} = $val;
        }
        elsif ('type' eq lc $name && $val =~ /(s|r|g)/i) {
            set_type ($self, $1);
        }
    }
    return bless ($self, $class);
}
[snip]
 
Michele Dondi

I settled on this lightweight class that handles the substitution with some
variable types. Still, it has minimal error checking to reduce overhead.
I added a few methods to generalize access, and it benchmarks pretty well.

See any potential problems or performance issues ?

I don't have time enough to dig through your implementation, but it
seems to me that you set up a fairly complete thingie: now,
performance is not generally a concern of mine. If it is for you, then
just profile your app. For the rest, I can only suggest you to set up
a test suite as well. As far as your implementation complies, you may
consider yourself reasonably safe, ain't it?
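As a starting point for such a test suite, here is a minimal sketch using the core Test::More module. The %subst pair and the replace_g helper are stand-ins modelled on the examples in this thread, not the actual RxP class.

```perl
use strict;
use warnings;
use Test::More tests => 4;

# Hypothetical regex/code pair under test.
my %subst = (
    regex => qr/(\whis\(\d\))/i,
    code  => sub { 'That' },
);

# Hypothetical helper: global replace on a copy, returning (count, result).
sub replace_g {
    my ($string) = @_;
    my ($regex, $code) = @subst{qw(regex code)};
    my $count = ($string =~ s/$regex/$code->()/ge);
    return ($count, $string);
}

my ($count, $out) = replace_g("This(1) is some data, this(2) too.");
is($count, 2, 'two matches replaced');
is($out, 'That is some data, That too.', 'replacement text is correct');

($count, $out) = replace_g("nothing to see");
ok(!$count, 'no match reports false');
is($out, 'nothing to see', 'string left unchanged when nothing matches');
```

Saved as a .t file, a script like this can be run directly with perl or through prove.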


Michele
 
sln

I don't have time enough to dig through your implementation, but it
seems to me that you set up a fairly complete thingie: now,
performance is not generally a concern of mine. If it is for you, then
just profile your app. For the rest, I can only suggest you to set up
a test suite as well. As far as your implementation complies, you may
consider yourself reasonably safe, ain't it?


Michele

Never heard of test suites/cases. On my really big app, I'm making changes
so fast it scares me. I miss a compiler as opposed to a syntax checker.
No, no. NUnit isn't for me. I live on the edge, die on the edge, one
man - one piece of art...

sln
 
sln

I don't have time enough to dig through your implementation, but it
seems to me that you set up a fairly complete thingie: now,
performance is not generally a concern of mine. If it is for you, then
just profile your app. For the rest, I can only suggest you to set up
a test suite as well. As far as your implementation complies, you may
consider yourself reasonably safe, ain't it?


Michele

I've already integrated this package into my bigger package and have exported
a thin wrapper sub that instantiates objects which are used as a drop-in
by the caller, specifically as a parameter (a ref from NewRxP) that
gets passed to the larger package method. Like a macro, almost.

I'm learning the gory details of classes in Perl, something I didn't think
I would need to know beyond casual knowledge. I'm a hard-core Windows
MFC C++ programmer; it's how I make my living as a contractor.
Periodically, I'm laid off, like now. Perl is like candy to me, sweet to the
tongue, especially regular expressions. It's almost addicting. Unemployment is
running out, nobody is calling, I'm sure I will have to give this up and work
as a brick layer, my long past profession, again. So, if I disappear, it's
been nice knowing you!

sln
 
sln

Ran into issues that were fixed. I just want to close this out with
the correct default 'code' sub, changed types, and added 'search_g()' method.
Thanks.

sln



sub NewRxP
{
    my ($regex,$code,$type) = @_;
    if (defined $code && ref($code) ne 'CODE') {
        my $temp = $type;
        $type = $code;
        $code = $temp;
    }
    return RxP->new('regex'=>$regex,'code'=>$code,'type'=>$type);
}


# =================

package RxP;
use vars qw(@ISA);
@ISA = qw();

sub new
{
    my ($class, @args) = @_;
    my $self = {
        'regex'    => '',
        'code'     => sub{''},
        'type'     => 's',
        'dflt_sub' => \&search
    };
    while (my ($name, $val) = splice (@args, 0, 2)) {
        next if (!defined $val);
        if ('regex' eq lc $name) {
            $self->{'regex'} = $val;
        }
        elsif ('code' eq lc $name && ref($val) eq 'CODE') {
            $self->{'code'} = $val;
        }
        elsif ('type' eq lc $name && $val =~ /(sg|gs|rg|gr|s|r)/i) {
            set_type ($self, lc $1);
        }
    }
    return bless ($self, $class);
}
sub get_type
{
    return $_[0]->{'type'};
}
sub set_type
{
    return 0 unless (defined $_[1]);
    if ($_[1] =~ /(sg|gs|rg|gr|s|r)/i) {
        $_[0]->{'dflt_sub'} = {
            's'  => \&search,
            'sg' => \&search_g,
            'gs' => \&search_g,
            'r'  => \&replace,
            'rg' => \&replace_g,
            'gr' => \&replace_g
        }->{lc $1};
        $_[0]->{'type'} = lc $1;
        return 1;
    }
    return 0;
}
sub apply
{
    return 0 unless (defined $_[1]);
    return &{$_[0]->{'dflt_sub'}};
}
sub search
{
    return 0 unless (defined $_[1]);
    return $_[1] =~ /$_[0]->{'regex'}/;
}
sub search_g
{
    return 0 unless (defined $_[1]);
    return $_[1] =~ /$_[0]->{'regex'}/g;
}
sub replace
{
    return 0 unless (defined $_[1]);
    return $_[1] =~ s/$_[0]->{'regex'}/&{$_[0]->{'code'}}/e;
}
sub replace_g
{
    return 0 unless (defined $_[1]);
    return $_[1] =~ s/$_[0]->{'regex'}/&{$_[0]->{'code'}}/ge;
}
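A note for anyone reusing search_g: a match with /g behaves differently depending on the calling context, and since the method simply returns the result of the match, that context comes from the caller. The standalone sketch below shows the two behaviours using the thread's sample data; it does not use the RxP class itself.

```perl
use strict;
use warnings;

my $data = "This(1) is some data, this(2) gets substituted, and so does this(3).";
my $rx   = qr/(\whis\(\d\))/i;

# Scalar context: each evaluation finds the next match and remembers where
# it stopped in pos($data); it returns false and resets pos after the last one.
my @seen;
while ($data =~ /$rx/g) {
    push @seen, "$1 ends at " . pos($data);
}

# List context: a single evaluation returns every captured group at once.
my @all = $data =~ /$rx/g;

print "$_\n" for @seen;
print "all: @all\n";
```

So in a boolean test, repeated calls step through the matches one at a time, whereas an array assignment collects all of them in one call.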
 
