about condensed regular expression syntax

raksha34

Hi all,

I have to match the following types of strings:

PTY
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9)

Here's my attempt at condensing the regular expression:

use strict;
use warnings;

my @Data = qw(
PTY
COUNT2
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9)
);

my %h = qw(
[ ]
{ }
( )
< >
);

my $pin_re = q/\A[a-zA-Z]\w*(?:([<[({])\d+$h{\1})?\z/;

for my $var (@Data) {
    if ($var =~ m/$pin_re/) {
        print "$var match\n";
    }
    else {
        print "$var NOmatch\n";
    }
}
**************************** END of CODE **************

This is what I get:

PTY match
COUNT2 match
IN_B match
IN[3] NOmatch
ADD<2> NOmatch
SUM{25} NOmatch
MULT(9) NOmatch
****************************** END of OUTPUT **********

The reason for writing the regular expression in this format
was to avoid having to use a lot of ORs.

But it doesn't work.

Can you suggest some way of fixing this?

Thanks,
Rakesh
 
Jürgen Exner

i have to match the following types of strings:

my @Data = qw(
PTY
COUNT2
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9)
);

The RE
/.+/
will perfectly match those strings.

It will also match a few other strings, quite a few actually, but as you
didn't specify any criteria for what strings not to match that should be ok.

jue
 
raksha34

Ok, a valid string is of the following form:

i) it must start with a letter
ii) after that it can contain any alphanumeric characters. It can end here,
but if not then rule iii) applies
iii) finally, it may optionally end in one of the following 4 forms:

[num]
<num>
{num}
(num)

*** num means any nonnegative integer.


thanks,

Rakesh




 
anno4000

hi all,

i have to match the following types of strings:

PTY
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9)

Here's my attempt at condensing the regular expression:

use strict;
use warnings;

my @Data = qw(
PTY
COUNT2
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9)
);

my %h = qw(
[ ]
{ }
( )
< >
);

my $pin_re = q/\A[a-zA-Z]\w*(?:([<[({])\d+$h{\1})?\z/;

Uh, no, that won't work. I'm not sure how it even compiles, but
that kind of match-time replacement only works on the replacement
side of an s///, not in a regex.

[...]
The reason for writing the regular expression in this format
was to avoid having to use a lot ORs.

but it doesnt work.

Can you suggest someway of fixing this?

Well, use the or's. You don't have to write them yourself. Using your
table %h from above:

my $paren_re = join '|' => map "\Q$_\E\\d+\Q$h{$_}\E" => keys %h;
my $pin_re = qr/\A[a-zA-Z]\w*(?:$paren_re)?\z/;

That should do what you want.
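For anyone who wants to try the generated-alternation idea end to end, here is a minimal sketch. It builds the same kind of pattern from %h, but uses quotemeta() instead of \Q...\E and sorts the keys so the result is reproducible. The helper name is_pin_name and the two negative test strings are made up for this illustration; they are not from the original post.

```perl
use strict;
use warnings;

# Opening bracket => closing bracket, as in the table from the question.
my %h = ( '[' => ']', '{' => '}', '(' => ')', '<' => '>' );

# One alternative per bracket pair. quotemeta() escapes characters that
# are special in a regex, which is the same job \Q...\E does in a string.
my $paren_re = join '|',
    map { quotemeta($_) . '\d+' . quotemeta( $h{$_} ) } sort keys %h;

my $pin_re = qr/\A[a-zA-Z]\w*(?:$paren_re)?\z/;

# Hypothetical helper, introduced only for this sketch.
sub is_pin_name {
    my ($name) = @_;
    return $name =~ $pin_re ? 1 : 0;
}

# The sample data from the question plus two strings that should fail:
# a mismatched bracket pair and a name starting with a digit.
my @samples = ( 'PTY', 'COUNT2', 'IN_B', 'IN[3]', 'ADD<2>', 'SUM{25}',
                'MULT(9)', 'IN[3>', '9LIVES' );

for my $var (@samples) {
    printf "%-8s %s\n", $var, is_pin_name($var) ? 'match' : 'NOmatch';
}
```

Adding another bracket pair then only means adding one entry to %h; the alternation is regenerated from the table.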

The alternative would be to use (??{ code }) insertions to provide
the closing counterpart, but ugh... I haven't tried this.

Anno
 
Michele Dondi

i) must start with an alphabet

[a-zA-Z] or [a-z] with /i
ii) then it can be any alphanumeric after that. it can end here, but
if not then rule iii) applies

"any" means zero or more? \w*
iii) and finally it may or may not end in the following 4 forms:

[num]
<num>
{num}
(num)

Simple enough IMHO to go with the "or":
(?:\[\d+\]|<\d+>|\{\d+\}|\(\d+\))?. I must say that I've spent some
time now trying to do the same thing with a hash approach, but it
seems to me that all such attempts end up longer than the plain
alternation. All in all I would go this way (/x added for clarity,
anchored so that the whole string has to match):

/\A
 [a-z]
 \w*
 (?:
   \[\d+\]
   |
   <\d+>
   |
   \{\d+\}
   |
   \(\d+\)
 )?
\z/ix


Michele
 
Paul Lalli

Ok, a valid string is of the following form:

i) must start with an alphabet

/^[a-zA-Z]/

ii) then it can be any alphanumeric after that. it can end here, but
if not then rule iii) applies
iii) and finally it may or may not end in the following 4 forms:

[num]
<num>
{num}
(num)

<...> (?:\[\d+\]|<\d+>|\{\d+\}|\(\d+\)) <...>


Put it all together:
/^                                    # beginning of string
 [a-zA-Z]                             # start with a letter
 \w*                                  # zero or more letters, digits or underscores
 (?:\[\d+\]|<\d+>|\{\d+\}|\(\d+\))?   # optionally your digits
$/x                                   # end of string

Paul Lalli

P.S. I'm not entirely certain that all of ] } and ) need to be
escaped, but they won't hurt.
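The P.S. can be checked directly rather than guessed at. Below is a small sketch that compiles a few pattern strings inside eval and reports which ones perl accepts. As far as I know, an unmatched ")" outside a character class is a compile-time error, while a stray "]" or "}" is taken literally, but run it on your own perl to confirm. The helper name compiles is made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: returns 1 if the pattern text compiles as a regex,
# 0 if perl rejects it. The eval traps the compile-time error.
sub compiles {
    my ($pattern) = @_;
    my $ok = eval { my $re = qr/$pattern/; 1 };
    return $ok ? 1 : 0;
}

# Unescaped closers, followed by the escaped ")" for comparison.
for my $pattern ( '\d+]', '\d+}', '\d+)', '\d+\)' ) {
    printf "%-6s %s\n", $pattern,
        compiles($pattern) ? 'compiles' : 'does not compile';
}
```

Escaping all three closers, as in the regex above, is harmless either way and keeps the pairs visually symmetric.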
 
Mirco Wahab

Ok, a valid string is of the following form:

i) must start with an alphabet
ii) then it can be any alphanumeric after that. it can end here, but
if not then rule iii) applies
iii) and finally it may or may not end in the following 4 forms:

[num]
<num>
{num}
(num)

*** num means any nonnegative integer.

Your approach wasn't that bad in the first place.
Please note that some of your replacement chars
might be special in regex context ==> the ')'.

The hash lookup needs to be wrapped in a
code assertion, like

...
my @Data = qw'
PTY
COUNT2
IN_B
IN[3]
ADD<2>
SUM{25}
MULT(9) ';

my %h = qw' [ ] { } ( \) < > ';

my $pin_re = qr/^[A-Za-z]\w*
(?:
( [<{[(] ) \d+
(??{"$h{$1}"})
)?
$/x;

for (@Data) {
print "$_ " . (/$pin_re/ ? 'OK' : 'NO') . " match\n"
}
...
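A possible variation on the same idea, offered only as a sketch: instead of storing a pre-escaped '\)' in the hash, keep the plain closing characters and call quotemeta() inside the (??{ ... }) block, so the table never has to know about regex syntax. The name prefix [A-Za-z]\w* is my reading of rules i) and ii), and the helper name is_pin_name is made up for this example.

```perl
use strict;
use warnings;

# Plain bracket pairs; nothing in the table is pre-escaped.
my %h = ( '[' => ']', '{' => '}', '(' => ')', '<' => '>' );

my $pin_re = qr/
    \A [A-Za-z] \w*                 # a letter, then any word characters
    (?:
        ( [<{\[(] ) \d+             # capture the opener, then the number
        (??{ quotemeta $h{$1} })    # look up and escape the matching closer
    )?
    \z
/x;

# Hypothetical helper, introduced only for this sketch.
sub is_pin_name {
    my ($name) = @_;
    return $name =~ $pin_re ? 1 : 0;
}

for my $name ( 'PTY', 'COUNT2', 'IN_B', 'IN[3]', 'ADD<2>', 'SUM{25}',
               'MULT(9)', 'MULT(9]' ) {
    print "$name ", ( is_pin_name($name) ? 'OK' : 'NO' ), " match\n";
}
```

The (??{ ... }) block is run at match time, so $1 already holds the opening bracket when the lookup happens.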

Regards

M.
 
raksha34

Hello all,

Thank you all for helping me out on this. I really appreciate
everybody's help!

Actually, what I wanted to do was what Mirco has given.
The (??{....}) concept is really great.

I wanted to avoid doing the ORing in the regular expressions as it
hurts the scalability.

Thanks once again everybody for the help.

Regards,
Rakesh.



 
