Using a variable size with the repetition quantifier


Philippe Aymer

Hi all,

I'm looking for a Perl regex (if possible) that can use a
repetition quantifier whose repeat count is unknown until
runtime.
For example:

X3xyz...

the number 3 gives me the number of "repetitions" for the following
characters (the length of the string), something like:

/X(\d)(\w{\1})/

but \1 is not allowed inside the {} repetition quantifier.

Is there a way to use {} with a repeat count that is only known from
the match itself?

Thanks,

Phil.
 

Jeff 'japhy' Pinyan

Not exactly. You have to do it by some other means. Here are two
examples:

$str =~ /X(\d)/g and $str =~ /\G(\w{$1})/

and

$str =~ /X(\d)((??{ "\\w{$1}" }))/

--
Jeff "japhy" Pinyan % How can we ever be the sold short or
RPI Acacia Brother #734 % the cheated, we who for every service
Senior Dean, Fall 2004 % have long ago been overpaid?
RPI Corporation Secretary %
http://japhy.perlmonk.org/ % -- Meister Eckhart
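
For anyone who wants to try both suggestions side by side, here is a minimal, self-contained sketch. The sub names grab_two_step and grab_postponed and the sample string are made up for illustration; the two patterns are the ones from the post above, and nothing here is meant as the definitive way to do it.

```perl
use strict;
use warnings;

# Approach 1: match the count first, then continue from that position with \G.
sub grab_two_step {
    my ($str) = @_;    # private copy, so pos() starts out unset
    if ($str =~ /X(\d)/g and $str =~ /\G(\w{$1})/) {
        return $1;     # after the second match, $1 holds the counted characters
    }
    return;
}

# Approach 2: build the quantified sub-pattern during the match with (??{ }).
sub grab_postponed {
    my ($str) = @_;
    if ($str =~ /X(\d)((??{ "\\w{$1}" }))/) {
        return $2;     # $1 is the count, $2 the counted characters
    }
    return;
}

my $sample = "X3xyzA4abc";
print "two-step:  ", grab_two_step($sample),  "\n";
print "postponed: ", grab_postponed($sample), "\n";
```

With this sample string both subs are expected to return xyz. The two-step form cannot be embedded in a longer pattern, while the (??{ }) form can.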
 

Brian McCauley

Philippe said:
I'm looking for a Perl regex (if possible) that can use a
repetition quantifier whose repeat count is unknown until
runtime.

In general if you want a regex that adapts itself during its own
execution you want (??{}).

/X(\d)((??{"\\w{$1}"}))/
 

Philippe Aymer

Jeff 'japhy' Pinyan said:
Not exactly. You have to do it by some other means. Here are two
examples:

$str =~ /X(\d)/g and $str =~ /\G(\w{$1})/

and

$str =~ /X(\d)((??{ "\\w{$1}" }))/

This looks like a good solution if nothing follows in the
string... But in my case, my regex is longer. I should have given this
information earlier =(

So for example:

X3xyzA4abc....

and only the number can give me the length of the string I want to
grab.

Thanks again,

Phil.
 

Jeff 'japhy' Pinyan

I don't think you actually tried my solution, then.

$str = "X3xyzA4abc";
$str =~ /X(\d)((??{ "\\w{$1}" }))/
and print "$1 -> '$2'\n";

That prints: 3 -> 'xyz'

 

Brian McCauley

Philippe said:
This looks like a good solution if there is nothing after in the
string... But in my case, my regex is longer.

Can you explain why you think this is a problem?
 

Charles DeRykus

If you're trying to grab 'em all, maybe:

$str="X3XyzA4abcd....";

print "$2\n" while $str =~/\D*(\d)((??{ "\\w{$1}" }))/g;
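
The loop above can also be written so that it collects each tag together with its payload. This is only a sketch: the sub name grab_all is hypothetical, and it assumes every field starts with a single uppercase letter followed by a one-digit length, which the thread does not actually state.

```perl
use strict;
use warnings;

# Collect [tag, payload] pairs from a string such as "X3xyzA4abcd".
# Assumption: a tag is one uppercase letter and the length is one digit.
sub grab_all {
    my ($str) = @_;
    my @fields;
    # $2 is the length digit; (??{ }) turns it into \w{N} while matching.
    while ($str =~ /([A-Z])(\d)((??{ "\\w{$2}" }))/g) {
        push @fields, [$1, $3];
    }
    return @fields;
}

for my $field (grab_all("X3xyzA4abcd")) {
    my ($tag, $payload) = @$field;
    print "$tag -> '$payload'\n";
}
```

Because of /g, each iteration resumes where the previous match ended, so a payload that happens to contain a letter and a digit is not re-scanned as a new field.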
 

Philippe Aymer

Great guys! Thank you!

I was sure Perl could do it. I was aware of (??{}), but only for
"simple" patterns; I didn't know about returning a double-quoted
string from it, which will be useful for more complex regexes.

Now, I still have a problem. This:

/X(\d)((??{"\\w{$1}"}))/

works, but in my string I also have to match newlines. So I tried:

/X(\d)(??{"\\w{$1}"})/s

which doesn't work (/s seems to apply only to //, not to things within
(?..)), then:

/X(\d)(??{"[\\w\n]{$1}"})/

which doesn't work either... (?)

Any ideas?

Thanks again for your response, quick and clean!

Phil.
 

Jeff 'japhy' Pinyan

Philippe Aymer said:
/X(\d)(??{"\\w{$1}"})/s

which doesn't work (seems to apply only to //, not things within
(?..)), then:

The /s modifier only affects the '.' metacharacter. \w doesn't match \n.

Philippe Aymer said:
/X(\d)(??{"[\\w\n]{$1}"})/

which doesn't work either... (?)

This should work:

/X(\d)((??{ "[\\w\\n]{$1}" }))/

 

Philippe Aymer

Jeff 'japhy' Pinyan said:
The /s modifier only affects the '.' metacharacter. \w doesn't match \n.

Oops... I should have written:

/X(\d)(??{".{$1}"})/s

That's what I'm using ("xyz" in my example could be anything, even
non-printable characters).

Jeff 'japhy' Pinyan said:
This should work:

/X(\d)((??{ "[\\w\\n]{$1}" }))/

OK, I have trouble with my fingers... I'm using ".\\n", but no, it's
not working.

So I tried this program:

my $string = "DA3xyzB4ab\nc";

print "==>$string<==" . "\n\n";

if ($string =~ /
    D
    (
        A
        (\d)
        (?{ print "===>$2<===\n"; })
        ( (??{ "[\\w\\n]{$2}" }) )
        (?{ print "===>$3<===\n"; })
    )
    (
        B
        (\d)
        (?{ print "===>$5<===\n"; })
        ( (??{ "[\\w\\n]{$5}" }) )
        (?{ print "===>$6<===\n"; })
    )
/xs) {
    print "\n";
    print "DATA : =>$1<= " . length($1) . "\n";
    print "DATA : =>$4<= " . length($4) . "\n";
}

The second pattern, "[.\\n]{$5}", doesn't work... If I replace "." with
"\\w" it works for this example, but I need to match "." (anything),
not "\w".

Thanks again!

Phil.
 

Philippe Aymer

Jeff 'japhy' Pinyan said:
This should work:

/X(\d)((??{ "[\\w\\n]{$1}" }))/

By the way, while writing my question, I found one solution (is there
another one? TIMTOWTDI):

([^\\n]|\\n){$1}

it works!

Regards,

Phil.
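
On the TIMTOWTDI question: one alternative that avoids (??{}) entirely is to let the regex read only the tag and the length, and to take the payload with substr. This is a rough sketch under stated assumptions, not something proposed in the thread: the sub name parse_fields is hypothetical, and it assumes single uppercase tag letters, one-digit lengths, and fields that follow each other without gaps.

```perl
use strict;
use warnings;

# Parse "A3xyzB4ab\nc" into [tag, payload] pairs without (??{ }).
# Assumptions: one uppercase tag letter, one length digit, no gaps between fields.
sub parse_fields {
    my ($str) = @_;
    my @fields;
    pos($str) = 0;
    # \G anchors each attempt at the current pos(); /c keeps pos() on failure.
    while ($str =~ /\G([A-Z])(\d)/gc) {
        my ($tag, $len) = ($1, $2);
        my $start = pos($str);
        last if $start + $len > length $str;    # truncated payload: stop
        # substr takes any characters, newlines and non-printables included.
        push @fields, [$tag, substr($str, $start, $len)];
        pos($str) = $start + $len;              # resume right after the payload
    }
    return @fields;
}

for my $field (parse_fields("A3xyzB4ab\nc")) {
    printf "%s: %d characters\n", $field->[0], length $field->[1];
}
```

Since substr does not care what the characters are, the newline question does not come up in this variant.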
 

Ben Morrow

Quoth (e-mail address removed) (Philippe Aymer):
Oops... I should have written:

/X(\d)(??{".{$1}"})/s

That's what I'm using ("xyz" in my example could be anything, even
non-printable characters).

Maybe /s doesn't correctly propagate into (regex)-runtime-interpolated
strings (this is probably a bug in the regex engine, if it's true); try

/X(\d)(??{"(?s).{$1}"})/s

Philippe Aymer said:
OK, I have trouble with my fingers... I'm using ".\\n", but no, it's
not working.

CUT AND PASTE CODE. NEVER RETYPE IT.

Philippe Aymer said:
So I tried this program:

my $string = "DA3xyzB4ab\nc";

print "==>$string<==" . "\n\n";

if ($string =~ /
    D
    (
        A
        (\d)
        (?{ print "===>$2<===\n"; })
        ( (??{ "[\\w\\n]{$2}" }) )

Again you have \w... please say what you mean.

        (?{ print "===>$3<===\n"; })
    )
    (
        B
        (\d)
        (?{ print "===>$5<===\n"; })
        ( (??{ "[\\w\\n]{$5}" }) )
        (?{ print "===>$6<===\n"; })
    )
/xs) {
    print "\n";
    print "DATA : =>$1<= " . length($1) . "\n";
    print "DATA : =>$4<= " . length($4) . "\n";
}

Philippe Aymer said:
The second pattern : "[.\\n]{$5}" doesn't work...

What do you mean, it doesn't work? . is not a metachar inside character
classes, so this matches $5 occurrences of "." or "\n". You want

"(?:.|\\n){$5}"

or use (?s) as above.

Ben
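
To make the difference concrete, here is a small sketch that runs the three variants discussed above against the sample string from the test program. The sub names are hypothetical and only the B field is examined; the patterns themselves are the ones quoted in this post.

```perl
use strict;
use warnings;

# Variant 1: switch on single-line mode inside the returned sub-pattern.
sub with_inline_s {
    my ($str) = @_;
    return $str =~ /B(\d)((??{ "(?s).{$1}" }))/ ? $2 : undef;
}

# Variant 2: "any character or a newline", written as an alternation.
sub with_alternation {
    my ($str) = @_;
    return $str =~ /B(\d)((??{ "(?:.|\\n){$1}" }))/ ? $2 : undef;
}

# Variant 3: inside [...] the dot is literal, so this only accepts "." and "\n".
sub with_dot_class {
    my ($str) = @_;
    return $str =~ /B(\d)((??{ "[.\\n]{$1}" }))/ ? $2 : undef;
}

my $string = "DA3xyzB4ab\nc";
for my $try (\&with_inline_s, \&with_alternation, \&with_dot_class) {
    my $got = $try->($string);
    print defined $got ? "matched " . length($got) . " characters\n"
                       : "no match\n";
}
```

The first two variants should capture the four characters after B4, newline included; the character-class variant should fail on this input because "a" is neither a literal dot nor a newline.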
 

Philippe Aymer

Hi guys,

My mistake: I didn't know the "." metachar was not available inside a
character class (i.e. "[]").

And yes, it seems that /s doesn't propagate into
(regex)-runtime-interpolated strings, as in:

/X(\d)(??{".{$1}"})/s

I don't know whether this is by design or a bug.

Thanks again for your help. It was much appreciated!

Phil.

 
