Using a variable size with the repetition quantifier

Discussion in 'Perl Misc' started by Philippe Aymer, Oct 7, 2004.

  1. Hi all,

    I'm looking for a Perl regex (if possible) that can use a
    repetition quantifier whose repeat count is not known until
    runtime.
    For example:

    X3xyz...

    The number 3 gives the number of repetitions for the following
    characters (the length of the string), something like:

    /X(\d)(\w{\1})/

    but \1 is not allowed inside the {} repetition quantifier.

    Is there a way to use {} with a repetition count that is only
    known from within the regex?

    Thanks,

    Phil.
     
    Philippe Aymer, Oct 7, 2004
    #1

  2. On 7 Oct 2004, Philippe Aymer wrote:

    >I'm looking at a PERL regex (if possible) that will be able to use a
    >repetition quantifier metachar, but the number of repetition is
    >unknown until runtime.
    >For example:
    >
    >X3xyz...
    >
    >the number 3 give me the number of "repetition" for the next chars
    >(length of string), something like:
    >
    >/X(\d)(\w{\1})/
    >
    >but \1 is not possible within {} the repetition quantifier.
    >
    >Is there a way to use {} with the repetition number only known from
    >the regex ?


    Not exactly. You have to do it by some other means. Here are two
    examples:

    $str =~ /X(\d)/g and $str =~ /\G(\w{$1})/

    and

    $str =~ /X(\d)((??{ "\\w{$1}" }))/
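
A minimal, self-contained sketch of the two approaches above, for anyone who wants to run them side by side. The sample string and the variable names $two_step and $one_step are assumptions made for illustration; they are not from the original post.

```perl
use strict;
use warnings;

my $str = "X3xyzA4abc";   # sample input, assumed for illustration

# Approach 1: two matches. The first match uses /g in scalar context,
# which leaves pos($str) just after the digit; \G anchors the second
# match at that position. $1 is interpolated into the second pattern
# before it is compiled, so it becomes \w{3}.
my $two_step;
if ($str =~ /X(\d)/g and $str =~ /\G(\w{$1})/) {
    $two_step = $1;
}

# Approach 2: one match. The (??{ ... }) block runs during matching and
# returns a string that is compiled and matched as a sub-pattern.
# "\\w" in the double-quoted string yields the two characters \w.
my $one_step;
if ($str =~ /X(\d)((??{ "\\w{$1}" }))/) {
    $one_step = $2;
}

print "two-step: $two_step\n";
print "one-step: $one_step\n";
```

Both variables should end up holding 'xyz'. The two-step form needs nothing beyond plain regex features; the one-step form keeps everything in a single pattern, which matters once more pattern follows the counted field.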

    --
    Jeff "japhy" Pinyan % How can we ever be the sold short or
    RPI Acacia Brother #734 % the cheated, we who for every service
    Senior Dean, Fall 2004 % have long ago been overpaid?
    RPI Corporation Secretary %
    http://japhy.perlmonk.org/ % -- Meister Eckhart
     
    Jeff 'japhy' Pinyan, Oct 7, 2004
    #2

  3. Philippe Aymer wrote:
    >
    > I'm looking at a PERL regex (if possible) that will be able to use a
    > repetition quantifier metachar, but the number of repetition is
    > unknown until runtime.


    In general, if you want a regex that adapts itself during its own
    execution, you want (??{}).

    > For example:
    >
    > X3xyz...
    >
    > the number 3 give me the number of "repetition" for the next chars
    > (length of string), something like:
    >
    > /X(\d)(\w{\1})/
    >
    > but \1 is not possible within {} the repetition quantifier.
    >
    > Is there a way to use {} with the repetition number only known from
    > the regex ?


    /X(\d)((??{"\\w{$1}"}))/
     
    Brian McCauley, Oct 8, 2004
    #3
  4. Jeff 'japhy' Pinyan <> wrote in message news:<>...
    > On 7 Oct 2004, Philippe Aymer wrote:
    >
    > >I'm looking at a PERL regex (if possible) that will be able to use a
    > >repetition quantifier metachar, but the number of repetition is
    > >unknown until runtime.
    > >For example:
    > >
    > >X3xyz...
    > >
    > >the number 3 give me the number of "repetition" for the next chars
    > >(length of string), something like:
    > >
    > >/X(\d)(\w{\1})/
    > >
    > >but \1 is not possible within {} the repetition quantifier.
    > >
    > >Is there a way to use {} with the repetition number only known from
    > >the regex ?

    >
    > Not exactly. You have to do it by some other means. Here are two
    > examples:
    >
    > $str =~ /X(\d)/g and $str =~ /\G(\w{$1})/
    >
    > and
    >
    > $str =~ /X(\d)((??{ "\\w{$1}" }))/


    This looks like a good solution if nothing follows in the
    string... But in my case, my regex is longer. I should have given
    this info before =(

    So for example:

    X3xyzA4abc....

    and only the number gives me the length of the string I want to
    grab.

    Thanks again,

    Phil.
     
    Philippe Aymer, Oct 8, 2004
    #4
  5. On 8 Oct 2004, Philippe Aymer wrote:

    >> $str =~ /X(\d)((??{ "\\w{$1}" }))/

    >
    >This looks like a good solution if there is nothing after in the
    >string... But in my case, my regex is longer. I should have give this
    >info before =(
    >
    >So for example:
    >
    >X3xyzA4abc....
    >
    >and only the number can give me the length of the string I want to
    >grab.


    I don't think you actually tried my solution, then.

    $str = "X3xyzA4abc";
    $str =~ /X(\d)((??{ "\\w{$1}" }))/
    and print "$1 -> '$2'\n";

    That prints: 3 -> 'xyz'

     
    Jeff 'japhy' Pinyan, Oct 8, 2004
    #5
  6. Philippe Aymer wrote:

    > Jeff 'japhy' Pinyan <> wrote in message news:<>...
    >
    >>On 7 Oct 2004, Philippe Aymer wrote:
    >>>
    >>>/X(\d)(\w{\1})/
    >>>
    >>>but \1 is not possible within {} the repetition quantifier.

    >>
    >> $str =~ /X(\d)/g and $str =~ /\G(\w{$1})/
    >>
    >> $str =~ /X(\d)((??{ "\\w{$1}" }))/

    >
    > This looks like a good solution if there is nothing after in the
    > string... But in my case, my regex is longer.


    Can you explain why you think this is a problem?
     
    Brian McCauley, Oct 8, 2004
    #6
  7. In article <>,
    Philippe Aymer <> wrote:
    >Jeff 'japhy' Pinyan <> wrote in message news:<>...
    >> On 7 Oct 2004, Philippe Aymer wrote:
    >>
    >> >I'm looking at a PERL regex (if possible) that will be able to use a
    >> >repetition quantifier metachar, but the number of repetition is
    >> >unknown until runtime.
    >> >For example:
    >> >
    >> >X3xyz...
    >> >
    >> >the number 3 give me the number of "repetition" for the next chars
    >> >(length of string), something like:
    >> >
    >> >/X(\d)(\w{\1})/
    >> >
    >> >but \1 is not possible within {} the repetition quantifier.
    >> >
    >> >Is there a way to use {} with the repetition number only known from
    >> >the regex ?

    >>
    >> Not exactly. You have to do it by some other means. Here are two
    >> examples:
    >>
    >> $str =~ /X(\d)/g and $str =~ /\G(\w{$1})/
    >>
    >> and
    >>
    >> $str =~ /X(\d)((??{ "\\w{$1}" }))/

    >
    >This looks like a good solution if there is nothing after in the
    >string... But in my case, my regex is longer. I should have give this
    >info before =(
    >
    >So for example:
    >
    >X3xyzA4abc....
    >
    >and only the number can give me the length of the string I want to
    >grab.
    >


    If you're trying to grab 'em all, maybe:

    $str="X3XyzA4abcd....";

    print "$2\n" while $str =~/\D*(\d)((??{ "\\w{$1}" }))/g;
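
The same loop can collect the fields instead of printing them, which may be handier in a real script. This is only a sketch: the array name @fields and the [length, text] pair layout are assumptions for illustration.

```perl
use strict;
use warnings;

my $str = "X3XyzA4abcd....";

# Walk the string with /g in scalar context. Each iteration skips any
# non-digits, reads the one-digit length, then captures exactly that
# many \w characters via the runtime sub-pattern.
my @fields;
while ($str =~ /\D*(\d)((??{ "\\w{$1}" }))/g) {
    push @fields, [ $1, $2 ];   # [declared length, captured text]
}

print "$_->[0] -> '$_->[1]'\n" for @fields;
```

Note that \D* silently skips whatever separates two fields, so malformed input is passed over rather than reported. If the tags (X, A, ...) matter, they would need to be matched explicitly.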


    --
    Charles DeRykus
     
    Charles DeRykus, Oct 8, 2004
    #7
  8. Great guys! Thank you!

    I was sure Perl could do it. I was aware of (??{}), but only for
    "simple" patterns; I didn't know about building the pattern in a
    '"' string, which can be useful for more complex regexes.

    Now I still have a problem:

    /X(\d)((??{"\\w{$1}"}))/

    works, but in my string I also have to match newlines. So I tried:

    /X(\d)(??{"\\w{$1}"})/s

    which doesn't work (/s seems to apply only to the //, not to
    things within (?..)), then:

    /X(\d)(??{"[\\w\n]{$1}"})/

    which doesn't work either... (?)

    Any ideas?

    Thanks again for your response, quick and clean!

    Phil.

     
    Philippe Aymer, Oct 12, 2004
    #8
  9. On 12 Oct 2004, Philippe Aymer wrote:

    >Now, I still have a trouble. Because:
    >
    >/X(\d)((??{"\\w{$1}"}))/
    >
    >works, but in my string, I also have to match newline. So I did:
    >
    >/X(\d)(??{"\\w{$1}"})/s
    >
    >which doesn't work (seems to apply only to //, not things within
    >(?..)), then:


    The /s modifier only affects the '.' metacharacter. \w doesn't match \n.
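
A quick way to see this for yourself; the test string and variable names below are assumptions chosen for illustration.

```perl
use strict;
use warnings;

my $text = "a\nb";   # one newline between two word characters

# '.' does not match "\n" unless /s is in effect.
my $dot_plain = ($text =~ /a.b/)  ? 1 : 0;
my $dot_s     = ($text =~ /a.b/s) ? 1 : 0;

# \w never matches "\n", whether or not /s is given.
my $w_s       = ($text =~ /a\wb/s) ? 1 : 0;

# An explicit character class can list "\n" alongside \w.
my $class     = ($text =~ /a[\w\n]b/) ? 1 : 0;

print "dot without /s: $dot_plain\n";
print "dot with /s:    $dot_s\n";
print "\\w with /s:     $w_s\n";
print "[\\w\\n] class:   $class\n";
```

Only the second and fourth matches succeed, which is why adding /s to a pattern built around \w changes nothing.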

    >/X(\d)(??{"[\\w\n]{$1}"})/
    >
    >which doesn't work neither... (?)


    This should work:

    /X(\d)((??{ "[\\w\\n]{$1}" }))/
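
It can help to look at the string the code block actually hands back to the regex engine. The sketch below builds the same string outside a regex; $n stands in for $1, and the names $pattern_text and $matched are assumptions for illustration.

```perl
use strict;
use warnings;

my $n = 3;   # stands in for $1 inside the (??{ }) block

# In a double-quoted string, "\\w" and "\\n" each become a backslash
# followed by a letter, and {$n} becomes {3}.
my $pattern_text = "[\\w\\n]{$n}";
print "$pattern_text\n";   # prints [\w\n]{3}

# Using that text as a pattern: three characters, each a word
# character or a newline.
my $matched = ("ab\ncd" =~ /^($pattern_text)/) ? $1 : undef;
print defined $matched ? length($matched) : 0, "\n";   # prints 3
```

The outer parentheses around (??{ ... }) in the pattern above are what make the counted text available as $2; without them the sub-pattern still has to match but captures nothing.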

     
    Jeff 'japhy' Pinyan, Oct 12, 2004
    #9
  10. Jeff 'japhy' Pinyan <> wrote in message news:<>...
    > On 12 Oct 2004, Philippe Aymer wrote:
    >
    > >Now, I still have a trouble. Because:
    > >
    > >/X(\d)((??{"\\w{$1}"}))/
    > >
    > >works, but in my string, I also have to match newline. So I did:
    > >
    > >/X(\d)(??{"\\w{$1}"})/s
    > >
    > >which doesn't work (seems to apply only to //, not things within
    > >(?..)), then:

    >
    > The /s modifier only affects the '.' metacharacter. \w doesn't match \n.


    Oops... I should have written:

    /X(\d)(??{".{$1}"})/s

    That's what I'm using ("xyz" in my example could be anything, even
    non-printable characters).

    > >/X(\d)(??{"[\\w\n]{$1}"})/
    > >
    > >which doesn't work neither... (?)

    >
    > This should work:
    >
    > /X(\d)((??{ "[\\w\\n]{$1}" }))/


    OK, I'm having trouble with my fingers... I'm using ".\\n", but no,
    it's not working.

    So I tried this program:

    my $string = "DA3xyzB4ab\nc";

    print "==>$string<==" . "\n\n";

    if ($string =~ /
        D
        (
            A
            (\d)
            (?{ print "===>$2<===\n"; })
            ( (??{ "[\\w\\n]{$2}" }) )
            (?{ print "===>$3<===\n"; })
        )
        (
            B
            (\d)
            (?{ print "===>$5<===\n"; })
            ( (??{ "[\\w\\n]{$5}" }) )
            (?{ print "===>$6<===\n"; })
        )
    /xs) {
        print "\n";
        print "DATA : =>$1<= " . length($1) . "\n";
        print "DATA : =>$4<= " . length($4) . "\n";
    }

    The second pattern, "[.\\n]{$5}", doesn't work... If I replace "."
    with "\\w" for this example it works, but I need to match "."
    (everything), not "\w".

    Thanks again!

    Phil.
     
    Philippe Aymer, Oct 13, 2004
    #10
  11. Jeff 'japhy' Pinyan <> wrote in message news:<>...
    > On 12 Oct 2004, Philippe Aymer wrote:
    >
    > >Now, I still have a trouble. Because:
    > >
    > >/X(\d)((??{"\\w{$1}"}))/
    > >
    > >works, but in my string, I also have to match newline. So I did:
    > >
    > >/X(\d)(??{"\\w{$1}"})/s
    > >
    > >which doesn't work (seems to apply only to //, not things within
    > >(?..)), then:

    >
    > The /s modifier only affects the '.' metacharacter. \w doesn't match \n.
    >
    > >/X(\d)(??{"[\\w\n]{$1}"})/
    > >
    > >which doesn't work neither... (?)

    >
    > This should work:
    >
    > /X(\d)((??{ "[\\w\\n]{$1}" }))/


    By the way, while writing my question I found one solution (is
    there another one? TIMTOWTDI):

    ([^\\n]|\\n){$1}

    It works!

    Regards,

    Phil.
     
    Philippe Aymer, Oct 13, 2004
    #11
  12. Quoth (Philippe Aymer):
    > Jeff 'japhy' Pinyan <> wrote in message news:<>...
    > > On 12 Oct 2004, Philippe Aymer wrote:
    > >
    > > >Now, I still have a trouble. Because:
    > > >
    > > >/X(\d)((??{"\\w{$1}"}))/
    > > >
    > > >works, but in my string, I also have to match newline. So I did:
    > > >
    > > >/X(\d)(??{"\\w{$1}"})/s
    > > >
    > > >which doesn't work (seems to apply only to //, not things within
    > > >(?..)), then:

    > >
    > > The /s modifier only affects the '.' metacharacter. \w doesn't match \n.

    >
    > oups... I should have written:
    >
    > /X(\d)(??{".{$1}"})/s
    >
    > that's what I'm using ("xyz" in my example coule be anything, even non
    > printable char).


    Maybe /s doesn't correctly propagate into (regex)-runtime-interpolated
    strings (this is probably a bug in the regex engine, if it's true); try

    /X(\d)(??{"(?s).{$1}"})/s

    > > >/X(\d)(??{"[\\w\n]{$1}"})/
    > > >
    > > >which doesn't work neither... (?)

    > >
    > > This should work:
    > >
    > > /X(\d)((??{ "[\\w\\n]{$1}" }))/

    >
    > ok, I have trouble with my fingers... I'm using ".\\n", but no it's
    > not working.


    CUT AND PASTE CODE. NEVER RETYPE IT.

    > So I try this program:
    >
    > my $string = "DA3xyzB4ab\nc";
    >
    > print "==>$string<==" . "\n\n";
    >
    > if ($string =~ /
    > D
    > (
    > A
    > (\d)
    > (?{ print "===>$2<===\n"; })
    > ( (??{ "[\\w\\n]{$2}" }) )


    Again you have \w... please say what you mean.

    > (?{ print "===>$3<===\n"; })
    > )
    > (
    > B
    > (\d)
    > (?{ print "===>$5<===\n"; })
    > ( (??{ "[\\w\\n]{$5}" }) )
    > (?{ print "===>$6<===\n"; })
    > )
    > /xs) {
    > print "\n";
    > print "DATA : =>$1<= " . length($1) . "\n";
    > print "DATA : =>$4<= " . length($4) . "\n";
    > }
    >
    > The second pattern : "[.\\n]{$5}" doesn't work...


    What do you mean, it doesn't work? . is not a metachar inside character
    classes, so this matches $5 occurrences of "." or "\n". You want

    "(?:.|\\n){$5}"

    or use (?s) as above.
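
A small sketch comparing the three sub-patterns on a field that contains a newline. The sample string and the variable names are assumptions for illustration.

```perl
use strict;
use warnings;

my $data = "B4ab\nc";   # the digit says 4; the 4 characters include "\n"

# Inside [...], '.' is a literal dot, so this needs four characters
# that are each a dot or a newline. It does not match here.
my $in_class = ($data =~ /B(\d)((??{ "[.\\n]{$1}" }))/) ? $2 : undef;

# Alternation: any non-newline character, or a newline.
my $alt      = ($data =~ /B(\d)((??{ "(?:.|\\n){$1}" }))/) ? $2 : undef;

# Inline (?s) placed inside the returned string, so '.' matches "\n"
# regardless of the flags on the outer pattern.
my $inline_s = ($data =~ /B(\d)((??{ "(?s).{$1}" }))/) ? $2 : undef;

for my $pair ([ 'in class', $in_class ], [ 'alternation', $alt ],
              [ 'inline (?s)', $inline_s ]) {
    my ($label, $got) = @$pair;
    print "$label: ", defined $got ? length($got) . " chars" : "no match", "\n";
}
```

Putting (?s) inside the returned string makes the sub-pattern carry its own flag, so it does not depend on how the outer pattern's modifiers are handled.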

    Ben

    --
    I've seen things you people wouldn't believe: attack ships on fire off
    the shoulder of Orion; I watched C-beams glitter in the dark near the
    Tannhauser Gate. All these moments will be lost, in time, like tears in rain.
    Time to die.
     
    Ben Morrow, Oct 13, 2004
    #12
  13. Hi guys,

    My mistake: I didn't know that "." is not a metacharacter inside a
    character class (i.e. "[]").

    And yes, it seems that /s doesn't correctly propagate into
    (regex)-runtime-interpolated strings as in:
    /X(\d)(??{".{$1}"})/s
    I don't know if this is by design or a bug.

    Thanks again for your help. It was much appreciated!

    Phil.

     
    Philippe Aymer, Oct 14, 2004
    #13