simple regex

Discussion in 'Perl Misc' started by r3gis, Jun 5, 2007.

  1. r3gis

    r3gis Guest

    r3gis, Jun 5, 2007
    #1

  2. r3gis

    Paul Lalli Guest

    On Jun 5, 12:33 pm, r3gis <> wrote:
    > Hi
    > I am trying to extract all URLs ending with php or cgi or pl from one
    > website with the following code:
    > foreach $judge( $res->content=~m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g)
    > {
    > print $judge,"\n";
    > }
    > But for some reason I get redundant results:
    >
    > http://www.kanazawa-gu.ac.jp/~hayashiy/cgi-bin/log/env.cgi
    > cgihttp://www.bsnoop.de/cgi-bin/jenv.cgi
    > cgi
    >
    > etc.
    >
    > Could someone explain to me why the file extension is present in this
    > result set? What am I doing wrong?


    You have multiple capturing parentheses in your pattern match. A
    pattern match with /g in list context (such as that imposed by the
    foreach loop) returns a list of the text captured by ALL the capturing
    parentheses, for every match.

    Change the ones you don't want to capture to be noncapturing, by
    adding ?: right after the opening (.
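
    A minimal sketch of the difference, assuming a hard-coded sample
    string ($content is a hypothetical stand-in for $res->content) and
    assuming the character class in the original pattern was followed by
    a + quantifier:

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for $res->content.
my $content = 'see http://www.bsnoop.de/cgi-bin/jenv.cgi and '
            . 'http://www.kanazawa-gu.ac.jp/~hayashiy/cgi-bin/log/env.cgi';

# Three capturing groups: each match contributes three list elements
# (the URL twice, then the extension).
my @with_captures = $content =~ m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g;

# One capturing group plus a non-capturing (?:...) group:
# each match contributes exactly one element, the URL.
my @urls_only = $content =~ m#(http://[a-z-\/\.~]+\.(?:php|cgi|pl))#g;

print scalar(@with_captures), " elements with nested capturing groups\n";
print scalar(@urls_only), " elements with a single capturing group\n";
print "$_\n" for @urls_only;
```

    With two URLs in the sample text, the first list holds six elements
    and the second holds two.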

    See also:
    perldoc perlre
    perldoc perlretut
    perldoc perlreref

    Paul Lalli
     
    Paul Lalli, Jun 5, 2007
    #2

  3. r3gis

    jeevs Guest

    Well Paul, I tested this on Windows and it works fine... So is it
    really related to multiple parentheses?
    I think the input has to be checked. But r3gis, please follow
    Paul's advice, as I may be wrong, being a newbie.

    #!/usr/bin/perl
    use strict;
    use warnings;

    # Sample URLs plus one string that should not match.
    my @arr = ('http://www.kanazawa-gu.ac.jp/~hayashiy/cgi-bin/log/env.cgi',
               'http://www.bsnoop.de/cgi-bin/jenv.cgi', 'asdadada');
    foreach (@arr) {
        # Scalar context: this only tests whether the pattern matches.
        if ($_ =~ m!((http://[a-z-\/\.~]+\.(php|cgi|pl)))!) {
            print "$_\n";
        }
    }
     
    jeevs, Jun 6, 2007
    #3
  4. r3gis

    Paul Lalli Guest

    On Jun 6, 2:11 am, jeevs <> wrote:
    > Well Paul, I tested this on Windows and it works fine... So is it
    > really related to multiple parentheses?


    You tested *what* on Windows? The code that r3gis posted, or the code
    that you posted? The code that r3gis posted is incomplete, so I'd
    like to see the actual program you used. The code that you posted has
    nothing at all to do with the original problem.


    Confused,
    Paul Lalli
     
    Paul Lalli, Jun 6, 2007
    #4
  5. r3gis

    jeevs Guest

    On Jun 6, 3:38 pm, Paul Lalli <> wrote:
    > On Jun 6, 2:11 am, jeevs <> wrote:
    >
    > > Well Paul, I tested this on Windows and it works fine... So is it
    > > really related to multiple parentheses?


    > You tested *what* on Windows? The code that r3gis posted, or the code
    > that you posted?


    Sorry for my irrelevant post... I meant the code posted by me, which I
    accept was not at all related to the original problem, and I apologize
    for taking up your and others' time.

    r3gis, as suggested by Paul, you can replace the following line in your
    code

    foreach $judge( $res->content=~m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g)

    by

    foreach $judge( $res->content=~m!(http://[a-z-\/\.~]+\.(?:php|cgi|pl))!g)

    Thanks Paul. I will be careful next time.
     
    jeevs, Jun 6, 2007
    #5
  6. r3gis

    r3gis Guest

    I did not realize that capturing parentheses nested inside another
    pair still capture on their own, independently of the whole regex.
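
    A small sketch of how nested groups are numbered, using one of the
    URLs from this thread as sample input (the variable names are made
    up for illustration, and the + quantifier is assumed). Groups are
    numbered by the position of their opening parenthesis, left to
    right, whether or not they are nested:

```perl
use strict;
use warnings;

my $url = 'http://www.bsnoop.de/cgi-bin/jenv.cgi';

# Without /g, a match in list context returns the captures in group
# order: group 1 (outermost), group 2 (nested), group 3 (innermost).
my ($outer, $inner, $ext) =
    $url =~ m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#;

print "group 1: $outer\n";   # the whole URL
print "group 2: $inner\n";   # the same text, captured again
print "group 3: $ext\n";     # the extension only
```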

    Thanks for the help. Everything is working now as it should :]
     
    r3gis, Jun 16, 2007
    #6
  7. On Sat, 16 Jun 2007 09:02:16 -0000, r3gis <> wrote:

    >I did not realize that capturing parentheses nested inside another
    >pair still capture on their own, independently of the whole regex.


    Yep: <http://perlmonks.org/?node_id=442322> ;-)


    Michele
    --
    {$_=pack'B8'x25,unpack'A8'x32,$a^=sub{pop^pop}->(map substr
    (($a||=join'',map--$|x$_,(unpack'w',unpack'u','G^<R<Y]*YB='
    ..'KYU;*EVH[.FHF2W+#"\Z*5TI/ER<Z`S(G.DZZ9OX0Z')=~/./g)x2,$_,
    256),7,249);s/[^\w,]/ /g;$ \=/^J/?$/:"\r";print,redo}#JAPH,
     
    Michele Dondi, Jun 16, 2007
    #7