simple regex

r3gis

Paul Lalli

Hi
I am trying to extract all URLs ending with php, cgi, or pl from a
website with the following code:
foreach $judge( $res->content=~m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g)
{
print $judge,"\n";
}
But for some reason I get redundant results:

http://www.kanazawa-gu.ac.jp/~hayashiy/cgi-bin/log/env.cgi
cgi
http://www.bsnoop.de/cgi-bin/jenv.cgi
cgi

etc.

Could someone explain to me why the file extension is present in this
result set? What am I doing wrong?

You have multiple capturing parentheses in your pattern match. A
pattern match in list context (such as that imposed by the foreach
loop) returns a list of ALL captured parentheses.

Change the ones you don't want to capture to be noncapturing, by
adding a ?: right after the (
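
As a rough illustration of the difference, here is a minimal sketch that runs both forms of the pattern against a made-up string. The variable $content and the example.com URLs are assumptions standing in for $res->content, which is not shown in the original post.

```perl
use strict;
use warnings;

# Hypothetical sample text standing in for $res->content.
my $content = 'see http://example.com/a/env.cgi and http://example.com/b/index.php';

# Three capturing groups: in list context every group of every match
# is returned, so each URL appears twice, followed by its extension.
my @nested = $content =~ m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g;

# One capturing group: the alternation is only grouped, via (?:...),
# so just the full URL is returned for each match.
my @single = $content =~ m#(http://[a-z-\/\.~]+\.(?:php|cgi|pl))#g;

print scalar(@nested), " items from the nested pattern\n";   # 6 items
print scalar(@single), " items from the fixed pattern\n";    # 2 items
print "$_\n" for @single;
```

With the nested pattern the list holds six items for two URLs; with the non-capturing alternation it holds exactly the two URLs.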

See also:
perldoc perlre
perldoc perlretut
perldoc perlreref

Paul Lalli
 
Paul Lalli

Well Paul, I tested this on Windows and it works fine... So is it really
related to multiple parentheses?

You tested *what* on Windows? The code that r3gis posted, or the code
that you posted? The code that r3gis posted is incomplete, so I'd
like to see the actual program you used. The code that you posted has
nothing at all to do with the original problem.


Confused,
Paul Lalli
 
jeevs

You tested *what* on Windows? The code that r3gis posted, or the code
that you posted?

Sorry for my irrelevant post... I meant the code I posted, which I
accept was not related to the original problem at all, and I apologize
for taking up your and others' time with it.

r3gis, as Paul suggested, you can replace the following line in your
code

foreach $judge( $res->content=~m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#g)

with

foreach $judge( $res->content=~m!(http://[a-z-\/\.~]+\.(?:php|cgi|pl))!g)

Thanks Paul. I will be careful next time.
 
r3gis

I did not realize that capturing parentheses nested inside another
pair still capture on their own, independently of the outer group.

Thanks for the help. Everything is working now as it should :]
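
For anyone else surprised by nested groups, a small sketch of how they are numbered may help. The URL in $url is a made-up example, not one from the original page.

```perl
use strict;
use warnings;

# Hypothetical single URL used only for illustration.
my $url = 'http://example.com/cgi-bin/env.cgi';

# Without /g, a match in list context returns the captures ($1, $2, $3, ...).
# Groups are numbered by the position of their opening parenthesis,
# so nesting does not merge them.
my @groups = $url =~ m#((http://[a-z-\/\.~]+\.(php|cgi|pl)))#;

for my $i (0 .. $#groups) {
    printf "group %d: %s\n", $i + 1, $groups[$i];
}
```

Groups 1 and 2 both hold the whole URL because the two outer pairs of parentheses wrap the same text, and group 3 holds only the extension.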
 
