Regular Expression

zowtar

<h1><a href="url" >bla bla bla</a></h1>
<h1><a href="url" >bla bla bla </a></h1>
<h1><a href="url" >bla bla bla <img
src=http://site.com/img.gif></a></h1>

^.*<h1><a[^>]+>(.+)(?:\s)?(?:<img .*>)?</a></h1>.*$

Why doesn't this pattern give me just "bla bla bla"? Sometimes it returns
"bla bla bla " (with a trailing space)...
 
Paul Lalli

zowtar said:
<h1><a href="url" >bla bla bla</a></h1>
<h1><a href="url" >bla bla bla </a></h1>
<h1><a href="url" >bla bla bla <img
src=http://site.com/img.gif></a></h1>

^.*<h1><a[^>]+>(.+)(?:\s)?(?:<img .*>)?</a></h1>.*$

Why the code

There is no code there. There are fragments of HTML, and something
that could conceivably be a regular expression. You have not shown how
you are matching this pattern against this HTML. Please post a SHORT
but COMPLETE program that demonstrates your errors.

don't work for give me the "bla bla bla"?

I have no idea what this means...

sometimes return "bla bla bla "...

... but this seems to contradict it.

Please post, at minimum:
1) a short-but-complete script that demonstrates your error
2) sample data
3) desired output
4) actual output.

Have you read the posting guidelines that are posted here twice a week?
Please do so before posting again.

Paul Lalli
 
Josef Moellers

zowtar said:
<h1><a href="url" >bla bla bla</a></h1>
<h1><a href="url" >bla bla bla </a></h1>
<h1><a href="url" >bla bla bla <img
src=http://site.com/img.gif></a></h1>

^.*<h1><a[^>]+>(.+)(?:\s)?(?:<img .*>)?</a></h1>.*$

Why doesn't this pattern give me just "bla bla bla"? Sometimes it returns
"bla bla bla " (with a trailing space)...

Perl's quantifiers are greedy by default, so the (.+) will match the blank at
the end, while the (?:\s)? will happily match the nothing that follows.

Make the (.+) non-greedy: (.+?).
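
The difference can be reproduced in a few lines. This is only a sketch: it uses Python's re module instead of Perl (the constructs in this pattern behave the same way in both), the names SAMPLES, GREEDY, LAZY and capture are made up for the example, and the third sample is assumed to sit on a single line, treating the line break in the post as a wrapping artifact.

```python
import re

# Sample lines from the question. Assumption: the third one is a single line.
SAMPLES = [
    '<h1><a href="url" >bla bla bla</a></h1>',
    '<h1><a href="url" >bla bla bla </a></h1>',
    '<h1><a href="url" >bla bla bla <img src=http://site.com/img.gif></a></h1>',
]

# Original pattern: (.+) is greedy and takes as much as it can.
GREEDY = re.compile(r'^.*<h1><a[^>]+>(.+)(?:\s)?(?:<img .*>)?</a></h1>.*$')

# Same pattern with a non-greedy capture: (.+?) takes as little as it can,
# so the optional \s and <img ...> groups get a chance to match.
LAZY = re.compile(r'^.*<h1><a[^>]+>(.+?)(?:\s)?(?:<img .*>)?</a></h1>.*$')


def capture(pattern, line):
    """Return the first capture group, or None if the line does not match."""
    m = pattern.match(line)
    return m.group(1) if m else None


greedy_results = [capture(GREEDY, s) for s in SAMPLES]
lazy_results = [capture(LAZY, s) for s in SAMPLES]

for sample, g, l in zip(SAMPLES, greedy_results, lazy_results):
    print(repr(g), "->", repr(l))
```

With the greedy version the second sample keeps its trailing blank and the third one swallows the whole <img> tag; with (.+?) all three samples yield "bla bla bla".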
 
zowtar

I tried using non-greedy quantifiers in another case, to extract the URL,
but it doesn't work...


MATCH:
href="http://site.com/0,,NEWS39104-EI8090,00.html"
href="javascript:abre('http://site.com/0,,NEWS39104-EI8090,00.html','Gallery39104','660','500','no');"


CODE:
href="(?:.*)?((?:ftp|http|https)://(?:[^:/]+)(?::[0-9]{1,5})?(?:/.*)?.+)"
return
OK - http://site.com/0,,NEWS39104-EI8090,00.html
ERROR - http://site.com/0,,NEWS39104-EI8090,00.html','Gallery39104','660','500','no');

href="(?:.*)?((?:ftp|http|https)://(?:[^:/]+)(?::[0-9]{1,5})?(?:/.*)?.+?)(?:\',\'.*\',\'.*\',\'.*\',\'.*\'\);)?"
return
ERROR - None
 
John Bokma

zowtar said:
I tried using non-greedy quantifiers in another case, to extract the URL,
but it doesn't work...

You're using the wrong tool for the job. Have a look at HTML::TreeBuilder.
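
To illustrate the parser-first idea, here is a rough sketch. HTML::TreeBuilder is a Perl module; this example swaps in Python's standard html.parser as a stand-in, which is an assumption on my part and not the module's API. The names URL_RE, HrefCollector, extract_urls and SAMPLE are hypothetical, and the two links are the href values from the post wrapped in made-up <a> tags. The parser isolates each href value, and only then does a small pattern pick out the URL, stopping at the first quote or whitespace character.

```python
import re
from html.parser import HTMLParser

# Assumption: a URL ends at the first quote or whitespace character.
URL_RE = re.compile(r"""(?:ftp|https?)://[^'"\s]+""")


class HrefCollector(HTMLParser):
    """Collect the first URL found in the href attribute of each <a> tag."""

    def __init__(self):
        super().__init__()
        self.urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                m = URL_RE.search(value)
                if m:
                    self.urls.append(m.group(0))


def extract_urls(html):
    parser = HrefCollector()
    parser.feed(html)
    parser.close()
    return parser.urls


# The two href values from the post, wrapped in hypothetical <a> tags.
SAMPLE = """<a href="http://site.com/0,,NEWS39104-EI8090,00.html">one</a>
<a href="javascript:abre('http://site.com/0,,NEWS39104-EI8090,00.html','Gallery39104','660','500','no');">two</a>"""

urls = extract_urls(SAMPLE)
for url in urls:
    print(url)
```

Because the parser has already cut the attribute value out of the markup, the pattern no longer has to describe href="..." or the JavaScript arguments, which is where the greedy .* in the original attempts ran too far.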
 
