Regex pattern problem

Ted Hopp

I was writing a quick-and-dirty regex to search html text and pull out the
source url from IMG tags. I first tried:

Pattern p = Pattern.compile("<img (?:[^>]* )?src=\"([^\"]*)\"");

(I know that this pattern makes all kinds of unwarranted assumptions about
the html, but that's another topic.) The problem I was having was that
although this pattern matches, it only results in one capture group--group
0. I was expecting the parens after src= to give me the url in capture group
1, but no such luck. It's only when I double the parens:

Pattern p = Pattern.compile("<img (?:[^>]* )?src=\"(([^\"]*))\"");

that the src value is captured.

So my question is: why do I need to double the parens?

Thanks,

Ted Hopp
 
Jussi Piitulainen

Ted said:
I was writing a quick-and-dirty regex to search html text and
pull out the source url from IMG tags. I first tried:

Pattern p = Pattern.compile("<img (?:[^>]* )?src=\"([^\"]*)\"");

(I know that this pattern makes all kinds of unwarranted
assumptions about the html, but that's another topic.) The
problem I was having was that although this pattern matches,
it only results in one capture group--group 0. I was
expecting the parens after src= to give me the url in capture
group 1, but no such luck. It's only when I double the
parens:

Pattern p = Pattern.compile("<img (?:[^>]* )?src=\"(([^\"]*))\"");

that the src value is captured.

So my question is: why do I need to double the parens?

You don't need to double the parens. You need to provide a
short program that demonstrates the problem. The following is
longer than needed, but it fails to fail in the way that you
describe: it has single parens in the pattern, accesses group
1, and prints here.be.it/1 and here.be.it/2 as expected:

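import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Roska {
    public static void main(String[] args) {
        // Two sample inputs: one with an extra attribute before src, one without.
        String t1 = "left <img stuff src=\"here.be.it/1\" etc.>";
        String t2 = " then left <img src=\"here.be.it/2\" etc.>";
        // Single parens around the src value: this is capture group 1.
        Pattern p = Pattern
                .compile("<img (?:[^>]* )?src=\"([^\"]*)\"");
        Matcher m = p.matcher(t1 + t2);
        // Print group 1 of every match found in the combined text.
        while (m.find()) {
            System.out.println(m.group(1));
        }
    }
}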
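
One possible source of the confusion, offered only as a guess since the thread does not show the code that inspected the groups: Matcher.groupCount() does not count group 0, so it returns 1 for the single-paren pattern and 2 for the doubled one. A result of 1 can be misread as "only group 0 exists". The sketch below uses a hypothetical class GroupCountDemo and helper describe (neither is from the original posts) to print the count next to group 1 for both patterns:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GroupCountDemo {
    // Hypothetical helper: reports groupCount() and group(1) for the first match.
    static String describe(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        if (!m.find()) {
            return "no match";
        }
        // groupCount() excludes group 0 (the whole match) by convention.
        return "groupCount=" + m.groupCount() + " group1=" + m.group(1);
    }

    public static void main(String[] args) {
        String html = "left <img stuff src=\"here.be.it/1\" etc.>";
        // Single parens: one capturing group, reachable as group(1).
        System.out.println(describe("<img (?:[^>]* )?src=\"([^\"]*)\"", html));
        // Doubled parens: two capturing groups, and group(1) is unchanged.
        System.out.println(describe("<img (?:[^>]* )?src=\"(([^\"]*))\"", html));
    }
}
```

If the original code looped over the groups with an upper bound of i < m.groupCount(), the single-paren pattern would only ever show group 0, which would match the symptom described. Whether that is what actually happened is an assumption.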
 
