RegEx Help one more time

Ken Kast

Here's my pattern:
pattern =
"\\b[A-Z]([A-Z0-9]|[-+_/&.](?=[A-Z0-9])|[(][A-Z0-9]([A-Z0-9]|[-+_/&.](?=[A-Z0-9]))*[)])+\\b";

With the string AB(CDE), it finds only AB(CDE. Why isn't the closing paren
found, and why is it accepting the string without it?

Thanks.

Ken
 
Jussi Piitulainen

Ken said:
Here's my pattern:
pattern =
"\\b[A-Z]([A-Z0-9]|[-+_/&.](?=[A-Z0-9])|[(][A-Z0-9]([A-Z0-9]|[-+_/&.](?=[A-Z0-9]))*[)])+\\b";

With the string AB(CDE), it finds only AB(CDE. Why isn't the closing
paren found, and why is it accepting the string without it?

I submit the little program below (class Foo) as an example of how you
can study such problems yourself experimentally. However, it doesn't
find "AB(CDE", nor do I see how it could, since the branch that matches
the opening "(" must end with the closing ")".

The closing paren is not a word character, so \b can match right after
it only when a word character follows immediately. When ")" is followed
by a space or by the end of the input, the engine has to backtrack to a
shorter match.
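The word-boundary rule can be checked in isolation before looking at the full pattern. This is only a minimal sketch; the class name BoundaryDemo and the helper found are made up for the illustration, and the regex is just the closing paren followed by \b.

```java
import java.util.regex.Pattern;

class BoundaryDemo {
    // True if the regex matches anywhere in the text.
    static boolean found(String regex, String text) {
        return Pattern.compile(regex).matcher(text).find();
    }

    public static void main(String[] args) {
        // ")" followed by a space: no word character on either side.
        System.out.println(found("[)]\\b", "AB(CDE) x"));
        // ")" at the end of the input: nothing follows at all.
        System.out.println(found("[)]\\b", "AB(CDE)"));
        // ")" followed by the word character "q": a boundary exists.
        System.out.println(found("[)]\\b", "LM(NOP)q"));
    }
}
```

Only the third call should report a match. The program for the full pattern is: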

import java.util.regex.Matcher;
import java.util.regex.Pattern;

class Foo {
    public static void main(String[] args) {
        // One alternative per line: a plain character, a separator that
        // must be followed by a character, or a parenthesized group.
        String pattern = "\\b[A-Z]("
            + "[A-Z0-9]"
            + "|[-+_/&.](?=[A-Z0-9])"
            + "|[(][A-Z0-9]([A-Z0-9]|[-+_/&.](?=[A-Z0-9]))*[)]"
            + ")+\\b";

        String text = "First AB(CDE) ends at a non-word-boundary, "
            + "second GH(IJ)K ends at a word-boundary. "
            + "Third LM(NOP)q should be caught.";

        // Print every match found in the text, one per line.
        Matcher m = Pattern.compile(pattern).matcher(text);
        while (m.find()) {
            System.out.println(m.group());
        }
    }
}
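
If the goal is for a match to be allowed to end in ")" as long as no word character follows, one option is to replace the trailing \b with the negative lookahead (?!\w). That goal is my assumption about what Ken wants, not something stated in the question. The sketch below compares the two endings on a shortened test string; the names LookaheadDemo, BODY and findAll are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookaheadDemo {
    // Ken's pattern without its trailing \b; only the ending varies below.
    static final String BODY = "\\b[A-Z]("
        + "[A-Z0-9]"
        + "|[-+_/&.](?=[A-Z0-9])"
        + "|[(][A-Z0-9]([A-Z0-9]|[-+_/&.](?=[A-Z0-9]))*[)]"
        + ")+";

    // Collect every match of the regex in the text, in order.
    static List<String> findAll(String regex, String text) {
        List<String> out = new ArrayList<>();
        Matcher m = Pattern.compile(regex).matcher(text);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String text = "First AB(CDE) then GH(IJ)K and LM(NOP)q";
        // Original ending: a word boundary is required after the match.
        System.out.println(findAll(BODY + "\\b", text));
        // Assumed alternative: the match must not be followed by a word character.
        System.out.println(findAll(BODY + "(?!\\w)", text));
    }
}
```

With \b the matches are AB, CDE, GH(IJ)K and LM(NOP). With (?!\w) they are AB(CDE), GH(IJ)K, LM and NOP. The lookahead picks up the closing paren before a space, but it gives up the LM(NOP) match because a "q" follows, so which ending is right depends on what should count as a complete token.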
 
