java regex help

Rick Venter

I am trying to write a regular expression in Java that matches a
string enclosed in curly braces { }, but I can't get it to work.

For example, if the string is
foo {bar} alpha {baz}:{bat} gamma

I need the strings:
{bar}
{baz}
{bat}

after the pattern matching.

How do I do that? The following code snippet does not print
anything. I think my regular expression pattern might be wrong.

Can anyone help?

protected static final String PARAMETER_REGEX =
"\\{\\[\\^\\{\\}\\]+\\}";

String originalString = "foo {bar} alpha {baz}:{bat} gamma";

Pattern p = Pattern.compile(PARAMETER_REGEX);
Matcher m = p.matcher(originalString);

while (m.find()) {
    System.out.println(m.group());
}
 
VisionSet

Rick Venter said:
I am trying to create a regular expression for matching a pattern with
a string enclosed in curly braces { } and not able to do that using
java.


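Braces have a special meaning in a regex: they specify the quantity of
the previous character or group. Therefore, to match them as regular
characters, they need escaping with a backslash (\), which is written
as \\ inside a Java string literal.
So try: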

\\{[^\\{\\}]+\\}

Oh now I see the rest of your explanation.

Just don't escape the [, ^ or ]; you need them with their special
meaning, i.e. as a (negated) character class.
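
To illustrate why the original pattern printed nothing, here is a minimal sketch. The class name EscapedBracketsDemo and the constant names are made up for this example; only the two regex strings come from the thread. With every metacharacter escaped, the pattern looks for the literal text {[^{}]} rather than for a character class.

```java
import java.util.regex.Pattern;

class EscapedBracketsDemo {
    // Pattern from the question: '[', '^' and ']' are escaped,
    // so they are matched as literal characters.
    static final Pattern OVER_ESCAPED =
            Pattern.compile("\\{\\[\\^\\{\\}\\]+\\}");

    // Only the outer braces are escaped; [^{}] stays a negated character class.
    static final Pattern CHARACTER_CLASS =
            Pattern.compile("\\{[^{}]+\\}");

    public static void main(String[] args) {
        String input = "foo {bar} alpha {baz}:{bat} gamma";

        System.out.println(OVER_ESCAPED.matcher(input).find());        // false
        System.out.println(CHARACTER_CLASS.matcher(input).find());     // true

        // The over-escaped pattern only matches this literal text.
        System.out.println(OVER_ESCAPED.matcher("{[^{}]}").matches()); // true
    }
}
```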
 
Alan Moore

Rick Venter said:
protected static final String PARAMETER_REGEX =
"\\{\\[\\^\\{\\}\\]+\\}";

String originalString = "foo {bar} alpha {baz}:{bat} gamma";

Pattern p = Pattern.compile(PARAMETER_REGEX);
Matcher m = p.matcher(originalString);

while (m.find()) {
    System.out.println(m.group());
}

"\\{[^{}]+\\}"

You don't escape the '[', '^', or ']' because you *want* them to be
treated as metacharacters. And you don't need to escape the inner '{'
and '}' (though it doesn't hurt) because they're in a character class,
where they aren't treated as metacharacters anyway.

Also, if you're going to the trouble of creating a constant for the
regex, why not make it the compiled Pattern instead? That way, you'll
only have to compile it once, no matter how many times you use it.

protected static final Pattern PARAMETER_PATTERN =
Pattern.compile("\\{[^{}]+\\}");
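
Putting the corrected pattern and the precompiled constant together, a complete version might look like the sketch below. The class name ParameterExtractor and the helper findParameters are hypothetical, chosen only for this example; the regex and the test string are the ones from the thread.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ParameterExtractor {
    // Compiled once and reused for every call.
    static final Pattern PARAMETER_PATTERN =
            Pattern.compile("\\{[^{}]+\\}");

    // Hypothetical helper: collects every brace-enclosed token, in order.
    static List<String> findParameters(String text) {
        List<String> result = new ArrayList<>();
        Matcher m = PARAMETER_PATTERN.matcher(text);
        while (m.find()) {
            result.add(m.group());
        }
        return result;
    }

    public static void main(String[] args) {
        String originalString = "foo {bar} alpha {baz}:{bat} gamma";
        // Prints {bar}, {baz} and {bat}, one per line.
        for (String parameter : findParameters(originalString)) {
            System.out.println(parameter);
        }
    }
}
```

Because the quantifier is +, an empty pair of braces ({}) is not matched; use * instead if that case should be included.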
 
John C. Bollinger

Even simpler would be to use a reluctant quantifier instead of a greedy
one, à la "\\{.+?\\}". (Read: opening '{' + one or more characters up to
and including the _first_ '}'.) That does change the semantics
slightly, however, in that it does not reject strings with internal
opening '{' characters. The pattern could be adjusted to account for
that, but then there would be no simplicity advantage to using the
reluctant form of the quantifier.
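
The difference in semantics shows up on input with an extra opening brace. The following is only a sketch: the class name, the helper firstMatch, and the input string are invented for illustration.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ReluctantVsNegatedClass {
    static final Pattern RELUCTANT = Pattern.compile("\\{.+?\\}");
    static final Pattern NEGATED_CLASS = Pattern.compile("\\{[^{}]+\\}");

    // Hypothetical helper: returns the first match, or null if there is none.
    static String firstMatch(Pattern pattern, String text) {
        Matcher m = pattern.matcher(text);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String input = "a {b {c} d";

        // The reluctant form starts at the first '{' and accepts the inner one.
        System.out.println(firstMatch(RELUCTANT, input));     // {b {c}

        // The negated class refuses any '{' between the braces.
        System.out.println(firstMatch(NEGATED_CLASS, input)); // {c}
    }
}
```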


John Bollinger
 
Alan Moore

Also, since the dot doesn't match line separators (unless you use the
DOTALL flag), the whole sequence would have to be on one line for the
regex to match--which is probably a good thing, come to think of it.
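
A short sketch of the line-separator behaviour, using a made-up class name and input string. Note that the negated character class version does match across a line break, since [^{}] accepts any character other than the two braces.

```java
import java.util.regex.Pattern;

class DotAllDemo {
    static final Pattern RELUCTANT =
            Pattern.compile("\\{.+?\\}");
    static final Pattern RELUCTANT_DOTALL =
            Pattern.compile("\\{.+?\\}", Pattern.DOTALL);
    static final Pattern NEGATED_CLASS =
            Pattern.compile("\\{[^{}]+\\}");

    public static void main(String[] args) {
        String twoLines = "{foo\nbar}";

        // By default the dot stops at the line separator, so there is no match.
        System.out.println(RELUCTANT.matcher(twoLines).find());        // false

        // With DOTALL the dot also matches '\n'.
        System.out.println(RELUCTANT_DOTALL.matcher(twoLines).find()); // true

        // The negated class matches '\n' regardless of flags.
        System.out.println(NEGATED_CLASS.matcher(twoLines).find());    // true
    }
}
```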
 
