Using constants in regular expressions

Fritz Bayer

Hello,

the following code snippet:

String replacementPattern = "-$1 $2-";
Pattern pattern = Pattern.compile("http://(.*?)/(.*)");
pattern.matcher("http://www.example.com/test/index.html").replaceFirst(replacementPattern);
pattern.matcher("http://www.example.com/").replaceFirst(replacementPattern);
pattern.matcher("http:///").replaceFirst(replacementPattern);

will generate the following output:

-www.example.com test/index.html-
-www.example.com -
- -

What I would like to achieve is to print out a constant in front of
each capturing group if, and only if, the group is not empty.

In other words: I'm looking for a value of "String
replacementPattern", which will generate the following output for the
GIVEN code above:

-host:www.example.com uri:test/index.html-
-host:www.example.com -
--

I only know how to always print something in front of each capturing
group, regardless of whether it is empty (String
replacementPattern = "- host:$1 uri:$2-";).
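
For reference, here is a self-contained sketch of the snippet above that can be compiled and run as is. The class name ReplacementDemo and the helper apply are made up for illustration. Note that the "/" between the two groups is consumed by the pattern itself, so group 2 never starts with a slash.

```java
import java.util.regex.Pattern;

class ReplacementDemo {
    // Same pattern as in the question.
    static final Pattern PATTERN = Pattern.compile("http://(.*?)/(.*)");

    // Hypothetical helper: applies one replacement string to one URL.
    static String apply(String url, String replacement) {
        return PATTERN.matcher(url).replaceFirst(replacement);
    }

    public static void main(String[] args) {
        String[] urls = {
            "http://www.example.com/test/index.html",
            "http://www.example.com/",
            "http:///"
        };
        for (String url : urls) {
            // Plain groups: "-$1 $2-"
            System.out.println(apply(url, "-$1 $2-"));
            // Unconditional prefixes: the constants appear even for empty groups.
            System.out.println(apply(url, "- host:$1 uri:$2-"));
        }
    }
}
```

For "http:///" the second replacement prints "- host: uri:-", which is exactly the unwanted behaviour described above.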

Fritz
 
John McGrath

Fritz Bayer said:
What I would like to achieve is to print out a constant in front of
each capturing group if, and only if, the group is not empty.

I do not think there is any way to do that using the standard replace
methods. But you can use the matcher to do the matching, then build the
replacement string yourself. For example:

public String replace( String str ) {
    Matcher matcher = pattern.matcher( str );
    if ( ! matcher.matches() ) {
        return null; // or whatever you want
    }

    String host = matcher.group( 1 );
    String uri = matcher.group( 2 );

    StringBuilder sb = new StringBuilder();
    sb.append( '-' );

    if ( host.length() > 0 ) {
        sb.append( "host:" );
        sb.append( host );
    }

    if ( ( host.length() > 0 ) || ( uri.length() > 0 ) ) {
        sb.append( ' ' );
    }

    if ( uri.length() > 0 ) {
        sb.append( "uri:" );
        sb.append( uri );
    }

    sb.append( '-' );

    return sb.toString();
}
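
To try the method above in isolation, it can be dropped into a small class like the following sketch. The class name ManualReplacement and the PATTERN constant are assumptions added for illustration; the logic is the same, only written a little more compactly.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class ManualReplacement {
    // Same pattern as in the original question.
    private static final Pattern PATTERN = Pattern.compile("http://(.*?)/(.*)");

    static String replace(String str) {
        Matcher matcher = PATTERN.matcher(str);
        if (!matcher.matches()) {
            return null; // no match: the caller decides what to do
        }
        String host = matcher.group(1);
        String uri = matcher.group(2);

        StringBuilder sb = new StringBuilder("-");
        if (!host.isEmpty()) {
            sb.append("host:").append(host);
        }
        // The separating space is written whenever at least one part is present.
        if (!host.isEmpty() || !uri.isEmpty()) {
            sb.append(' ');
        }
        if (!uri.isEmpty()) {
            sb.append("uri:").append(uri);
        }
        return sb.append('-').toString();
    }

    public static void main(String[] args) {
        System.out.println(replace("http://www.example.com/test/index.html")); // -host:www.example.com uri:test/index.html-
        System.out.println(replace("http://www.example.com/"));                // -host:www.example.com -
        System.out.println(replace("http:///"));                               // --
    }
}
```

Because the pattern consumes the "/" between the groups, the uri part is printed without a leading slash.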
 
Fritz Bayer

John McGrath said:
I do not think there is any way to do that using the standard replace
methods. But you can use the matcher to do the matching, then build the
replacement string yourself. [...]

Thanks John, but the patterns are read from a configuration file,
which does not allow any programming stuff.

So I'm really looking for something that solves my problem for the
GIVEN code.
 
John McGrath

Fritz Bayer said:
Thanks John, but the patterns are read from a configuration file,
which does not allow any programming stuff.

That does make things difficult. Since you have constrained the solution
to one involving regular expressions, perhaps the Java Programmer group is
not the best place for the question. Since the java.util.regex package
implements Perl 5 regular expressions (with a few changes), you might have
better luck asking your question on a Perl newsgroup. That is where the
hard-core regex gurus would be found.
 
Fritz Bayer

John McGrath said:
That does make things difficult. Since you have constrained the solution
to one involving regular expressions, perhaps the Java Programmer group is
not the best place for the question. Since the java.util.regex package
implements Perl 5 regular expressions (with a few changes), you might have
better luck asking your question on a Perl newsgroup. That is where the
hard-core regex gurus would be found.

That's a good idea. I will do that!
 
HK

Fritz said:
the following code snippet:

String replacementPattern = "-$1 $2-";
Pattern pattern = Pattern.compile("http://(.*?)/(.*)"); [...]
What I would like to achieve is to print out a constant in front of
each capturing group if, and only if, the group is not empty.

Looking at the answers you got, it seems what you want
is not easily achieved. If the above is only an example
of some even more tedious stuff, you may feel
adventurous enough to try my monq.jfa from

http://www.ebi.ac.uk/Rebholz-srv/whatizit/software

It allows you to bind a callback to a regular expression
which gets called whenever a match is found in an
input stream. In your callback you have complete
control over what to do with the matching text. Even more,
you can combine hundreds of regex/callback pairs into
one finite automaton.
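
The general idea of binding a callback to a regular expression can be sketched with the standard library alone. The following is not the monq.jfa API, which is not shown in this thread; it assumes Java 9 or later, where Matcher.replaceAll accepts a function. The class and method names are made up for illustration.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class CallbackReplacement {
    // Same pattern as in the original question.
    static final Pattern URL = Pattern.compile("http://(.*?)/(.*)");

    // The "callback": it sees each match and decides what to emit for it.
    static String format(MatchResult m) {
        String host = m.group(1).isEmpty() ? "" : "host:" + m.group(1);
        String uri = m.group(2).isEmpty() ? "" : "uri:" + m.group(2);
        String separator = (host.isEmpty() && uri.isEmpty()) ? "" : " ";
        return "-" + host + separator + uri + "-";
    }

    static String rewrite(String text) {
        // quoteReplacement keeps any '$' or '\' in the callback result literal.
        return URL.matcher(text).replaceAll(m -> Matcher.quoteReplacement(format(m)));
    }

    public static void main(String[] args) {
        System.out.println(rewrite("http://www.example.com/test/index.html")); // -host:www.example.com uri:test/index.html-
        System.out.println(rewrite("http://www.example.com/"));                // -host:www.example.com -
        System.out.println(rewrite("http:///"));                               // --
    }
}
```

Like any callback approach, this still requires code, so it only helps if the configuration-file constraint can be relaxed.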

Cheers,
Harald.
 
Robert Mischke

String replacementPattern = "-$1 $2-";
Pattern pattern = Pattern.compile("http://(.*?)/(.*)");
pattern.matcher("http://www.example.com/test/index.html").replaceFirst(replacementPattern);
pattern.matcher("http://www.example.com/").replaceFirst(replacementPattern);
pattern.matcher("http:///").replaceFirst(replacementPattern); ....
What I would like to achieve is to print out a constant in front of
each capturing group if, and only if, the group is not empty.

In other words: I'm looking for a value of "String
replacementPattern", which will generate the following output for the
GIVEN code above:

-host:www.example.com uri:test/index.html-
-host:www.example.com -

Well, I'm kind of tired right now, so this might be nonsense, BUT...
couldn't you solve it by applying several replacements in a row?

Step 1: "http://(.*?)/(.+)" -> "http://$1/uri:$2"
Step 2: "http://(.+?)/(.*)" -> "http://host:$1/$2"
Step 3: "http://(.*?)/(.*)" -> "-$1 $2-"

The key idea is that rules 1 and 2 only match if the respective part
is not empty; if it is not, they prefix that part and preserve the
rest. The final step then grabs both parts, whether empty or not,
together with any prefixes that have been added.
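
The three steps can be tried out with a short sketch like the one below. It assumes the rules are held as an ordered list of pattern/replacement pairs, as they might be after being read from a configuration file; the class name ChainedRules and the RULES table are made up for illustration.

```java
import java.util.regex.Pattern;

class ChainedRules {
    // Each row is {pattern, replacement}; the rows are applied in order.
    static final String[][] RULES = {
        {"http://(.*?)/(.+)", "http://$1/uri:$2"},  // step 1: prefix a non-empty path
        {"http://(.+?)/(.*)", "http://host:$1/$2"}, // step 2: prefix a non-empty host
        {"http://(.*?)/(.*)", "-$1 $2-"},           // step 3: final formatting
    };

    static String apply(String input) {
        String result = input;
        for (String[] rule : RULES) {
            // A rule that does not match leaves the string unchanged.
            result = Pattern.compile(rule[0]).matcher(result).replaceFirst(rule[1]);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(apply("http://www.example.com/test/index.html")); // -host:www.example.com uri:test/index.html-
        System.out.println(apply("http://www.example.com/"));                // -host:www.example.com -
        System.out.println(apply("http:///"));                               // - -
    }
}
```

Two caveats: the all-empty case still yields "- -" rather than "--", because the space in step 3 is unconditional. Also, an empty host followed by a path that contains a further slash (for example "http:///a/b") lets (.+?) in step 2 swallow a slash and produces a stray "host:"; writing the host group as [^/]+ in step 2 should avoid that.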

Dirty hacks are fun :)

Robert
 
