Roger Marquis
Finally figured out an odd behavior in Java 1.4's java.util.regex
package. Specifically, the Matcher returned by Pattern.matcher() does
not work as expected: unlike a search in Perl, grep, sed, and most
other regex tools, its matches() method must match the _entire_ input
string to return true.
The documentation does have a section titled "Comparison to Perl
5" but for some reason it doesn't list this important difference
(though it does provide a hint in the MULTILINE detail). One fix
is inelegant but straightforward: simply add a wildcard (.*) prefix
and/or suffix to each Pattern.compile. For example, the perl/sed/grep:
/somestring/
is equivalent to Java's:
/.*somestring.*/
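To make the difference concrete, here is a minimal sketch (the class name MatchesVsFind and the sample input are made up for illustration). It compares matches() with and without the wildcard padding, and also tries Matcher.find(), which belongs to the same java.util.regex API and looks for a match anywhere in the input, much like the Perl/grep behavior:

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the name and the sample input are illustrative only.
final class MatchesVsFind {

    // True only if the regex matches the entire input (Matcher.matches()).
    static boolean matchesWhole(String regex, String input) {
        return Pattern.compile(regex).matcher(input).matches();
    }

    // True if the regex matches anywhere in the input (Matcher.find()).
    static boolean findsAnywhere(String regex, String input) {
        return Pattern.compile(regex).matcher(input).find();
    }

    public static void main(String[] args) {
        String input = "a line with somestring in it";
        System.out.println(matchesWhole("somestring", input));     // false
        System.out.println(matchesWhole(".*somestring.*", input)); // true
        System.out.println(findsAnywhere("somestring", input));    // true
    }
}
```

If a substring search is all that's needed, find() may be the simpler route, since the pattern can stay exactly as it would be written for Perl or grep.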
Here's an example in this hostname validation method:
public static boolean isValidFQDN(String FQDN) {
    Pattern legal = Pattern.compile("[0-9a-zA-Z\\.\\-]+");
    Matcher isLegal = legal.matcher(FQDN);
    if ( ! isLegal.matches() ) {
        return false;
    }
    Pattern illegal = Pattern.compile(".*\\.\\..*|.*--.*|^\\..*|^-.*|.*\\.$|.*-$");
    // -- note java regex wildcards ---^^------^^-^^--^^-----^^---^^-^^-----^^
    Matcher isIllegal = illegal.matcher(FQDN);
    if ( isIllegal.matches() ) {
        return false;
    }
    return true;
}
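For comparison, the same checks could probably be written with find() for the illegal sequences, which drops the .* padding altogether. This is only a sketch under assumptions: the class name FqdnCheck and the precompiled constants LEGAL and ILLEGAL are hypothetical, and it is intended to accept and reject the same strings as the method above rather than to be a complete hostname validator.

```java
import java.util.regex.Pattern;

// Hypothetical variant of the method above; names are illustrative only.
final class FqdnCheck {

    // Whole-input check: only letters, digits, dots and hyphens are allowed.
    private static final Pattern LEGAL = Pattern.compile("[0-9a-zA-Z.-]+");

    // Searched with find(), so no .* padding is needed:
    // a double dot, a double hyphen, or a leading/trailing dot or hyphen.
    private static final Pattern ILLEGAL = Pattern.compile("\\.\\.|--|^[.-]|[.-]$");

    static boolean isValidFQDN(String fqdn) {
        // matches() is the right call here: every character must be legal.
        if (!LEGAL.matcher(fqdn).matches()) {
            return false;
        }
        // find() returns true if any illegal sequence occurs anywhere.
        return !ILLEGAL.matcher(fqdn).find();
    }

    public static void main(String[] args) {
        System.out.println(isValidFQDN("www.example.com")); // true
        System.out.println(isValidFQDN("www..example.com")); // false
        System.out.println(isValidFQDN("-example.com"));     // false
    }
}
```

Compiling the patterns once as constants is a side choice here; it avoids recompiling them on every call but does not change which names are accepted.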