lepikhin
Has anybody heard of any implementations of
boolean endsWith(String s, Pattern pattern)?
The source code for all the regex stuff is there for you to look at in
Since a boolean can only contain the boolean values 'true' and 'false', how
could it possibly contain a String? Why would anyone write a method to
search for Strings within a boolean?
huh?
The problem with "something$"-like regexps is that the matcher needs to look
through the whole string. It's very slow.
It would be better to scan the string backwards.
Unfortunately, the Java Pattern compiler (java.util.regex.Pattern) is
stupid about this. I've checked it.
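For what it's worth, here is a minimal sketch of what such an endsWith could look like on top of the existing API. It does not scan backwards inside the regex engine; it only tries candidate suffixes starting from the end of the string, using Matcher.region and Matcher.matches. The class name EndsWithSketch is made up for illustration. One assumption to be aware of: with the matcher's default anchoring bounds, ^ and $ match at the edges of the region being tried, not only at the edges of the whole string.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EndsWithSketch {
    /**
     * Returns true if some suffix of s (possibly the empty one) matches the
     * whole pattern. Candidate start positions are tried from the end of the
     * string backwards, so short suffix matches are found early.
     */
    static boolean endsWith(String s, Pattern pattern) {
        Matcher m = pattern.matcher(s);
        for (int start = s.length(); start >= 0; start--) {
            // Restrict the matcher to s[start..] and require a full match of that suffix.
            m.region(start, s.length());
            if (m.matches()) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        System.out.println(endsWith("report.txt", Pattern.compile("\\.txt")));     // true
        System.out.println(endsWith("report.txt.bak", Pattern.compile("\\.txt"))); // false
    }
}
```

In the worst case this still re-runs the matcher once per start position, so it is no substitute for a real reversed automaton; it only avoids walking the whole prefix when a short suffix matches.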
It's difficult, for some reasons.
John said:
Regexes do not easily reverse; as far as I know, there is no general
algorithm for "reversing" a regex, so it is unsurprising that Pattern
doesn't know how to do it.
Chris said, in reply to John C. Bollinger:
Why do you say that? Am I missing something, or are you talking about the
difficulty of reversing (some of) the Perl-ish extensions to the classic regexp
concept?
It seems (to me) that reversing a classic regexp is straightforward using the
following (rather redundant) transformations on the parsed AST representation
of the regexp:
reverse( c ) -> c
reverse( . ) -> .
reverse( [a-b] ) -> [a-b]
reverse( R1 | R2 ) -> reverse(R1) | reverse(R2)
reverse( R1R2 ) -> reverse(R2) reverse(R1)
reverse( ( R ) ) -> ( reverse(R) )
reverse( R * ) -> reverse(R) *
reverse( R + ) -> reverse(R) +
reverse( ^ R) -> reverse(R) $
reverse( R $ ) -> ^ reverse(R)
(where a, b, and c are characters and R, R1, and R2 are regexps). I have no
idea what some of the Perl-ish extensions would even mean if reversed, but
"classic" regexps are good enough for me, and perhaps for most (legitimate)
applications.
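The transformations above can be sketched in Java roughly as follows. This is only an illustration under assumptions: the Node interface and the factory names (atom, alt, seq, group, repeat, anchor) are invented here, there is no parser, and render() does no precedence handling, so groups have to be put into the tree explicitly.

```java
class RegexReverse {
    /** A node of the parsed regexp; each node can reverse and print itself. */
    interface Node {
        Node reverse();
        String render();
    }

    /** c, ., or a character class such as [a-b]: unchanged by reversal. */
    static Node atom(String text) {
        return new Node() {
            public Node reverse() { return this; }
            public String render() { return text; }
        };
    }

    /** R1 | R2 -> reverse(R1) | reverse(R2) */
    static Node alt(Node left, Node right) {
        return new Node() {
            public Node reverse() { return alt(left.reverse(), right.reverse()); }
            public String render() { return left.render() + "|" + right.render(); }
        };
    }

    /** R1 R2 ... Rn -> reverse(Rn) ... reverse(R2) reverse(R1) */
    static Node seq(Node... parts) {
        return new Node() {
            public Node reverse() {
                Node[] out = new Node[parts.length];
                for (int i = 0; i < parts.length; i++) {
                    out[i] = parts[parts.length - 1 - i].reverse();
                }
                return seq(out);
            }
            public String render() {
                StringBuilder sb = new StringBuilder();
                for (Node p : parts) sb.append(p.render());
                return sb.toString();
            }
        };
    }

    /** ( R ) -> ( reverse(R) ) */
    static Node group(Node inner) {
        return new Node() {
            public Node reverse() { return group(inner.reverse()); }
            public String render() { return "(" + inner.render() + ")"; }
        };
    }

    /** R* -> reverse(R)* and R+ -> reverse(R)+ ; op is '*' or '+'. */
    static Node repeat(Node inner, char op) {
        return new Node() {
            public Node reverse() { return repeat(inner.reverse(), op); }
            public String render() { return inner.render() + op; }
        };
    }

    /** ^ and $ swap roles; inside a seq this gives the two anchor rules. */
    static Node anchor(boolean start) {
        return new Node() {
            public Node reverse() { return anchor(!start); }
            public String render() { return start ? "^" : "$"; }
        };
    }

    public static void main(String[] args) {
        Node re = seq(anchor(true), atom("a"), atom("b"),
                repeat(group(alt(atom("c"), seq(atom("d"), atom("e")))), '*'),
                atom("f"), anchor(false));
        System.out.println(re.render());           // ^ab(c|de)*f$
        System.out.println(re.reverse().render()); // ^f(c|ed)*ba$
    }
}
```

With this example, "abcdef" matches the original pattern and its reversal "fedcba" matches the reversed one.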
John said:
I am talking about reversing regexes expressed in the language supported
by java.util.regex.Pattern, a language even more feature-laden than
Perl's regex language.
Your argument for how that particular regex language can be reversed is
convincing, but you don't have to add much to the language to make the
task a lot harder. How about reluctant quantifiers, for instance?
Chris said:
It seems to me that the greedy/reluctant distinction is only about handling
ambiguous "parses". That is to say, switching between greedy and
reluctant never changes the language that the regexp recognises; it only
changes the way that sub-regexps are assigned to substrings in the input
sequence. (I'm assuming that the regexp is being used to make a simple Boolean
judgement, "does this string match?"; the streaming mode is obviously
different.)
I conjecture that
inverting the greediness of each subexpression while reversing the pattern
would produce the "same" parse of the reversed string.
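A small check of the first claim, as a sketch only (the class and method names are made up): for a whole-string yes/no match, a greedy and a reluctant version of the same pattern give the same answer, while a find-style search picks different substrings.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    /** Whole-string yes/no judgement: "does this string match?" */
    static boolean accepts(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    /** First substring found by a search, or null if there is none. */
    static String firstFind(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.find() ? m.group() : null;
    }

    public static void main(String[] args) {
        String s = "aXbYb";
        System.out.println(accepts("a.*b", s));    // true
        System.out.println(accepts("a.*?b", s));   // true
        System.out.println(firstFind("a.*b", s));  // aXbYb
        System.out.println(firstFind("a.*?b", s)); // aXb
    }
}
```

This says nothing about the conjecture on inverting greediness under reversal; it only illustrates that the two quantifier styles differ in which parse they choose, not in which whole strings they accept.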
I have no idea about the possessive quantifier. But then I don't care either --
it's an aesthetic disaster, a theoretic horror, and I hope (and expect) never
to see any legitimate use for it.
Beyond those cases, the only non-classical features of the Java regexp package
(I think) are: the wide palette of character classes, which can be seen as
syntactic sugar for character ranges, or even for "raw" alternation, and which
reverse trivially; the various end-of-word, end-of-line, etc.
pseudo-characters, which present a problem if the set is not closed under
reversal (I don't know whether it is; CR-LF, anyone?); and the backref
feature, which obviously doesn't reverse.
John said:
Do I sense a mild distaste for the feature? :^)
In truth, I agree with you. I raised the issue only because I think
possessive quantifiers are a certain killer of general-purpose reversal of
arbitrary Java regex patterns. Perhaps in itself that is reflective of the
possessive quantifiers' perversion.
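One way to see why possessive quantifiers are harder than reluctant ones, offered as an illustration and not as a proof: unlike greedy versus reluctant, making a quantifier possessive can change which whole strings are accepted. The class name below is made up.

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    /** Whole-string yes/no match. */
    static boolean accepts(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        System.out.println(accepts("a*a", "aaa"));  // true: greedy a* gives one 'a' back
        System.out.println(accepts("a*?a", "aaa")); // true: reluctant a*? extends as needed
        System.out.println(accepts("a*+a", "aaa")); // false: possessive a*+ never gives back
    }
}
```

Since a*+a accepts nothing that a*a does not, but rejects strings that a*a accepts, the possessive form depends on the direction of matching in a way the classic transformations do not account for.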