boolean endsWith(String s, Pattern pattern)

Discussion in 'Java' started by lepikhin@gmail.com, Nov 7, 2005.

  1. Guest

    Anybody hear about the

    boolean endsWith(String s, Pattern pattern)

    implementations?
     
    , Nov 7, 2005
    #1

  2. Guest

    PS. I know about "something$"-like regexp
     
    , Nov 7, 2005
    #2

  3. Roedy Green Guest

    On 7 Nov 2005 06:44:31 -0800, wrote, quoted or
    indirectly quoted someone who said :

    >boolean endsWith(String s, Pattern pattern)

    The source code for all the regex stuff is there for you to look at in
    src.zip.

    IIRC it compiles Patterns to commands to a state machine interpreter,
    not byte code.
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
     
    Roedy Green, Nov 7, 2005
    #3
  4. Rhino Guest

    <> wrote in message
    news:...
    > Anybody hear about the
    >
    > boolean endsWith(String s, Pattern pattern)
    >
    > implementations?
    >

    Since a boolean can only contain the boolean values 'true' and 'false', how
    could it possibly contain a String? Why would anyone write a method to
    search for Strings within a boolean?

    Rhino
     
    Rhino, Nov 7, 2005
    #4
  5. Roedy Green Guest

    On Mon, 7 Nov 2005 13:53:09 -0500, "Rhino"
    <> wrote, quoted or indirectly
    quoted someone who said :

    >> boolean endsWith(String s, Pattern pattern)
    >>
    >> implementations?
    >>

    >Since a boolean can only contain the boolean values 'true' and 'false', how
    >could it possibly contain a String? Why would anyone write a method to
    >search for Strings within a boolean?

    huh?

    I think he is asking does string s end with given pattern yes or no.

    he can't make it an instance method of String since String is final.
    He left out the word "static", though I suppose it could be
    implemented on some unrelated object.

    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
     
    Roedy Green, Nov 8, 2005
    #5
  6. Guest

    > I think he is asking does string s end with given pattern yes or no.
    Yes. Sorry for ambiguous problem statement.

    The problem with "something$"-like regexps is that you need to look
    through the whole string. It's very slow.
     
    , Nov 8, 2005
    #6
  7. Roedy Green Guest

    On 8 Nov 2005 04:40:12 -0800, wrote, quoted or
    indirectly quoted someone who said :

    >Problem with "something$"-like regexps is, that you need to look
    >through the whole string. It's very slow.


    it depends on how clever the Pattern compiler is. A smart one might
    scan the string backwards.
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
     
    Roedy Green, Nov 8, 2005
    #7
  8. Guest

    > it depends on how clever the Pattern compiler is. A smart one might
    > scan the string backwards.


    Unfortunately Java Pattern compiler (java.util.regex.Pattern) is
    stupid. I've checked it.
     
    , Nov 8, 2005
    #8
  9. Roedy Green Guest

    On 8 Nov 2005 06:36:50 -0800, wrote, quoted or
    indirectly quoted someone who said :

    >> it depends on how clever the Pattern compiler is. A smart one might
    >> scan the string backwards.

    >
    >Unfortunately Java Pattern compiler (java.util.regex.Pattern) is
    >stupid. I've checked it.


    In that case, if this were a bottleneck, you could help the regex
    along by chopping off all but the last N characters.
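A minimal sketch of that workaround, assuming any match is at most maxLen characters long. The class and method names are hypothetical. Rather than copying a substring, it restricts the matcher to the tail with Matcher.region and anchors the pattern at the end with \z:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class TailMatch {
    // Hypothetical helper: does s end with a match for regex?
    // Only the last maxLen characters are examined, so a match longer
    // than maxLen is missed (that is the stated assumption).
    static boolean endsWithinTail(String s, String regex, int maxLen) {
        // Wrap in a non-capturing group so alternations stay intact,
        // then require the match to stop at the very end of the input.
        Pattern p = Pattern.compile("(?:" + regex + ")\\z");
        Matcher m = p.matcher(s);
        // Limit the search to the tail of the string.
        m.region(Math.max(0, s.length() - maxLen), s.length());
        return m.find();
    }
}
```

In real use the Pattern would be compiled once and reused. Note that, with the default (non-transparent) region bounds, lookbehind cannot see characters before the start of the region.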
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
     
    Roedy Green, Nov 8, 2005
    #9
  10. Chris Uppal Guest

    wrote:
    > > it depends on how clever the Pattern compiler is. A smart one might
    > > scan the string backwards.

    >
    > Unfortunately Java Pattern compiler (java.util.regex.Pattern) is
    > stupid. I've checked it.


    Construct your regexp backwards, reverse the string, and look for a match at
    the beginning of that...
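One way this could look, as a sketch only: the pattern is assumed to have been reversed by hand, and the string is not copied but wrapped in a reversed read-only view (the Reversed class and endsWithReversed method are hypothetical names). Matcher.lookingAt then tests for a match at the start of the reversed view, which is the end of the original string:

```java
import java.util.regex.Pattern;

class ReversedMatch {
    /** Read-only reversed view of a CharSequence; no characters are copied. */
    static final class Reversed implements CharSequence {
        private final CharSequence base;
        Reversed(CharSequence base) { this.base = base; }

        @Override public int length() { return base.length(); }

        @Override public char charAt(int i) {
            return base.charAt(base.length() - 1 - i);
        }

        @Override public CharSequence subSequence(int start, int end) {
            StringBuilder sb = new StringBuilder(end - start);
            for (int i = start; i < end; i++) sb.append(charAt(i));
            return sb.toString();
        }

        @Override public String toString() {
            return subSequence(0, length()).toString();
        }
    }

    // reversedPattern must already be the hand-reversed form of the
    // pattern you want to find at the end of s.
    static boolean endsWithReversed(CharSequence s, Pattern reversedPattern) {
        return reversedPattern.matcher(new Reversed(s)).lookingAt();
    }
}
```

For example, the forward pattern `[0-9]+\.txt` would be passed in as `txt\.[0-9]+`. The view reverses individual char values, so surrogate pairs end up in the wrong order; this sketch assumes text without supplementary characters.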

    -- chris
     
    Chris Uppal, Nov 8, 2005
    #10
  11. Guest

    > Construct your regexp backwards
    It's difficult, for some reasons.

    Thanks anyway.
     
    , Nov 9, 2005
    #11
  12. In an earlier post, wrote:
    >Unfortunately Java Pattern compiler (java.util.regex.Pattern) is
    >stupid. I've checked it.


    And in a different one:
    >>Construct your regexp backwards

    >
    > It's difficult, for some reasons.


    My apologies, but does that mean you're stupid, too? I mean, you
    criticize the regex compiler as stupid for not being able to do a job
    equivalent to the one you say is too difficult for *you*. Regexes do
    not easily reverse; as far as I know, there is no general algorithm for
    "reversing" a regex, so it is unsurprising that Pattern doesn't know how
    to do it.

    You started out asking how to determine whether a string ends with a
    match for a particular pattern. The usual approach is to prepend ".*"
    to the pattern and match the whole string with it.
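A minimal sketch of that approach with the signature from the original question. The class name is hypothetical, and the sketch assumes the pattern was compiled without the LITERAL or COMMENTS flags. The prefix is written as (?s:.*) so that it also crosses line terminators, and the original pattern is wrapped in a non-capturing group so that a top-level alternation is not split:

```java
import java.util.regex.Pattern;

class EndsWith {
    // Does s end with a match for the given pattern?
    static boolean endsWith(String s, Pattern pattern) {
        // Rebuild the pattern with a leading "match anything" prefix,
        // keeping the flags the caller compiled it with.
        Pattern whole = Pattern.compile(
                "(?s:.*)(?:" + pattern.pattern() + ")", pattern.flags());
        // matches() requires the entire string to match, so the
        // original pattern must account for the end of s.
        return whole.matcher(s).matches();
    }
}
```

This still walks the whole string, so it answers the question but not the performance complaint. If the method is called often with the same pattern, the rebuilt Pattern should be cached rather than compiled on every call.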

    --
    John Bollinger
     
    John C. Bollinger, Nov 10, 2005
    #12
  13. Chris Uppal Guest

    John C. Bollinger wrote:

    > Regexes do
    > not easily reverse; as far as I know, there is no general algorithm for
    > "reversing" a regex, so it is unsurprising that Pattern doesn't know how
    > to do it.


    Why do you say that ? Am I missing something, or are you talking about the
    difficulty of reversing (some of) the Perl-ish extensions to the classic regexp
    concept ?

    It seems (to me) that reversing a classic regexp is straightforward using the
    following (rather redundant) transformations on the parsed AST representation
    of the regexp:

    reverse( c ) -> c
    reverse( . ) -> .
    reverse( [a-b] ) -> [a-b]
    reverse( R1 | R2 ) -> reverse(R1) | reverse(R2)
    reverse( R1R2 ) -> reverse(R2) reverse(R1)
    reverse( ( R ) ) -> ( reverse(R) )
    reverse( R * ) -> reverse(R) *
    reverse( R + ) -> reverse(R) +
    reverse( ^ R) -> reverse(R) $
    reverse( R $ ) -> ^ reverse(R)

    (where a, b, and c are characters and R, R1, and R2 are regexps). I have no
    idea what some of the Perl-ish extensions would even mean if reversed, but
    "classic" regexps are good enough for me, and perhaps for most (legitimate)
    applications.
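The transformations above can be sketched on a tiny hand-built AST. Everything here is hypothetical illustration (there is no such AST in java.util.regex, and no parser is included); the node types cover only the classic constructs listed, and toRegex wraps sub-expressions in non-capturing groups to keep precedence unambiguous:

```java
import java.util.regex.Pattern;

class RegexReverser {
    interface Node {
        Node reverse();
        String toRegex();
    }

    /** c, ., or [a-b]: a single-character matcher reverses to itself. */
    static final class Atom implements Node {
        final String text;
        Atom(String text) { this.text = text; }
        public Node reverse() { return this; }
        public String toRegex() { return text; }
    }

    /** R1 R2 -> reverse(R2) reverse(R1) */
    static final class Seq implements Node {
        final Node first, second;
        Seq(Node first, Node second) { this.first = first; this.second = second; }
        public Node reverse() { return new Seq(second.reverse(), first.reverse()); }
        public String toRegex() { return first.toRegex() + second.toRegex(); }
    }

    /** R1 | R2 -> reverse(R1) | reverse(R2) */
    static final class Alt implements Node {
        final Node left, right;
        Alt(Node left, Node right) { this.left = left; this.right = right; }
        public Node reverse() { return new Alt(left.reverse(), right.reverse()); }
        public String toRegex() {
            return "(?:" + left.toRegex() + "|" + right.toRegex() + ")";
        }
    }

    /** R* -> reverse(R)*  and  R+ -> reverse(R)+ */
    static final class Repeat implements Node {
        final Node body;
        final char op; // '*' or '+'
        Repeat(Node body, char op) { this.body = body; this.op = op; }
        public Node reverse() { return new Repeat(body.reverse(), op); }
        public String toRegex() { return "(?:" + body.toRegex() + ")" + op; }
    }

    /** ^ and $ swap; combined with Seq this yields the last two rules. */
    static final class Anchor implements Node {
        final boolean atStart;
        Anchor(boolean atStart) { this.atStart = atStart; }
        public Node reverse() { return new Anchor(!atStart); }
        public String toRegex() { return atStart ? "^" : "$"; }
    }

    public static void main(String[] args) {
        // ab*(c|d), built by hand
        Node forward = new Seq(new Atom("a"),
                new Seq(new Repeat(new Atom("b"), '*'),
                        new Alt(new Atom("c"), new Atom("d"))));
        Node backward = forward.reverse();
        System.out.println(forward.toRegex() + "  ->  " + backward.toRegex());
        System.out.println(Pattern.matches(backward.toRegex(), "cbba"));
    }
}
```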

    -- chris
     
    Chris Uppal, Nov 11, 2005
    #13
  14. Chris Uppal wrote:
    > John C. Bollinger wrote:
    >
    >
    >>Regexes do
    >>not easily reverse; as far as I know, there is no general algorithm for
    >>"reversing" a regex, so it is unsurprising that Pattern doesn't know how
    >>to do it.

    >
    >
    > Why do you say that ? Am I missing something, or are you talking about the
    > difficulty of reversing (some of) the Perl-ish extensions to the classic regexp
    > concept ?


    I am talking about reversing regexes expressed in the language supported
    by java.util.regex.Pattern, a language even more feature-laden than
    Perl's regex language.

    > It seems (to me) that reversing a classic regexp is straightforward using the
    > following (rather redundant) transformations on the parsed AST representation
    > of the regexp:
    >
    > reverse( c ) -> c
    > reverse( . ) -> .
    > reverse( [a-b] ) -> [a-b]
    > reverse( R1 | R2 ) -> reverse(R1) | reverse(R2)
    > reverse( R1R2 ) -> reverse(R2) reverse(R1)
    > reverse( ( R ) ) -> ( reverse(R) )
    > reverse( R * ) -> reverse(R) *
    > reverse( R + ) -> reverse(R) +
    > reverse( ^ R) -> reverse(R) $
    > reverse( R $ ) -> ^ reverse(R)
    >
    > (where a, b, and c are characters and R, R1, and R2 are regexps). I have no
    > idea what some of the Perl-ish extensions would even mean if reversed, but
    > "classic" regexps are good enough for me, and perhaps for most (legitimate)
    > applications.


    Your argument for how that particular regex language can be reversed is
    convincing, but you don't have to add much to the language to make the
    task a lot harder. How about reluctant quantifiers, for instance? We
    had a different thread in the past week where the regex engine's
    leftmost matching behavior with respect to reluctant quantifiers
    produced a result that it would be tricky to reproduce working from the
    other end. (This is a regex feature also supported by Perl, and one
    with some good uses.)

    It might be possible (though nontrivial) to solve the problem with
    reluctant quantifiers, but what about greedy ones? Those are too
    closely tied to details of the engine's matching behavior to admit
    reversal, I think.

    --
    John Bollinger
     
    John C. Bollinger, Nov 12, 2005
    #14
  15. Roedy Green Guest

    On Fri, 11 Nov 2005 21:54:41 -0500, "John C. Bollinger"
    <> wrote, quoted or indirectly quoted someone who
    said :

    >I am talking about reversing regexes expressed in the language supported
    >by java.util.regex.Pattern, a language even more feature-laden than
    >Perl's regex language.


    An optimisation does not need to handle all cases. It is perfectly
    legit to cherry pick the easy ones to optimise.
    --
    Canadian Mind Products, Roedy Green.
    http://mindprod.com Java custom programming, consulting and coaching.
     
    Roedy Green, Nov 12, 2005
    #15
  16. Chris Uppal Guest

    John C. Bollinger wrote:

    > > Why do you say that ? Am I missing something, or are you talking about
    > > the difficulty of reversing (some of) the Perl-ish extensions to the
    > > classic regexp concept ?

    >
    > I am talking about reversing regexes expressed in the language supported
    > by java.util.regex.Pattern, a language even more feature-laden than
    > Perl's regex language.


    Right. I tend to ignore the non-classical extensions, myself.


    > Your argument for how that particular regex language can be reversed is
    > convincing, but you don't have to add much to the language to make the
    > task a lot harder. How about reluctant quantifiers, for instance?


    It seems to me that the greedy/reluctant distinction is only about handling
    ambiguous "parses". That's to say that switching between greedy and
    reluctant never changes the language that the regexp recognises, but only
    changes the way that sub-regexps are assigned to substrings in the input
    sequence. (I'm assuming that the regexp is being used to make a simple Boolean
    judgement "does this string match?", the streaming mode is obviously
    different). For instance, given the pattern:
    /a*a*/
    and the input sequence:
    aaaa
    there are various possible parses:
    ()(aaaa)
    (a)(aaa)
    (aa)(aa)
    (aaa)(a)
    (aaaa)()
    and the choice amongst them is determined by the greediness of the matches. I
    haven't been able to prove it (possibly because I lack a precise definition of
    "reluctant", possibly because I can't think straight), but I conjecture that
    inverting the greediness of each subexpression while reversing the pattern
    would produce the "same" parse of the reversed string.
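The first part of this can be checked with java.util.regex directly. The sketch below (class and method names are hypothetical) adds capturing groups to /a*a*/ so the chosen parse becomes visible; for a whole-string match the yes/no answer is the same for greedy and reluctant forms, and only the split between the groups changes:

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class GreedyVsReluctant {
    // Returns "group1|group2" for a whole-string match, or null if
    // the input does not match at all.
    static String parse(String regex, String input) {
        Matcher m = Pattern.compile(regex).matcher(input);
        return m.matches() ? m.group(1) + "|" + m.group(2) : null;
    }

    public static void main(String[] args) {
        System.out.println(parse("(a*)(a*)", "aaaa"));   // greedy
        System.out.println(parse("(a*?)(a*?)", "aaaa")); // reluctant
    }
}
```

The greedy form assigns all four characters to the first group; the reluctant form assigns them to the second. Read with the groups in swapped order, as they would be after reversing the pattern, the reluctant result is the same parse as the greedy one. That is consistent with the conjecture for this one example and proves nothing in general.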

    I have no idea about the possessive quantifier. But then I don't care either --
    it's an aesthetic disaster, a theoretic horror, and I hope (and expect) never
    to see any legitimate use for it.

    Beyond those cases, the only non-classical features of the Java regexp package
    (I think) are: The wide palette of character classes, which can be seen as
    syntactic sugar for character ranges, or even for "raw" alternation, and which
    reverse trivially. The various end-of-word, end-of-line, etc,
    pseudo-characters. The pseudo-characters present a problem if the set is not
    closed under reversal, I don't know whether it is (CR-LF, anyone ?). And the
    backref feature, which obviously doesn't reverse.

    -- chris
     
    Chris Uppal, Nov 14, 2005
    #16
  17. Chris Uppal wrote:
    > John C. Bollinger wrote:
    >>Your argument for how that particular regex language can be reversed is
    >>convincing, but you don't have to add much to the language to make the
    >>task a lot harder. How about reluctant quantifiers, for instance?

    >
    >
    > It seems to me that the greedy/reluctant distinction is only about handling
    > ambiguous "parses". That's to say that switching between greedy and
    > reluctant never changes the language that the regexp recognises, but only
    > changes the way that sub-regexps are assigned to substrings in the input
    > sequence. (I'm assuming that the regexp is being used to make a simple Boolean
    > judgement "does this string match?", the streaming mode is obviously
    > different).


    You're right. I was thinking more about assigning parts of the matched
    string to groups the same way, and not as much about the overall "does
    this string match?", which is the most important aspect of the question
    the OP asked (somewhere back there in the distance...).

    [...]

    > I conjecture that
    > inverting the greediness of each subexpression while reversing the pattern
    > would produce the "same" parse of the reversed string.


    Now that would be a tidy result. A little prodding at it didn't make a
    counterexample fall out, but I'm too lazy to try to really prove it.

    > I have no idea about the possessive quantifier. But then I don't care either --
    > it's an aesthetic disaster, a theoretic horror, and I hope (and expect) never
    > to see any legitimate use for it.


    Do I sense a mild distaste for the feature? :^)
    In truth, I agree with you. I raised the issue only because I think
    they're a certain killer of general-purpose reversal of arbitrary Java
    regex patterns. Perhaps in itself that is reflective of the possessive
    quantifiers' perversion.
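A small illustration of why possessive quantifiers resist naive reversal, using a made-up pattern and hypothetical class and method names. Unlike the greedy/reluctant choice, a possessive quantifier changes which strings match at all, and the effect depends on the direction of the scan:

```java
import java.util.regex.Pattern;

class PossessiveDemo {
    // Forward pattern: [ab]*+ swallows every a and b and never gives
    // any back, so the trailing b can never match.
    static boolean forward(String s) {
        return Pattern.matches("[ab]*+b", s);
    }

    // Naive reversal: same pieces in the opposite order, applied to
    // the reversed string. Here the possessive part comes last, so it
    // no longer starves anything.
    static boolean naiveReversed(String s) {
        String r = new StringBuilder(s).reverse().toString();
        return Pattern.matches("b[ab]*+", r);
    }
}
```

For the input "ab" the forward pattern fails while the naively reversed pattern succeeds on "ba", so the two do not recognise mirror-image languages.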

    > Beyond those cases, the only non-classical features of the Java regexp package
    > (I think) are: The wide palette of character classes, which can be seen as
    > syntactic sugar for character ranges, or even for "raw" alternation, and which
    > reverse trivially. The various end-of-word, end-of-line, etc,
    > pseudo-characters. The pseudo-characters present a problem if the set is not
    > closed under reversal, I don't know whether it is (CR-LF, anyone ?). And the
    > backref feature, which obviously doesn't reverse.


    As I consider it, I think perhaps even a backreference is reversible by
    exchanging the places of the last reference to each group and its
    referent, and fixing up the group indices in the references. I don't
    see a pseudo-character for a CR/LF pair, but such a combination is
    recognized as a single line terminator when not in UNIX_LINES mode. I
    think that could be handled by use of the available individual
    metacharacters for CR and LF. If I'm right (now) then that leaves the
    possessive quantifiers as the only non-reversible feature. With that,
    you have established your point very well indeed.

    --
    John Bollinger
     
    John C. Bollinger, Nov 15, 2005
    #17
  18. Chris Uppal Guest

    John C. Bollinger wrote:

    > > I have no idea about the possessive quantifier. But then I don't care
    > > either -- it's an aesthetic disaster, a theoretic horror, and I hope
    > > (and expect) never to see any legitimate use for it.

    >
    > Do I sense a mild distaste for the feature? :^)
    > In truth, I agree with you. I raised the issue only because I think
    > they're a certain killer of general-purpose reversal of arbitrary Java
    > regex patterns. Perhaps in itself that is reflective of the possessive
    > quantifiers' perversion.


    That's a good way to think of it. Should I ever write a regexp reverser, I
    shall throw a FeatureInsupportableException in such cases ;-)

    BTW, there's also the zero-width-look{ahead/behind} operator. That, if I've
    understood it correctly (Sun, understandably, don't care to admit what it
    actually does) can be handled trivially in the case where the lookahead/behind
    is a fixed string, otherwise it's another job for FeatureInsupportableException
    I fear.
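For the fixed-string case, a sketch with a made-up pattern (class and method names are hypothetical): a lookahead for the literal "bar" after "foo" becomes, on the reversed input, a lookbehind for the reversed literal "rab" before "oof":

```java
import java.util.regex.Pattern;

class LookaroundReversal {
    // Forward: "foo" only when immediately followed by "bar".
    private static final Pattern FORWARD = Pattern.compile("foo(?=bar)");
    // Hand-reversed: "oof" only when immediately preceded by "rab".
    private static final Pattern REVERSED = Pattern.compile("(?<=rab)oof");

    static boolean forwardFind(String s) {
        return FORWARD.matcher(s).find();
    }

    static boolean reversedFind(String s) {
        String r = new StringBuilder(s).reverse().toString();
        return REVERSED.matcher(r).find();
    }
}
```

Both methods agree on inputs such as "xfoobar" and "xfoobaz". A lookahead containing an unbounded sub-pattern would not convert this way, since Java's lookbehind needs a bounded maximum length.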

    -- chris
     
    Chris Uppal, Nov 16, 2005
    #18
