Prefixes of regular expressions

Discussion in 'Java' started by S J Kissane, Jun 29, 2008.

  1. S J Kissane

    S J Kissane Guest

    Hi all

    I was thinking about regular expressions, in the context of
    syntax checking in user interfaces.

    Example use case: I have a form field, with a regex to
    determine if its contents are valid. The user starts typing it
    in... if it's valid, the field goes green. If it's invalid, but by
    typing more they could make it valid, the field goes yellow.
    If it's invalid, and they cannot make it valid by typing more,
    only by removing characters they've already typed, it
    goes red.

    Suppose I have a regular expression defined like this: [0-9A-F]{8}
    Now, suppose the user has typed: "09AB"
    We can see that, although that string does not match the regular
    expression, it could match if the user added to it appropriately.

    By contrast, suppose they typed: "09AG"
    We can see that, no matter what they add,
    it can never be made to match the regular expression;
    the only way of making it match is to remove characters.

    We might say that, although the string does not match the
    regular expression, it is a "valid prefix" of the regular expression.

    Now, the question is, given a regular expression and a string,
    how in Java can I determine if the string is a valid prefix of
    the regular expression? I have looked at the java.util.regex.Matcher
    API in Java SE 6, and I can't see a way of doing this.

    I suppose, if I wrote my own regular expression library
    (or even just started with Sun's one and hacked it),
    I could make this work... but I don't want to do that.

    Thanks
    Simon
    S J Kissane, Jun 29, 2008
    #1

  2. "S J Kissane" <> wrote in message
    news:...
    > Hi all
    >
    > I was thinking about regular expressions, in the context of
    > syntax checking in user interfaces.
    >
    > Example use case: I have a form field, with a regex to
    > determine if its contents is valid. The user starts typing it
    > in... if its valid, the field goes green. If its invalid, but by
    > typing more they could make it valid, the field goes yellow.
    > If its invalid, and they cannot make it valid by typing more,
    > only by removing characters they've already typed, it
    > goes red.
    >
    > Suppose I have a regular expression defined like this: [0-9A-F]{8}
    > Now, suppose the user has typed: "09AB"
    > We can see, although that string does not match the regular
    > expression,
    > it could match if the user added to it appropriately.
    >
    > Comparatively, suppose they typed: "09AG"
    > We can see, that no matter what they possibly add,
    > it can never be made to match the regular expression;
    > the only way of making it match is to remove characters.
    >
    > We might say that, although the string does not match the
    > regular expression, it is a "valid prefix" of the regular expression.
    >
    > Now, the question is, given a regular expression and a string,
    > how in Java can I determine if the string is a valid prefix of
    > the regular expression? I have looked at the java.util.regex.Matcher
    > API in Java SE 6, and I can't see a way of doing this.
    >
    > I suppose, if I wrote my own regular expression library
    > (or even just started with Sun's one and hacked it),
    > I could make this work... but I don't want to do that.
    >
    > Thanks
    > Simon


    Define two regular expressions: one for valid (green), and one for valid
    prefix (yellow). Match the supplied string against the first; if matched,
    display green. If no match, match against the second. If matched, display
    yellow; if no match, display red. You lose nothing in responsiveness with
    two regular expressions - this is a user-typed form field after all. And the
    logic becomes more clear.

    For some situations you could probably use groupCount() on the Matcher
    object, with capturing groups, to distinguish between a valid prefix and a
    valid complete string.
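A minimal sketch of the two-expression approach described above, using the [0-9A-F]{8} example from the thread. The class name, the State enum and the hand-written prefix pattern [0-9A-F]{0,7} are assumptions made for illustration; the prefix pattern has to be kept in step with the complete pattern by hand.

```java
import java.util.regex.Pattern;

class TwoPatternCheck {
    enum State { GREEN, YELLOW, RED }

    // Complete value: exactly eight hex digits (the pattern from the thread).
    private static final Pattern COMPLETE = Pattern.compile("[0-9A-F]{8}");
    // Hand-written prefix pattern (an assumption): zero to seven hex digits.
    private static final Pattern PREFIX = Pattern.compile("[0-9A-F]{0,7}");

    static State classify(String input) {
        if (COMPLETE.matcher(input).matches()) {
            return State.GREEN;   // already valid
        }
        if (PREFIX.matcher(input).matches()) {
            return State.YELLOW;  // not valid yet, but typing more can fix it
        }
        return State.RED;         // only deleting characters can fix it
    }

    public static void main(String[] args) {
        for (String s : new String[] {"09AB", "09AG", "09ABCDEF"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

With the inputs from the thread, "09AB" comes out yellow, "09AG" red, and a full eight-digit value green.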

    AHS
    Arved Sandstrom, Jun 29, 2008
    #2

  3. S J Kissane

    S J Kissane Guest

    On Jun 30, 1:12 am, "Arved Sandstrom" <>
    wrote:
    > Define two regular expressions: one for valid (green), and one for valid
    > prefix (yellow). Match the supplied string against the first; if matched,
    > display green. If no match, match against the second. If matched, display
    > yellow; if no match, display red. You lose nothing in responsiveness with
    > two regular expressions - this is a user-typed form field after all. And the
    > logic becomes more clear.
    >
    > For some situations you could probably use groupCount() on the Matcher
    > object, with capturing groups, to distinguish between a valid prefix and a
    > valid complete string.
    >
    > AHS

    Indeed, such an approach would work. But, logically speaking, I only
    need one regular expression to do this, not two. And by using two, I
    need to manually construct the prefix regex based on the whole string
    regex, when logically the former can be derived from the latter.

    Maybe it's time for a trip to bugs.sun.com... Who knows, I might see
    the functionality I'm after in J2SE 12.0 :)

    Simon
    S J Kissane, Jun 29, 2008
    #3
  4. Roedy Green

    Roedy Green Guest

    On Sun, 29 Jun 2008 06:24:51 -0700 (PDT), S J Kissane
    <> wrote, quoted or indirectly quoted someone who
    said :

    >Now, the question is, given a regular expression and a string,
    >how in Java can I determine if the string is a valid prefix of
    >the regular expression? I have looked at the java.util.regex.Matcher
    >API in Java SE 6, and I can't see a way of doing this.


    The brute force approach is to have a different regex for each length
    of string.

    Back in the days of Java 1.0 I invented a FormattedTextField that
    handled a variety of patterns, where you described each slot with a
    character code,
    e.g. 9 = numeric, A = caps, a = lower case ...
    I had "humps" where you can have decorative punctuation appear, e.g.
    (604) 871-1166, that you don't key, can't change and is not part of the
    final data field.

    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
    Roedy Green, Jun 30, 2008
    #4
  5. Roedy Green

    Roedy Green Guest

    On Mon, 30 Jun 2008 05:18:06 GMT, Roedy Green
    <> wrote, quoted or indirectly quoted
    someone who said :

    >the brute force approach is to have a different regex for each length
    >of string.


    If you look at those regexes, you may be able to create a single
    regex that will work for more than one length, e.g. one that ends with
    [0-9]*, to reduce the total number of them you require.
    --

    Roedy Green Canadian Mind Products
    The Java Glossary
    http://mindprod.com
    Roedy Green, Jun 30, 2008
    #5
  6. David Segall

    David Segall Guest

    S J Kissane <> wrote:

    >Suppose I have a regular expression defined like this: [0-9A-F]{8}
    >Now, the question is, given a regular expression and a string,
    >how in Java can I determine if the string is a valid prefix of
    >the regular expression? I have looked at the java.util.regex.Matcher
    >API in Java SE 6, and I can't see a way of doing this.

    I can see that your putative changes to Matcher provide an elegant
    solution to your problem but they would require changing some return
    values from boolean to something containing more information. Rather
    than altering Java's method(s) or having multiple regular expressions
    to test for your three return values perhaps you could append a valid
    string of the appropriate length to the input as a second test. In
    your example, this approach is worse than using a second regular
    expression to check for a valid prefix but it may provide an easier
    general solution.
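A rough sketch of the padding idea above, for the fixed-length [0-9A-F]{8} example only. The class name, the filler character '0' and the length constant are assumptions made for illustration; the trick relies on knowing a filler that is valid in every remaining position, which will not hold for arbitrary patterns.

```java
import java.util.regex.Pattern;

class PaddingCheck {
    enum State { GREEN, YELLOW, RED }

    private static final Pattern COMPLETE = Pattern.compile("[0-9A-F]{8}");
    private static final int LENGTH = 8;     // assumption: fixed-length field
    private static final char FILLER = '0';  // assumption: valid in every position

    static State classify(String input) {
        if (COMPLETE.matcher(input).matches()) {
            return State.GREEN;
        }
        if (input.length() >= LENGTH) {
            return State.RED;  // nothing left to pad, so typing more cannot help
        }
        // Second test: pad the input out to full length with known-good characters.
        StringBuilder padded = new StringBuilder(input);
        while (padded.length() < LENGTH) {
            padded.append(FILLER);
        }
        return COMPLETE.matcher(padded).matches() ? State.YELLOW : State.RED;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"09AB", "09AG", "09ABCDEF"}) {
            System.out.println(s + " -> " + classify(s));
        }
    }
}
```

The padded string is only a probe: if it matches, the typed text must be a prefix of at least one valid value.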
    David Segall, Jun 30, 2008
    #6
  7. "S J Kissane" <> wrote in message
    news:...
    On Jun 30, 1:12 am, "Arved Sandstrom" <>
    wrote:
    > Define two regular expressions: one for valid (green), and one for valid
    > prefix (yellow). Match the supplied string against the first; if matched,
    > display green. If no match, match against the second. If matched, display
    > yellow; if no match, display red. You lose nothing in responsiveness with
    > two regular expressions - this is a user-typed form field after all. And
    > the
    > logic becomes more clear.
    >
    > For some situations you could probably use groupCount() on the Matcher
    > object, with capturing groups, to distinguish between a valid prefix and a
    > valid complete string.
    >
    > AHS

    Indeed, such an approach would work. But, logically speaking, I only
    need one regular expression to do this, not two. And by using two, I
    need to manually construct the prefix regex based on the whole string
    regex, when logically the former can be derived from the latter.
    [ SNIP ]

    I really don't see you avoiding some non-RE conditional logic at some point.
    If you're not so keen on 2 separate regular expressions, there is always:

    Pattern p = Pattern.compile("([0-9A-F]{1,8})");
    Matcher m = p.matcher(stringToMatch);

    if (m.matches()) {
        // 1 to 8 hex digits matched; the length tells a prefix from a complete value
        int matchLen = m.group(1).length();
        if (matchLen < 8) {
            // do "yellow" stuff
        } else if (matchLen == 8) {
            // do "green" stuff
        }
    } else {
        // do "red" stuff
    }

    AHS
    Arved Sandstrom, Jul 1, 2008
    #7
  8. S J Kissane <> writes:

    > We might say that, although the string does not match the
    > regular expression, it is a "valid prefix" of the regular expression.
    >
    > Now, the question is, given a regular expression and a string,
    > how in Java can I determine if the string is a valid prefix of
    > the regular expression?


    You can't. The Java RegExp library doesn't provide support for
    what you need.
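One thing that may be worth testing before giving up on java.util.regex: Matcher has had a hitEnd() method since Java 5, documented as reporting whether the last match operation hit the end of the input, in which case more input could change the result. The sketch below assumes that a failed matches() followed by hitEnd() == true can be read as "valid prefix". The class and enum names are made up for illustration, and the behaviour should be checked against each real pattern (anchors, lookarounds and the like may not behave as hoped).

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HitEndCheck {
    enum State { GREEN, YELLOW, RED }

    static State classify(Pattern pattern, String input) {
        Matcher m = pattern.matcher(input);
        if (m.matches()) {
            return State.GREEN;  // the whole input matches
        }
        // The match failed. If the engine ran out of input while trying,
        // more characters might still turn this into a match.
        return m.hitEnd() ? State.YELLOW : State.RED;
    }

    public static void main(String[] args) {
        Pattern hex8 = Pattern.compile("[0-9A-F]{8}");
        for (String s : new String[] {"09AB", "09AG", "09ABCDEF"}) {
            System.out.println(s + " -> " + classify(hex8, s));
        }
    }
}
```

This keeps a single regular expression and needs no change to the boolean return type of matches(), since the extra information comes from a second call on the same Matcher.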

    > I suppose, if I wrote my own regular expression library
    > (or even just started with Sun's one and hacked it),
    > I could make this work... but I don't want to do that.


    You could start out with an existing alternative RegExp
    library. Perhaps <URL:http://www.brics.dk/automaton/>, which
    is not a traditional RegExp library, but is closer related
    to the Comp.Sci. notions of regular languages and finite
    automatons.
    However, it does have a prefix operation on automatons:
    <URL:http://www.brics.dk/automaton/doc/dk/brics/automaton/SpecialOperations.html#prefixClose(dk.brics.automaton.Automaton)>

    Good luck.
    /L
    --
    Lasse Reichstein Nielsen
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
    Lasse Reichstein Nielsen, Jul 1, 2008
    #8
