regexp validator - wrong?

Discussion in 'ASP .Net' started by Dmitry Korolyov, Oct 15, 2003.

  1. ASP.NET app using C# and framework version 1.1.4322.573 on an IIS 6.0 web server.

    A single-line asp:textbox control with a regexp validator attached to it.

    ^\d+$ expression does match an empty string (when you don't enter any values) - this is wrong
    d+ expression does not match, for example, the "g24" string - this is also wrong

    The www.regexplib.com test validator works fine for both cases, i.e. it reports "not match" for the first one and "match" for the second one. I suspect regexplib uses a different framework version than I do, and that this is the source of the error. Do you have any other ideas?

    --
    Dmitry Korolyov
    MVP: Windows Server - Active Directory
    Dmitry Korolyov, Oct 15, 2003
    #1

  2. Dmitry Korolyov

    Steve Jansen Guest

    Per MSDN on the RegularExpressionValidator control:

    "Note: Validation succeeds if the input control is empty. If a value is
    required for the associated input control, use a RequiredFieldValidator
    control in addition to the RegularExpressionValidator control."

    This is why it appears ^\d+$ is matched with an empty string.

    Also, "d+" means match one or more "d" characters, which is why it does not
    match "g24". You probably intended "^\w+$", meaning a single-line string
    of word characters only (roughly [a-zA-Z_0-9]; .NET's \w also accepts
    Unicode letters and digits).
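
    A minimal console sketch of the difference, calling System.Text.RegularExpressions directly rather than going through the validator. The class name and sample inputs are only for illustration:

```csharp
using System;
using System.Text.RegularExpressions;

class LiteralVersusDigitClass
{
    static void Main()
    {
        // "d+" looks for the literal letter 'd'; "g24" contains none.
        Console.WriteLine(Regex.IsMatch("g24", "d+"));      // False

        // "\d+" looks for a run of digits anywhere in the input; "24" qualifies.
        Console.WriteLine(Regex.IsMatch("g24", @"\d+"));    // True

        // "^\w+$" requires the whole input to be word characters.
        Console.WriteLine(Regex.IsMatch("g24", @"^\w+$"));  // True

        // The regex engine itself rejects an empty string for "^\d+$";
        // per the MSDN note, the validator passes empty input regardless.
        Console.WriteLine(Regex.IsMatch("", @"^\d+$"));     // False
    }
}
```

    So the empty-string result seen on the page comes from the validator's documented rule, not from the pattern.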

    -Steve Jansen

    Steve Jansen, Oct 15, 2003
    #2

  3. Thanks Steve.

    1) Why does the first example behave the opposite way (to what I see and what you have explained) at www.regexplib.com? They have set up a testing area where you can test various regexps and see whether or not they match the strings you enter.

    2) \d+ means one or more digits. They can be anywhere within the string. This means "g2323" should match the regular expression, but it doesn't (although it does in the testing area of www.regexplib.com and in any other regexp-compatible language). Note that if I wanted a string which contains digits only, I'd use the ^\d+$ regexp.

    So I guess there should be some more ideas...

    P.S. I'm still thinking of the different versions of the framework.

    Dmitry Korolyov [MVP], Oct 15, 2003
    #3
  4. Obviously I made a typo in the initial message: it should be \d+ instead of just d+.

    Dmitry Korolyov [MVP], Oct 15, 2003
    #4
  5. Dmitry Korolyov

    Marina Guest

    1) .NET defines what it means for the regular expression validator to fire. And it ignores the empty string. The documentation says so - and it says to use a required field validator in addition, if your requirements are that the field be filled in.

    How others chose to implement their regular expression validator is irrelevant.

    2) '\d+' means that the entire string is just one or more digits. Not that there exists a substring of the original string with one or more digits.

    The site you keep referring to seems to check whether there is a substring that matches.

    For example, I put in '\d' for the expression, and 'asdf2sdf' for the test string.

    Now, in reality, '\d' should only match a 1 character string that contains a digit. However, my string matched!

    I disagree that this is actually a regular expression match for the string. There is a substring of my string that matches - but not the entire thing. In fact, absolutely anything will match, as long as there is at least one digit somewhere in it.

    So the result from this web site is very misleading, and as far as I am concerned incorrect. If I am validating that someone enters a 5 digit zip code, but something like 'a string 12345' is allowed to match - well, that's just plain wrong!
    Marina, Oct 15, 2003
    #5
  6. Agreed on the first one, if that is how it works by design. Yet it is still wrong from a common-sense standpoint: an empty string does not contain one or more digits, so it should not match the regexp, yet it does.

    "How others chose to implement their regular expression validator is irrelevant." Umm, well, if you say so. I used regexps with Perl for some time before the whole .NET thing was here.

    And I disagree on the second. '\d+' means a string containing one or more digits anywhere. An entire string which contains only digits would be '^\d+$'. Therefore, the regexp is handled incorrectly here as well. You're making wrong assumptions about the entire-string thing. The entire string consists of:
    1. Start of the string. This is the '^' character in regular expression syntax (if placed at the very beginning of the regular expression).
    2. The string itself. This is the '\d+' pattern we use for "one or more digits".
    3. End of the string. This is the '$' character in regular expression syntax (if placed at the very end of the regular expression).
    This is documented in the .NET regexp help, so you can look for yourself.

    In other words, 'asdf2sdf' will match the '\d' regexp, as this is a string which contains a digit. If you need a regexp for a 5-digit zip code, you should use the ^\d\d\d\d\d$ or ^\d{5}$ pattern. The test area at www.regexplib.com returns absolutely correct results - in terms of what is referred to as "regular expressions" by anyone who works with regular expressions. In my initial message I was wondering why my .NET framework gives incorrect results, and my suggestion is that the website uses some different version.
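
    A small console sketch of the anchored versus unanchored behavior described above, using the Regex class directly (not the validator control). The class name is made up for the example; the patterns and test strings are the ones from this thread:

```csharp
using System;
using System.Text.RegularExpressions;

class AnchorDemo
{
    static void Main()
    {
        // Unanchored: succeeds if any substring is a digit.
        Console.WriteLine(Regex.IsMatch("asdf2sdf", @"\d"));             // True

        // Anchored: the five digits must be the entire input.
        Console.WriteLine(Regex.IsMatch("a string 12345", @"^\d{5}$"));  // False
        Console.WriteLine(Regex.IsMatch("12345", @"^\d{5}$"));           // True

        // Without anchors the zip-code pattern accepts surrounding text.
        Console.WriteLine(Regex.IsMatch("a string 12345", @"\d{5}"));    // True
    }
}
```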

    Dmitry Korolyov [MVP], Oct 15, 2003
    #6
  7. Dmitry Korolyov

    mikeb Guest

    Dmitry Korolyov [MVP] wrote:
    > Thanks Steve.
    >
    > 1) Why the first example works the opposite (to what I see and what you
    > have explained) at www.regexplib.com? They
    > have set up a testing area where you can test various regexps and see if
    > they match or not the strings you enter.
    >


    The RegularExpressionValidator is documented to succeed if the input
    control is empty (i.e., the regex is not even run in this case). That
    might not be intuitive for you, but it's how MS decided it should work.

    > 2) \d+ means one or more digits. They can be anywhere within the string.
    > This means "g2323" should match the regular expression, but it doesn't
    > (although it does on the testing area of www.regexplib.com
    > and in any other regexp-compatible language). Note that if I wanted a
    > string which contains digits only, I'd use ^\d+$ regexp.
    >
    > So I guess there should be some more ideas...


    Looking at the IL for the RegularExpressionValidator, it appears that MS
    made an undocumented decision such that it will return a successful
    validation only when the regex matches the entire contents of the control.

    I'd agree with you that this is counter-intuitive, and it appears to be
    undocumented. I'm not sure whether MS would consider this a bug in
    implementation or a bug in documentation. In any case, the behavior
    exists in both current versions of the Framework (1.0 SP2 and 1.1).

    So, if you want your regexes to match with the behavior that MS
    hard-coded for RegularExpressionValidator, the ValidationExpression
    should always be bounded by the ^ and $ characters.
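
    One caveat when bounding a pattern this way: if the pattern contains a top-level alternation, the ^ and $ each attach to only one alternative unless the pattern is grouped first. A rough sketch, where WholeInput is a hypothetical helper and not part of the framework:

```csharp
using System;
using System.Text.RegularExpressions;

class AnchoredPattern
{
    // Hypothetical helper: wraps a pattern so it must span the whole input.
    // The (?: ) group keeps ^ and $ from binding to only one alternative.
    static string WholeInput(string pattern)
    {
        return "^(?:" + pattern + ")$";
    }

    static void Main()
    {
        // Naive anchoring: parsed as (^cat)|(dog$), so "hotdog" slips through.
        Console.WriteLine(Regex.IsMatch("hotdog", "^cat|dog$"));            // True

        // Grouped anchoring rejects it and still accepts the exact word.
        Console.WriteLine(Regex.IsMatch("hotdog", WholeInput("cat|dog")));  // False
        Console.WriteLine(Regex.IsMatch("dog", WholeInput("cat|dog")));     // True
    }
}
```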



    --
    mikeb
    mikeb, Oct 15, 2003
    #7
  8. That's warmer, Mike. But the regexplib website shows absolutely correct behavior - or do they use custom handling?

    Dmitry Korolyov [MVP], Oct 16, 2003
    #8
  9. Dmitry Korolyov

    mikeb Guest

    Dmitry Korolyov [MVP] wrote:
    > That's warmer Mike. But regexplib website shows us absolutely correct
    > behavior - or do they use custom handling?
    >


    There's extra code in the handling of the RegularExpressionValidator.
    Pseudo-code looks something like:

    int len = controlToValidate.Text.Length;

    if (len == 0) {
        // nothing in the control - automatically validated
        return true;
    }

    // Regex.Match returns only the first (leftmost) match
    Match m = regex.Match(controlToValidate.Text);

    // valid only if that match covers the entire text
    if (m.Success && (m.Length == len)) {
        return true;
    }

    return false;
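
    The pseudo-code above can be turned into a small console program to try the patterns from this thread. This is only a sketch of the logic as described here, not the framework's actual source; the IsValid name and its signature are assumptions made for the example:

```csharp
using System;
using System.Text.RegularExpressions;

class ValidatorSketch
{
    // Mirrors the pseudo-code above; name and signature are assumptions.
    static bool IsValid(string text, string pattern)
    {
        if (text.Length == 0)
        {
            return true; // empty input is never tested against the pattern
        }

        Match m = Regex.Match(text, pattern); // first (leftmost) match only
        return m.Success && m.Length == text.Length;
    }

    static void Main()
    {
        Console.WriteLine(IsValid("", @"^\d+$"));        // True  (empty input passes)
        Console.WriteLine(IsValid("g24", @"\d+"));       // False (match "24" is shorter than the input)
        Console.WriteLine(IsValid("24", @"\d+"));        // True
        Console.WriteLine(IsValid("ab", "a|ab"));        // False (first match is just "a")
        Console.WriteLine(IsValid("ab", "^(?:a|ab)$"));  // True
    }
}
```

    The last two lines show that, under this logic, an unanchored pattern can fail even when some other way of matching would cover the whole input, which is one more reason to write the ^ and $ explicitly.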





    --
    mikeb
    mikeb, Oct 17, 2003
    #9