Regexp small question

Discussion in 'Perl Misc' started by Shai, Mar 1, 2005.

  1. Shai

    Shai Guest

    Hi,

    I'm trying to check a string that will not contain some characters and
    bump into some problems. The code is:

    if (!($str =~/((\w+)|(\.)|(\_)|(\-))$/))
    {
    print "\nString: $str contains wrong characters!!!\n";
    }
    else
    {
    print "\nString is OK.\n";
    }

    The string can be composed of the following chars: All letters and
    digits, "-"(minus), "."(dot) and "_"(underscore).

    Any idea how to fix the condition???

    Thanks in advance,
    Shai.
     
    Shai, Mar 1, 2005
    #1
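
    A minimal sketch of why the pattern in the question lets bad strings through, assuming the goal is to validate the whole string: the alternation is anchored only at the end, so one allowed character (or a run of \w) right before the end is enough for a match. The helper name original_accepts and the sample strings are made up for illustration.

```perl
use strict;
use warnings;

# The pattern from the question, unchanged: anchored only at the end.
sub original_accepts {
    my ($str) = @_;
    return $str =~ /((\w+)|(\.)|(\_)|(\-))$/ ? 1 : 0;
}

# Hypothetical sample inputs, chosen only for illustration.
for my $str ('file-1.txt', 'bad!name', 'bad name.txt', 'trailing!') {
    my $verdict = original_accepts($str) ? 'accepted' : 'rejected';
    print "$str: $verdict\n";
}
```

    Only 'trailing!' is rejected here, because its last character is outside the allowed set. 'bad!name' and 'bad name.txt' are accepted, since only their tails are inspected.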

  2. Shai

    Shai Guest

    Thanks,

    It works perfect!!!!!!!!!

    Shai.
     
    Shai, Mar 1, 2005
    #2

  3. Bernard El-Hagin wrote:

    > if ($str =~ m/^[\w.-]+$/) {


    That works but it's also a common idiom to simplify this by inverting
    the char-class and the condition.

    if ($str !~ /[^\w.-]/) {
     
    Brian McCauley, Mar 1, 2005
    #3
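
    The two forms quoted in this post can be put side by side. This is only a sketch: the sub names ok_positive and ok_negated and the sample strings are hypothetical, and the inputs are deliberately non-empty with no trailing newline.

```perl
use strict;
use warnings;

# Positive form: the whole string is one or more allowed characters.
sub ok_positive {
    my ($str) = @_;
    return $str =~ /^[\w.-]+$/ ? 1 : 0;
}

# Negated form: no character outside the allowed set appears anywhere.
sub ok_negated {
    my ($str) = @_;
    return $str !~ /[^\w.-]/ ? 1 : 0;
}

# Hypothetical sample inputs, chosen only for illustration.
for my $str ('file-1.txt', 'bad!name', 'a b') {
    printf "%-12s positive=%d negated=%d\n",
        $str, ok_positive($str), ok_negated($str);
}
```

    Both checks agree on these three inputs: 'file-1.txt' passes, the other two fail.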
  4. Anno Siegel

    Anno Siegel Guest

    Bernard El-Hagin <> wrote in comp.lang.perl.misc:
    > Brian McCauley <> wrote:
    >
    > > Bernard El-Hagin wrote:
    > >
    > >> if ($str =~ m/^[\w.-]+$/) {

    > >
    > > That works but it's also a common idiom to simplify this by
    > > inverting the char-class and the condition.
    > >
    > > if ($str !~ /[^\w.-]/) {

    >
    >
    > Really? That's a common idiom? Personally, I think that is absolutely
    > horrid. I would *never* use it and I most certainly wouldn't call it a
    > simplification. I guess it's a matter of what one is used to, but
    > inverting *two* things to get a result one can get without inverting
    > *any*thing seems...perverse to me. :)


    I agree with Brian here (hey, it happens :). I find it perfectly natural
    to go from "consists entirely of ..." to "contains nothing outside of ...".
    Since the latter doesn't need anchoring and a quantifier, I often prefer
    it.

    Anno
     
    Anno Siegel, Mar 1, 2005
    #4
  5. "Bernard El-Hagin" <> writes:
    > Brian McCauley <> wrote:
    >
    > > Bernard El-Hagin wrote:
    > >
    > >> if ($str =~ m/^[\w.-]+$/) {

    > >
    > > That works but it's also a common idiom to simplify this by
    > > inverting the char-class and the condition.
    > >
    > > if ($str !~ /[^\w.-]/) {

    >
    >
    > Really? That's a common idiom? Personally, I think that is absolutely
    > horrid. I would *never* use it and I most certainly wouldn't call it a
    > simplification. I guess it's a matter of what one is used to, but
    > inverting *two* things to get a result one can get without inverting
    > *any*thing seems...perverse to me. :)


    You can switch the following clauses if the "!~" offends you:

    if ($str =~ /[^\w.-]/) {
    # bad string
    } else {
    # good string
    }

    (To me, it _is_ a simplification in that the ^...+$ makes the other
    construction more error-prone.)

    However, what about empty strings? The two constructions don't treat
    empty strings the same way. Replacing the '+' with '*' would make them
    equivalent.
     
    Arndt Jonasson, Mar 1, 2005
    #5
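
    The empty-string difference described in this post can be checked directly. A small sketch; the variable names are just for illustration.

```perl
use strict;
use warnings;

my $empty = '';

# '+' requires at least one allowed character, so the empty string fails.
my $plus = $empty =~ /^[\w.-]+$/ ? 1 : 0;

# '*' allows zero characters, so the empty string passes.
my $star = $empty =~ /^[\w.-]*$/ ? 1 : 0;

# The negated form finds no forbidden character, so the empty string passes.
my $negated = $empty !~ /[^\w.-]/ ? 1 : 0;

print "plus=$plus star=$star negated=$negated\n";
# prints: plus=0 star=1 negated=1
```

    Which answer is right for the empty string depends on the requirement; the point is only that the '+' form and the negated form disagree on it.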
  6. Anno Siegel wrote:

    > Bernard El-Hagin <> wrote in comp.lang.perl.misc:
    >
    >>Brian McCauley <> wrote:
    >>
    >>
    >>>Bernard El-Hagin wrote:
    >>>
    >>>
    >>>>if ($str =~ m/^[\w.-]+$/) {
    >>>
    >>>That works but it's also a common idiom to simplify this by
    >>>inverting the char-class and the condition.
    >>>
    >>> if ($str !~ /[^\w.-]/) {

    >>
    >>
    >>Really? That's a common idiom? Personally, I think that is absolutely
    >>horrid. I would *never* use it and I most certainly wouldn't call it a
    >>simplification. I guess it's a matter of what one is used to, but
    >>inverting *two* things to get a result one can get without inverting
    >>*any*thing seems...perverse to me. :)

    >
    >
    > I agree with Brian here (hey, it happens :). I find it perfectly natural
    > to go from "consists entirely of ..." to "contains nothing outside of ...".
    > Since the latter doesn't need anchoring and a quantifier, I often prefer
    > it.


    The OP's problem was expressed as "check a string that will not contain
    some characters", so, in fact, it is Bernard's solution that is a double
    inversion relative to the OP.
     
    Brian McCauley, Mar 1, 2005
    #6
  7. Arndt Jonasson wrote:

    > "Bernard El-Hagin" <> writes:
    >
    >>Brian McCauley <> wrote:
    >>
    >>
    >>>Bernard El-Hagin wrote:
    >>>
    >>>
    >>>>if ($str =~ m/^[\w.-]+$/) {
    >>>
    >>>That works but it's also a common idiom to simplify this by
    >>>inverting the char-class and the condition.
    >>>
    >>> if ($str !~ /[^\w.-]/) {


    > However, what about empty strings? The two constructions don't treat
    > empty strings the same way. Replacing the '+' with '*' would make them
    > equivalent.


    Yes, I hadn't spotted that Bernard's solution did the wrong thing with
    respect to empty strings. For that matter it also does the wrong thing
    with respect to strings with a terminal newline.
     
    Brian McCauley, Mar 1, 2005
    #7
  8. Bernard El-Hagin wrote:

    > Brian McCauley <> wrote:
    >
    >>Yes, I hadn't spotted that Bernard's solution did the wrong thing
    >>with respect to empty strings. For that matter it also does the
    >>wrong thing with respect to strings with a terminal newline.

    >
    > The OP stated that he wants to identify strings which contain *only* \w
    > . and -. My solution will not match an empty string (since it doesn't
    > contain any of those characters)


    The null string may not contain any of those characters but in a strict
    logical sense it _does_ contain *only* those characters. And when it
    comes to programming it pays to express things strictly.

    > nor will it match a string with a
    > newline (terminal or otherwise) since a newline is *not* one of \w, .
    > or -.


    You didn't test this, did you?

    $ perl -e'print "Bernard is wrong\n" if "A\n" =~ m/^[\w.-]+$/'
    Bernard is wrong
     
    Brian McCauley, Mar 1, 2005
    #8
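
    The one-liner above works because '$' in a Perl regex also matches just before a newline at the end of the string. If the positive form is preferred, one possible fix, not proposed anywhere in this thread, is to anchor with \A and \z, which match only at the very start and very end. A sketch with illustrative variable names:

```perl
use strict;
use warnings;

my $str = "A\n";

# '$' also matches just before a newline at the end of the string.
my $dollar = $str =~ /^[\w.-]+$/ ? 1 : 0;

# '\z' matches only at the very end, so the trailing newline is not skipped.
my $strict = $str =~ /\A[\w.-]+\z/ ? 1 : 0;

# The negated class treats the newline as a forbidden character.
my $negated = $str !~ /[^\w.-]/ ? 1 : 0;

print "dollar=$dollar strict=$strict negated=$negated\n";
# prints: dollar=1 strict=0 negated=0
```

    With \z the positive form rejects "A\n", which matches what the negated form does.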
  9. Bernard El-Hagin wrote:

    > Brian McCauley <> wrote:
    >
    >>The OP's problem was expressed as "check a string that will not
    >>contain some characters", so, in fact, it is Bernard's solution
    >>that is a double inversion relative to the OP.

    >
    > The OP said
    >
    > "The string can be composed of the following chars: All letters and
    > digits, "-"(minus), "."(dot) and "_"(underscore).


    Yes, you are right: the OP first expressed the general problem one way,
    then expressed a particular example of the problem the other way round.

    Since the OP had already noted the equivalence of the two ways of
    expressing the problem, neither solution could be considered a double
    inversion of the OP.

    I stand corrected.
     
    Brian McCauley, Mar 1, 2005
    #9
