Regular expression issue

Discussion in 'Python' started by genxtech, Aug 8, 2010.

  1. genxtech

    genxtech Guest

    I am trying to learn regular expressions in python3 and have an issue
    with one of the examples I'm working with.
    The code is:

    #! /usr/bin/env python3

    import re

    search_string = "[^aeiou]y$"
    print()

    in_string = 'vacancy'
    if re.search(search_string, in_string) != None:
        print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
    else:
        print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
    print()

    in_string = 'boy'
    if re.search(search_string, in_string) != None:
        print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
    else:
        print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
    print()

    in_string = 'day'
    if re.search(search_string, in_string) != None:
        print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
    else:
        print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
    print()

    in_string = 'pita'
    if re.search(search_string, in_string) != None:
        print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
    else:
        print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
    print()

    The output that I am getting is:
    ay, ey, iy, oy and uy are not at the end of vacancy.
    ay, ey, iy, oy or uy were found at the end of boy.
    ay, ey, iy, oy or uy were found at the end of day.
    ay, ey, iy, oy or uy were found at the end of pita.

    The last line of the output is the opposite of what I expected to see,
    and I'm having trouble figuring out what the issue is. Any help would
    be greatly appreciated.
     
    genxtech, Aug 8, 2010
    #1

  2. On Monday 09 August 2010, it occurred to genxtech to exclaim:
    > I am trying to learn regular expressions in python3 and have an issue
    > with one of the examples I'm working with.
    > The code is:
    >
    > #! /usr/bin/env python3
    >
    > import re
    >
    > search_string = "[^aeiou]y$"


    To translate this expression to English:

    a character that is not a, e, i, o, or u, followed by the character 'y', at
    the end of the string (or just before a trailing newline).

    "vacancy" matches. It ends with "c" (not one of aeiou), followed by "y".

    "pita" does not match: it does not end with "y".
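
    As a quick check of the translation above, here is a minimal sketch that runs the pattern from the original post against the four words in the thread. The names `pattern`, `words` and `results` are chosen for illustration only.

```python
import re

# Pattern from the original post: a non-vowel, then 'y', then end of string.
pattern = re.compile(r"[^aeiou]y$")

# The four words tested in the thread.
words = ["vacancy", "boy", "day", "pita"]

# re.search returns a match object on success and None on failure.
results = {word: pattern.search(word) is not None for word in words}

for word, matched in results.items():
    print(f"{word!r}: {'match' if matched else 'no match'}")
```

    Only 'vacancy' matches; 'boy', 'day' and 'pita' all give None, but not all for the same reason.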


     
    Thomas Jollans, Aug 8, 2010
    #2

  3. MRAB

    MRAB Guest

    genxtech wrote:
    > I am trying to learn regular expressions in python3 and have an issue
    > with one of the examples I'm working with.
    > The code is:
    >
    > #! /usr/bin/env python3
    >
    > import re
    >
    > search_string = "[^aeiou]y$"


    You can think of this as: a non-vowel followed by a 'y', then the end of
    the string.

    > print()
    >
    > in_string = 'vacancy'
    > if re.search(search_string, in_string) != None:
    >     print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
    > else:
    >     print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))


    Matches because 'c' is a non-vowel, 'y' matches, and then the end of the
    string.

    > print()
    >
    > in_string = 'boy'
    > if re.search(search_string, in_string) != None:
    >     print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
    > else:
    >     print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))


    Doesn't match because 'o' is a vowel, not a non-vowel.

    > print()
    >
    > in_string = 'day'
    > if re.search(search_string, in_string) != None:
    >     print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
    > else:
    >     print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))


    Doesn't match because 'a' is a vowel, not a non-vowel.

    > print()
    >
    > in_string = 'pita'
    > if re.search(search_string, in_string) != None:
    >     print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
    > else:
    >     print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))


    Doesn't match because 't' is a non-vowel but 'a' doesn't match 'y'.

     
    MRAB, Aug 8, 2010
    #3
  4. Chris Rebert

    Chris Rebert Guest

    On Sun, Aug 8, 2010 at 3:32 PM, Thomas Jollans wrote:
    > On Monday 09 August 2010, it occurred to genxtech to exclaim:
    >> I am trying to learn regular expressions in python3 and have an issue
    >> with one of the examples I'm working with.
    >> The code is:
    >>
    >> #! /usr/bin/env python3
    >>
    >> import re
    >>
    >> search_string = "[^aeiou]y$"

    >
    > To translate this expression to English:
    >
    > a character that is not a, e, i, o, or u, followed by the character 'y', at
    > the end of the line.
    >
    > "vacancy" matches. It ends with "c" (not one of aeiou), followed by "y"
    >
    > "pita" does not match: it does not end with "y".


    Or in other words, the regex will not match when:
    - the string ends in "ay", "ey", "iy", "oy", or "uy"
    - the string doesn't end in "y"
    - the string is less than 2 characters long

    So, the program has a logic error in its assumptions. A non-match
    *doesn't* imply that a string ends in one of the aforementioned pairs;
    the other possibilities have been overlooked.
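
    The three non-match cases listed above can be told apart with a small helper. This is only a sketch: the function name `explain` and its messages are made up for illustration, and it assumes plain words with no trailing newline.

```python
import re

# Pattern from the original post.
pattern = re.compile(r"[^aeiou]y$")

def explain(word):
    """Return a short reason why the pattern does or does not match word."""
    if pattern.search(word):
        return "match: non-vowel followed by final 'y'"
    if len(word) < 2:
        return "no match: fewer than 2 characters"
    if not word.endswith("y"):
        return "no match: does not end in 'y'"
    # Two or more characters, ends in 'y', yet no match:
    # the character before the 'y' must be one of a, e, i, o, u.
    return "no match: ends in a vowel followed by 'y'"

for word in ("vacancy", "boy", "day", "pita", "y"):
    print(word, "->", explain(word))
```

    'boy' and 'day' fall into the first case, 'pita' into the second, and a one-letter string such as 'y' into the third.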

    May I suggest instead using the much more straightforward
    `search_string = "[aeiou]y$"` and then swapping your conditions
    around? The double-negative sort of style the program is currently
    using is (as you've just experienced) harder to reason about and thus
    more error-prone.
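
    One way the suggested change might look, as a sketch rather than the only possible rewrite. The helper name `describe` is an assumption for illustration; the messages are taken from the original program.

```python
import re

# Positive test, as suggested: a vowel followed by 'y' at the end of the string.
vowel_y = re.compile(r"[aeiou]y$")

def describe(word):
    """Report whether word ends in ay, ey, iy, oy or uy."""
    if vowel_y.search(word) is not None:
        return "ay, ey, iy, oy or uy were found at the end of {0}.".format(word)
    return "ay, ey, iy, oy and uy are not at the end of {0}.".format(word)

for word in ("vacancy", "boy", "day", "pita"):
    print(describe(word))
```

    With the positive pattern, a non-match really does mean that none of the five pairs is at the end, so 'pita' is now reported correctly.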

    Cheers,
    Chris
    --
    http://blog.rebertia.com
     
    Chris Rebert, Aug 8, 2010
    #4
  5. Tim Chase

    Tim Chase Guest

    On 08/08/10 17:20, genxtech wrote:
    > if re.search(search_string, in_string) != None:


    While the other responses have addressed some of the big issues,
    it's also good to use

    if thing_to_test is None:

    or

    if thing_to_test is not None:

    instead of "== None" or "!= None".
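
    A small sketch of why the identity test is the safer habit: `==` can be overridden by a class's `__eq__` method, while `is` cannot. The `AlwaysEqual` class below is hypothetical and exists only to illustrate the point.

```python
class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True

thing = AlwaysEqual()

equals_none = (thing == None)  # __eq__ is consulted, so this is True
is_none = (thing is None)      # identity check, cannot be overridden: False

print(equals_none, is_none)
```

    For `re.search` specifically the result is either a match object or None, and match objects are always truthy, so a plain `if re.search(...):` also works.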

    -tkc
     
    Tim Chase, Aug 9, 2010
    #5
  6. genxtech

    genxtech Guest

    On Aug 8, 7:34 pm, Tim Chase wrote:
    > On 08/08/10 17:20, genxtech wrote:
    >
    > > if re.search(search_string, in_string) != None:

    >
    > While the other responses have addressed some of the big issues,
    > it's also good to use
    >
    >    if thing_to_test is None:
    >
    > or
    >
    >    if thing_to_test is not None:
    >
    > instead of "== None" or "!= None".
    >
    > -tkc


    I would like to thank all of you for your responses. I understand
    what the regular expression means, and am aware of the double negative
    nature of the test. I guess what I am really getting at is why the
    last test returns a value of None, and even when using the syntax
    suggested in this quoted solution, the code for the last test is doing
    the opposite of the previous 2 tests that also returned a value of
    None. I hope this makes sense and clarifies what I am trying to ask.
    Thanks
     
    genxtech, Aug 9, 2010
    #6
  7. MRAB

    MRAB Guest

    genxtech wrote:
    > On Aug 8, 7:34 pm, Tim Chase wrote:
    >> On 08/08/10 17:20, genxtech wrote:
    >>
    >>> if re.search(search_string, in_string) != None:

    >> While the other responses have addressed some of the big issues,
    >> it's also good to use
    >>
    >> if thing_to_test is None:
    >>
    >> or
    >>
    >> if thing_to_test is not None:
    >>
    >> instead of "== None" or "!= None".
    >>
    >> -tkc

    >
    > I would like to thank all of you for your responses. I understand
    > what the regular expression means, and am aware of the double negative
    > nature of the test. I guess what I am really getting at is why the
    > last test returns a value of None, and even when using the syntax
    > suggested in this quoted solution, the code for the last test is doing
    > the opposite of the previous 2 tests that also returned a value of
    > None. I hope this makes sense and clarifies what I am trying to ask.
    >

    It returns None because it doesn't match.

    Why doesn't it match?

    Because the regex wants the last character to be a 'y', but it isn't;
    it's an 'a'.
     
    MRAB, Aug 9, 2010
    #7
  8. nn

    nn Guest

    On Aug 9, 9:18 am, genxtech wrote:
    > On Aug 8, 7:34 pm, Tim Chase wrote:
    > > On 08/08/10 17:20, genxtech wrote:
    > > > if re.search(search_string, in_string) != None:
    > >
    > > While the other responses have addressed some of the big issues,
    > > it's also good to use
    > >
    > >    if thing_to_test is None:
    > >
    > > or
    > >
    > >    if thing_to_test is not None:
    > >
    > > instead of "== None" or "!= None".
    > >
    > > -tkc
    >
    > I would like to thank all of you for your responses.  I understand
    > what the regular expression means, and am aware of the double negative
    > nature of the test.  I guess what I am really getting at is why the
    > last test returns a value of None, and even when using the syntax
    > suggested in this quoted solution, the code for the last test is doing
    > the opposite of the previous 2 tests that also returned a value of
    > None.  I hope this makes sense and clarifies what I am trying to ask.
    > Thanks



    First: you understand the regular expression and the double negative,
    but not both of them together; otherwise you would not be asking here.
    The reason for suggesting a refactoring is that down the road you or
    somebody else doing maintenance will have to read the code again. Good
    books avoid confusing grammar; likewise, good programs avoid confusing
    logic.

    Second: the root of your problem is the mistaken belief that P=>Q
    implies (not P)=>(not Q). This is not so. Let me give an example: if
    you say "if it rains" then "the ground is wet", that does not imply "if
    it doesn't rain" then "the ground is not wet". You could be watering
    the plants, for instance. Saying "if the word finishes with a consonant
    and a y" then "ay, ey, iy, oy and uy are not at the end of the word"
    does not imply that "if the word does not finish with a consonant and
    a y" then "ay, ey, iy, oy or uy were found at the end of the word".
    The word could end in x, for instance.
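
    The same point can be checked mechanically. In this sketch P is "the regex from the original post matches" and Q is "the word does not end in ay, ey, iy, oy or uy"; the function names `p` and `q` and the word list are chosen only for illustration.

```python
import re

pattern = re.compile(r"[^aeiou]y$")
VOWEL_Y = ("ay", "ey", "iy", "oy", "uy")

def p(word):
    """P: the word ends in a non-vowel followed by 'y'."""
    return pattern.search(word) is not None

def q(word):
    """Q: the word does not end in ay, ey, iy, oy or uy."""
    return not word.endswith(VOWEL_Y)

words = ["vacancy", "boy", "day", "pita", "x"]

# P => Q: every word that satisfies P also satisfies Q.
p_implies_q = all(q(w) for w in words if p(w))

# (not P) => (not Q) would need this list to be empty; it is not.
counterexamples = [w for w in words if not p(w) and q(w)]

print(p_implies_q, counterexamples)
```

    'pita' and 'x' fail P yet still satisfy Q, which is exactly the case the original program reported incorrectly.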

    I hope I didn't make it more confusing, otherwise other people will
    probably chime in to make it clear to you.
     
    nn, Aug 9, 2010
    #8
  9. genxtech

    genxtech Guest

    I have it now. Had to beat my head over it a couple times. Thanks
    everybody.
     
    genxtech, Aug 9, 2010
    #9