Regular expression issue

genxtech

I am trying to learn regular expressions in python3 and have an issue
with one of the examples I'm working with.
The code is:

#! /usr/bin/env python3

import re

search_string = "[^aeiou]y$"
print()

in_string = 'vacancy'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
print()

in_string = 'boy'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
print()

in_string = 'day'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
print()

in_string = 'pita'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))
print()

The output that I am getting is:
ay, ey, iy, oy and uy are not at the end of vacancy.
ay, ey, iy, oy or uy were found at the end of boy.
ay, ey, iy, oy or uy were found at the end of day.
ay, ey, iy, oy or uy were found at the end of pita.

The last line of the output is the opposite of what I expected to see,
and I'm having trouble figuring out what the issue is. Any help would
be greatly appreciated.
 
Thomas Jollans

I am trying to learn regular expressions in python3 and have an issue
with one of the examples I'm working with.
The code is:

#! /usr/bin/env python3

import re

search_string = "[^aeiou]y$"

To translate this expression to English:

a character that is not a, e, i, o, or u, followed by the character 'y', at
the end of the line.

"vacancy" matches: it ends with "c" (not one of aeiou), followed by "y".

"pita" does not match: it does not end with "y".
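
A quick way to check this reading is to run the pattern over the four words from the original post. This is only a minimal sketch; the names `words` and `results` are chosen for illustration and are not part of the original program.

```python
import re

pattern = re.compile(r"[^aeiou]y$")

# The four test words from the original post.
words = ["vacancy", "boy", "day", "pita"]

# re.search returns a match object on success and None on failure.
results = {word: pattern.search(word) is not None for word in words}

for word, matched in results.items():
    print(f"{word!r}: {'match' if matched else 'no match'}")
```

Only "vacancy" should be reported as a match; the other three fail for different reasons.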
 
MRAB

genxtech said:
I am trying to learn regular expressions in python3 and have an issue
with one of the examples I'm working with.
The code is:

#! /usr/bin/env python3

import re

search_string = "[^aeiou]y$"

You can think of this as: a non-vowel followed by a 'y', then the end of
the string.
print()

in_string = 'vacancy'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))

Matches because 'c' is a non-vowel, 'y' matches, and then the end of the
string.
print()

in_string = 'boy'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))

Doesn't match because 'o' is a vowel, not a non-vowel.
print()

in_string = 'day'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))

Doesn't match because 'a' is a vowel, not a non-vowel.
print()

in_string = 'pita'
if re.search(search_string, in_string) != None:
    print(" ay, ey, iy, oy and uy are not at the end of {0}.".format(in_string))
else:
    print(" ay, ey, iy, oy or uy were found at the end of {0}.".format(in_string))

Doesn't match because 't' is a non-vowel but 'a' doesn't match 'y'.
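
The same walk-through can be spelled out without a regex, which makes the failing step explicit for each word. The sketch below is an illustration only: `explain` and `VOWELS` are hypothetical names, and it assumes plain words with no trailing newline (`$` can also match just before a final newline, which this hand-written rule ignores).

```python
import re

VOWELS = "aeiou"

def explain(word):
    """Say why `[^aeiou]y$` does or does not match `word` (hypothetical helper)."""
    if len(word) < 2:
        return "no match: fewer than two characters"
    if word[-1] != "y":
        return f"no match: last character is {word[-1]!r}, not 'y'"
    if word[-2] in VOWELS:
        return f"no match: {word[-2]!r} before the 'y' is a vowel"
    return f"match: {word[-2]!r} is a non-vowel followed by 'y'"

for word in ["vacancy", "boy", "day", "pita"]:
    # Cross-check the hand-written rule against the real regex.
    assert explain(word).startswith("match") == bool(re.search(r"[^aeiou]y$", word))
    print(word, "->", explain(word))
```

Note that "boy" and "day" fail at the vowel check, while "pita" already fails one step earlier, at the check for the final 'y'.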
 
Chris Rebert

I am trying to learn regular expressions in python3 and have an issue
with one of the examples I'm working with.
The code is:

#! /usr/bin/env python3

import re

search_string = "[^aeiou]y$"

To translate this expression to English:

a character that is not a, e, i, o, or u, followed by the character 'y', at
the end of the line.

"vacancy" matches: it ends with "c" (not one of aeiou), followed by "y".

"pita" does not match: it does not end with "y".

Or in other words, the regex will not match when:
- the string ends in "ay", "ey", "iy", "oy", or "uy"
- the string doesn't end in "y"
- the string is fewer than 2 characters long

So, the program has a logic error in its assumptions. A non-match
*doesn't* imply that a string ends in one of the aforementioned pairs;
the other possibilities have been overlooked.

May I suggest instead using the much more straightforward
`search_string = "[aeiou]y$"` and then swapping your conditions
around? The double-negative sort of style the program is currently
using is (as you've just experienced) harder to reason about and thus
more error-prone.
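
A minimal sketch of that suggestion might look like the following. The helper name `describe` is made up for illustration, and the message wording is copied from the original program; treat it as one possible arrangement rather than the only one.

```python
import re

# Positive test, as suggested: a vowel followed by 'y' at the end.
vowel_y = re.compile(r"[aeiou]y$")

def describe(word):
    # Hypothetical helper; a match now directly means "vowel + y found".
    if vowel_y.search(word) is not None:
        return " ay, ey, iy, oy or uy were found at the end of {0}.".format(word)
    return " ay, ey, iy, oy and uy are not at the end of {0}.".format(word)

for word in ["vacancy", "boy", "day", "pita"]:
    print(describe(word))
```

With the test phrased this way, "pita" falls into the "are not at the end" branch, which is what the original post expected.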

Cheers,
Chris
 
Tim Chase

if re.search(search_string, in_string) != None:

While the other responses have addressed some of the big issues,
it's also good to use

if thing_to_test is None:

or

if thing_to_test is not None:

instead of "== None" or "!= None".
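
One reason often given for this advice (an assumption here, since the post does not spell it out) is that `==` can be overridden by a class's `__eq__`, whereas `is` always compares identity. The class `AlwaysEqual` below is hypothetical and exists only to show the difference.

```python
import re

class AlwaysEqual:
    """Hypothetical class whose __eq__ claims equality with everything."""
    def __eq__(self, other):
        return True

thing = AlwaysEqual()
print(thing == None)   # True, because __eq__ says so
print(thing is None)   # False, thing is not the None object

# Applied to the test from the original post:
match = re.search(r"[^aeiou]y$", "vacancy")
if match is not None:
    print("matched:", match.group())
```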

-tkc
 
genxtech

While the other responses have addressed some of the big issues,
it's also good to use

   if thing_to_test is None:

or

   if thing_to_test is not None:

instead of "== None" or "!= None".

-tkc

I would like to thank all of you for your responses. I understand
what the regular expression means, and am aware of the double negative
nature of the test. I guess what I am really getting at is why the
last test returns a value of None, and even when using the syntax
suggested in this quoted solution, the code for the last test is doing
the opposite of the previous 2 tests that also returned a value of
None. I hope this makes sense and clarifies what I am trying to ask.
Thanks
 
MRAB

genxtech said:
I would like to thank all of you for your responses. I understand
what the regular expression means, and am aware of the double negative
nature of the test. I guess what I am really getting at is why the
last test returns a value of None, and even when using the syntax
suggested in this quoted solution, the code for the last test is doing
the opposite of the previous 2 tests that also returned a value of
None. I hope this makes sense and clarifies what I am trying to ask.
It returns None because it doesn't match.

Why doesn't it match?

Because the regex wants the last character to be a 'y', but it isn't;
it's an 'a'.
 
nn

I would like to thank all of you for your responses.  I understand
what the regular expression means, and am aware of the double negative
nature of the test.  I guess what I am really getting at is why the
last test returns a value of None, and even when using the syntax
suggested in this quoted solution, the code for the last test is doing
the opposite of the previous 2 tests that also returned a value of
None.  I hope this makes sense and clarifies what I am trying to ask.
Thanks


First: You understand the regular expression and the double negative,
but not both of them together; otherwise you would not be asking here.
The point of the suggestion to refactor the code is that, down the road,
you or somebody else doing maintenance will have to read it again. Good
books avoid confusing grammar; likewise, good programs avoid confusing
logic.

Second: the root of your problem is the mistaken belief that P=>Q
implies (not P)=>(not Q). This is not so. Let me give an example: if
you say "if it rains" then "the ground is wet", that does not imply "if
it doesn't rain" then "the ground is not wet". You could be watering
the plants, for instance. Saying "if the word finishes with a consonant
and a y" then "ay, ey, iy, oy and uy are not at the end of the word"
does not imply that "if the word does not finish with a consonant and
a y" then "ay, ey, iy, oy or uy were found at the end of the word".
The word could end in x, for instance.
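
The same point can be checked mechanically on the four words from the thread. In this sketch, `p` and `q` are hypothetical helper names standing for the two statements above; it is an illustration, not part of the original program.

```python
import re

def p(word):
    # P: the word ends in a non-vowel followed by 'y'.
    return re.search(r"[^aeiou]y$", word) is not None

def q(word):
    # Q: the word does not end in ay, ey, iy, oy or uy.
    return re.search(r"[aeiou]y$", word) is None

words = ["vacancy", "boy", "day", "pita"]

# P => Q holds for every word: whenever P is true, Q is true.
print(all(q(w) for w in words if p(w)))

# (not P) => (not Q) fails: these words have P false but Q still true.
counterexamples = [w for w in words if not p(w) and q(w)]
print(counterexamples)
```

The counterexample list contains "pita": P is false for it, yet Q is still true, which is exactly the case the original program overlooked.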

I hope I didn't make it more confusing; if I did, other people will
probably chime in to make it clear to you.
 
