import re
s1 = "I am an american"
s2 = "I am american an "
for s in [s1, s2]:
    print(re.findall(" (am|an) ", s))
# Results:
# ['am']
# ['am', 'an']
Does it help if you expand your RE to its full expression, with '_'s
where the blanks go:
"_am_" or "_an_"
Now look for these in "I_am_an_american". After the first "_am_" is
processed, findall picks up at the leading 'a' of 'an', and there is
no leading blank, so no match. If you search through
"I_am_american_an_", both "am" and "an" have surrounding spaces, so
both match.
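To see the consumed space directly, here is a small sketch (not from the original post) that uses re.finditer to print where each match starts and ends. The variable names are just for illustration.

```python
import re

text = "I am an american"
pattern = re.compile(" (am|an) ")

# finditer yields the same non-overlapping matches that findall uses,
# but as match objects, so the start and end offsets are visible.
spans = [(m.start(), m.end(), m.group(0)) for m in pattern.finditer(text)]
print(spans)

# The only match ends at offset 5, so scanning resumes at the 'a' of "an".
# The space at offset 4 was already consumed as the trailing space of " am ".
print(repr(text[4]), repr(text[5]))
```

The single span covers offsets 1 through 5, which is " am " including both spaces, so nothing is left to serve as the leading space of " an ".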
Instead of using explicit spaces, try using '\b', which matches at a word
boundary without consuming any characters (note the raw strings):
>>> import re
>>> re.findall(r"\b(am|an)\b", "I am an american")
['am', 'an']
>>> re.findall(r"\b(am|an)\b", "I am american an")
['am', 'an']
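If you really do want to require literal spaces rather than word boundaries, one alternative (a different technique from '\b', offered only as a sketch) is a lookahead, so the trailing space is checked but not consumed. It is run here on the two strings from the original question.

```python
import re

# Lookahead variant: (?= ) asserts that a space follows, without consuming it,
# so that space is still available as the leading space of the next match.
pattern = re.compile(r" (am|an)(?= )")

results = [pattern.findall(s) for s in ["I am an american", "I am american an "]]
for r in results:
    print(r)
```

Unlike '\b', this still needs a real space on each side, so it will not match a word at the very start or end of the string or one next to punctuation.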
-- Paul
Your find pattern includes (and consumes) a leading AND trailing space
around each word. In the first string "I am an american", there is a
leading and trailing space around "am", but the trailing space for
"am" is the leading space for "an", so " an "