(1) Every regex library's match() starts matching from the beginning of
the string (unless of course there's an arg for an explicit starting
position) -- where else would it start?
(2) This has absolutely zero relevance to the "match whole string or
not" question.
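To make points (1) and (2) concrete, here's a small sketch (the pattern and sample string are made up for illustration): match() is pinned to the start position, search() isn't, and neither one cares whether anything is left over at the end.

```python
import re

pat = re.compile("b+")      # illustrative pattern, not from the thread
text = "abbbc"

# match() only tries at the start position, so a leading "a" defeats it ...
no_match = pat.match(text)

# ... whereas search() scans forward for the first place the pattern fits.
found = pat.search(text)

# A compiled pattern's match() accepts an explicit starting position.
m = pat.match(text, 1)

print(no_match, found.group(), m.group(), m.span())
```

Note that the match starting at position 1 stops before the trailing "c" and is still reported as a successful match.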
*NO* ... if you want to ensure that the whole string matches completely,
you need to end your pattern with "\Z", *not* "$".
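A quick sketch of the difference, using a made-up sample string with a trailing newline. The re.fullmatch() call at the end assumes Python 3.4 or later; it does not exist in older versions.

```python
import re

s = "abc\n"     # illustrative input: note the trailing newline

# "$" also matches just before a newline at the very end of the string,
# so this succeeds even though the whole string was not consumed.
dollar = re.match(r"abc$", s)

# "\Z" matches only at the true end of the string, so this fails ...
strict = re.match(r"abc\Z", s)

# ... and succeeds once the trailing newline is gone.
strict_ok = re.match(r"abc\Z", "abc")

# Assumption: Python 3.4+, where fullmatch() does the anchoring for you.
full = re.fullmatch(r"abc", s)

print(bool(dollar), bool(strict), bool(strict_ok), bool(full))
```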
Perusal of the manual would seem to be indicated.
Yes. If the default were to match the whole string, then a metacharacter
would be required to signal "*don't* match the whole string" ...
functionality which is quite useful.
My understanding, from the docs and from dim memories of using
re.match() long ago, is that it will match on less than the full input
string if the re pattern allows it (for instance, if the pattern
*doesn't* end in '.*' or something similar).
Ending a pattern with '.*' or something similar is typically a mistake
and does nothing but waste CPU cycles:
C:\junk>python -mtimeit -s"import re;s='a'+80*'z';m=re.compile('a').match" "m(s)"
1000000 loops, best of 3: 1.12 usec per loop
C:\junk>python -mtimeit -s"import re;s='a'+8000*'z';m=re.compile('a').match" "m(s)"
100000 loops, best of 3: 1.15 usec per loop
C:\junk>python -mtimeit -s"import re;s='a'+80*'z';m=re.compile('a.*').match" "m(s)"
100000 loops, best of 3: 1.39 usec per loop
C:\junk>python -mtimeit -s"import re;s='a'+8000*'z';m=re.compile('a.*').match" "m(s)"
10000 loops, best of 3: 24.2 usec per loop
The regex engine can't optimise it away because '.' means by default
"any character except a newline", so it has to trundle all the way to
the end just in case there's a newline lurking somewhere.
Oh and just in case you were wondering:
C:\junk>python -mtimeit -s"import re;s='a'+8000*'z';m=re.compile('a.*',re.DOTALL).match" "m(s)"
1000000 loops, best of 3: 1.18 usec per loop
In this case, logic says the '.*' will match anything, so it can stop
immediately.
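For anyone who'd rather rerun the comparison from a script than from the command line, here's a rough sketch using the timeit module. The helper name best_usec and the loop counts are my own choices, and the numbers will vary by machine and Python version, so treat it as a way to check the trend rather than to reproduce the figures above.

```python
import re
import timeit

s = "a" + 8000 * "z"    # same shape of input as the command-line runs

patterns = {
    "a": re.compile("a"),
    "a.*": re.compile("a.*"),
    "a.* (DOTALL)": re.compile("a.*", re.DOTALL),
}

def best_usec(pat, number=2000, repeat=3):
    """Best-of-`repeat` time per match() call, in microseconds (hypothetical helper)."""
    best = min(timeit.repeat(lambda: pat.match(s), number=number, repeat=repeat))
    return best / number * 1e6

for label, pat in patterns.items():
    # Also report how much of the string each pattern actually consumed.
    print("%-13s %8.2f usec per loop, matched %d chars"
          % (label, best_usec(pat), pat.match(s).end()))
```

The "matched N chars" column is there as a reminder that both '.*' variants consume the whole string while plain 'a' stops after one character; match() succeeds in all three cases.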
I'd test this, though, before trusting it.
What the heck, I'll do that now:
??? What's wrong with _.group() ???
HTH,
John