Does Python optimize regexes?

Jason Smith

Hi. I just have a question about optimizations Python does when
converting to bytecode.

import re
for someString in someListOfStrings:
    if re.match('foo', someString):
        print someString, "matched!"

Does Python notice that re.match is called with the same expression, and
thus lift it out of the loop? Or do I need to always optimize by hand
using re.compile? I suspect so because the Python bytecode generator
would hardly know about a library function like re.compile, unlike e.g.
Perl, with builtin REs.

Thanks much for any clarification or advice.

--
Jason Smith
Open Enterprise Systems
Bangkok, Thailand
http://oes.co.th

 
Peter Otten

Jason said:
Hi. I just have a question about optimizations Python does when
converting to bytecode.

import re
for someString in someListOfStrings:
    if re.match('foo', someString):
        print someString, "matched!"

Does Python notice that re.match is called with the same expression, and
thus lift it out of the loop? Or do I need to always optimize by hand
using re.compile? I suspect so because the Python bytecode generator
would hardly know about a library function like re.compile, unlike e.g.
Perl, with builtin REs.

Thanks much for any clarification or advice.

Python puts the compiled regular expressions into a cache. The relevant code
is in sre.py:

def match(pattern, string, flags=0):
    return _compile(pattern, flags).match(string)

....

def _compile(*key):
    p = _cache.get(key)
    if p is not None:
        return p
    ....

So not explicitly calling compile() in advance only costs you two function
calls and a dictionary lookup - and maybe some clarity in your code.

Peter
 
Michael Geary

Peter said:
Python puts the compiled regular expressions into a cache. The relevant
code is in sre.py:

def match(pattern, string, flags=0):
    return _compile(pattern, flags).match(string)

...

def _compile(*key):
    p = _cache.get(key)
    if p is not None:
        return p
    ...

So not explicitly calling compile() in advance only costs you two function
calls and a dictionary lookup - and maybe some clarity in your code.

That cost can be significant. Here's a test case where not precompiling the
regular expression increased the run time by more than 50%:

http://groups.google.com/[email protected]

-Mike
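
One way to check the overhead on your own machine is a small timeit comparison. This is a rough sketch, not the test case Mike links to: the sample strings, the function names and the repeat count are all invented for illustration, and the ratio you see will depend on the Python version and on how cheap the match itself is.

```python
import re
import timeit

# Made-up sample data: half the strings match 'foo', half do not.
strings = ['foo %d' % i for i in range(1000)] + ['bar %d' % i for i in range(1000)]

def count_with_module_function():
    """Call re.match each time, relying on the module's internal cache."""
    count = 0
    for s in strings:
        if re.match('foo', s):
            count += 1
    return count

FOO = re.compile('foo')

def count_with_precompiled():
    """Call the match method of a pattern compiled once, outside the loop."""
    count = 0
    for s in strings:
        if FOO.match(s):
            count += 1
    return count

if __name__ == '__main__':
    for func in (count_with_module_function, count_with_precompiled):
        seconds = timeit.timeit(func, number=200)
        print('%-28s %.3f s' % (func.__name__, seconds))
```

Both functions return the same count, so any difference in the timings comes from the extra call and cache lookup on each iteration.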
 
Jason Smith

Thanks much to Peter and Michael for the clarification.

Peter said:
So not explicitly calling compile() in advance only costs you two function
calls and a dictionary lookup - and maybe some clarity in your code.

The reason I asked is because I felt that re.compile() was less clear:

someRegex = re.compile('searchforme')
while something:
    theString = getTheString()
    if someRegex.search(theString):
        celebrate()

I wanted to remove someRegex since I can shave a line of code and some
confusion, but I was worried about re.search() in a loop.

The answer is that Python handles this smartly at run time, through the
cache inside the re module, rather than through bytecode optimization. Great!


 
Aahz

Jason said:
The reason I asked is because I felt that re.compile() was less clear:

someRegex = re.compile('searchforme')
while something:
    theString = getTheString()
    if someRegex.search(theString):
        celebrate()

I wanted to remove someRegex since I can shave a line of code and some
confusion, but I was worried about re.search() in a loop.

My reasoning is slightly different. I'm always forgetting with
re.search whether the pattern or string goes first; with re.compile, you
can't fail. Yesterday I fixed a couple of bugs where someone else made
the same error....
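
The argument-order mistake is easy to reproduce. The short sketch below uses made-up values (text, digits and DIGITS are illustrative names) to show that swapping the arguments to re.search raises no error here and simply fails to match, while a compiled pattern leaves only one argument to pass.

```python
import re

text = 'order 66'
digits = r'\d+'

# Correct order: pattern first, then the string to search.
correct = re.search(digits, text)

# Swapped order: 'order 66' is treated as the pattern and r'\d+' as the
# string. No exception is raised; the search just finds nothing.
swapped = re.search(text, digits)

# With a compiled pattern the only argument is the string, so the two
# cannot be mixed up.
DIGITS = re.compile(digits)
bound = DIGITS.search(text)
```

Because both arguments are plain strings, nothing flags the swapped call; the bug only shows up as a match that never happens.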
 
