Does Python optimize regexes?

Discussion in 'Python' started by Jason Smith, Jun 29, 2004.

  1. Jason Smith

    Jason Smith Guest

    Hi. I just have a question about optimizations Python does when
    converting to bytecode.

    import re
    for someString in someListOfStrings:
        if re.match('foo', someString):
            print someString, "matched!"

    Does Python notice that re.match is called with the same expression, and
    thus lift it out of the loop? Or do I need to always optimize by hand
    using re.compile? I suspect so because the Python bytecode generator
    would hardly know about a library function like re.compile, unlike e.g.
    Perl, with builtin REs.

    Thanks much for any clarification or advice.

    --
    Jason Smith
    Open Enterprise Systems
    Bangkok, Thailand
    http://oes.co.th

     
    Jason Smith, Jun 29, 2004
    #1

  2. Peter Otten

    Peter Otten Guest

    Jason Smith wrote:

    > Hi. I just have a question about optimizations Python does when
    > converting to bytecode.
    >
    > import re
    > for someString in someListOfStrings:
    >     if re.match('foo', someString):
    >         print someString, "matched!"
    >
    > Does Python notice that re.match is called with the same expression, and
    > thus lift it out of the loop? Or do I need to always optimize by hand
    > using re.compile? I suspect so because the Python bytecode generator
    > would hardly know about a library function like re.compile, unlike e.g.
    > Perl, with builtin REs.
    >
    > Thanks much for any clarification or advice.
    >


    Python puts the compiled regular expressions into a cache. The relevant code
    is in sre.py:

    def match(pattern, string, flags=0):
        return _compile(pattern, flags).match(string)

    ....

    def _compile(*key):
        p = _cache.get(key)
        if p is not None:
            return p
        ....

    So not explicitly calling compile() in advance only costs you two function
    calls and a dictionary lookup - and maybe some clarity in your code.

    Peter
     
    Peter Otten, Jun 29, 2004
    #2

  3. Peter Otten

    Peter Otten Guest

    Peter Otten wrote:

    > Python puts the compiled regular expressions into a cache. The relevant


    By the way, re.compile() uses that cache, too:

    >>> import re
    >>> r1 = re.compile("abc")
    >>> r2 = re.compile("abc")
    >>> r1 is r2
    True

    Peter
     
    Peter Otten, Jun 29, 2004
    #3
  4. Michael Geary

    Peter Otten wrote:
    > Python puts the compiled regular expressions into a cache. The relevant
    > code is in sre.py:
    >
    > def match(pattern, string, flags=0):
    >     return _compile(pattern, flags).match(string)
    >
    > ...
    >
    > def _compile(*key):
    >     p = _cache.get(key)
    >     if p is not None:
    >         return p
    >     ...
    >
    > So not explicitly calling compile() in advance only costs you two function
    > calls and a dictionary lookup - and maybe some clarity in your code.


    That cost can be significant. Here's a test case where not precompiling the
    regular expression increased the run time by more than 50%:

    http://groups.google.com/groups?selm=
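
One way to check the overhead on your own machine is a small `timeit` comparison. This is a rough sketch, not the linked test case: the sample strings, the function names, and the repetition count are all assumptions, and the measured ratio will vary with the Python version and the workload.

```python
import re
import timeit

# Made-up sample data: half the strings match, half do not.
strings = ["foo%d" % i for i in range(1000)] + ["bar%d" % i for i in range(1000)]

def count_with_module_function():
    # Calls re.match each time, so every iteration pays for the cache lookup.
    return sum(1 for s in strings if re.match("foo", s))

precompiled = re.compile("foo")

def count_with_precompiled():
    # Calls the compiled pattern's method directly, skipping the lookup.
    return sum(1 for s in strings if precompiled.match(s))

if __name__ == "__main__":
    for fn in (count_with_module_function, count_with_precompiled):
        seconds = timeit.timeit(fn, number=200)
        print("%-30s %.3f s" % (fn.__name__, seconds))
```

Both functions return the same count; only the time per call differs.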

    -Mike
     
    Michael Geary, Jun 29, 2004
    #4
  5. Jason Smith

    Jason Smith Guest

    Thanks much to Peter and Michael for the clarification.

    Peter Otten wrote:
    > So not explicitly calling compile() in advance only costs you two function
    > calls and a dictionary lookup - and maybe some clarity in your code.


    The reason I asked is because I felt that re.compile() was less clear:

    someRegex = re.compile('searchforme')
    while something:
        theString = getTheString()
        if someRegex.search(theString):
            celebrate()

    I wanted to remove someRegex since I can shave a line of code and some
    confusion, but I was worried about re.search() in a loop.

    So the answer is that Python handles this smartly at run time, through
    the re module's cache, rather than through a bytecode optimization. Great!

    --
    Jason Smith
    Open Enterprise Systems
    Bangkok, Thailand
    http://oes.co.th

     
    Jason Smith, Jun 30, 2004
    #5
  6. Aahz

    Aahz Guest

    Jason Smith wrote:
    >
    >The reason I asked is because I felt that re.compile() was less clear:
    >
    >someRegex = re.compile('searchforme')
    >while something:
    >    theString = getTheString()
    >    if someRegex.search(theString):
    >        celebrate()
    >
    >I wanted to remove someRegex since I can shave a line of code and some
    >confusion, but I was worried about re.search() in a loop.


    My reasoning is slightly different. I'm always forgetting with
    re.search whether the pattern or string goes first; with re.compile, you
    can't fail. Yesterday I fixed a couple of bugs where someone else made
    the same error....
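
The argument-order pitfall can be shown with a short sketch. The sample text and variable names here are made up for illustration. Swapping the two arguments raises no error; the search just quietly finds nothing.

```python
import re

text = "please searchforme now"
pattern = "searchforme"

# Module-level function: the pattern comes first, then the string.
correct = re.search(pattern, text)

# Arguments reversed: the long text is now treated as the pattern and
# searched for inside the short string, so the result is None.
swapped = re.search(text, pattern)

# With a compiled pattern, only the string is passed, so there is
# no order to get wrong.
compiled = re.compile(pattern)
bound = compiled.search(text)
```

Because `swapped` is simply `None`, a bug like this looks the same as a legitimate non-match unless a test covers it.
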
    --
    Aahz <*> http://www.pythoncraft.com/

    "Typing is cheap. Thinking is expensive." --Roy Smith, c.l.py
     
    Aahz, Jul 3, 2004
    #6
