re.I slowness

vvikram · Mar 30, 2006

We process a lot of messages in a file using regex patterns that we keep
in a db.
If I compile a regex with re.I, the processing time is substantially
longer than if I don't, i.e. using re.I is slow.

More surprisingly, if we do something along the lines of:

import re
import string

s = <regex string>
s = s.lower()
# Map each lowercase letter to a two-letter class, e.g. 'a' -> '[aA]'
t = dict((k, '[%s%s]' % (k, k.upper())) for k in string.ascii_lowercase)
for k in t: s = s.replace(k, t[k])
re.compile(s)
.......

it is much faster than simply using re.I.

So the questions are:
a) Why is re.I so slow in general?
b) What is the underlying implementation, what (if anything) is wrong
with the above method, and why is it not used instead?

Thanks
Vikram
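
For anyone wanting to reproduce the comparison, here is a minimal timing sketch. It is not the poster's actual setup: the pattern, the sample text, and the helper name expand_case are made up for illustration, and the helper is only safe for patterns made of plain literal characters. The numbers will vary with the Python version and the pattern, so no particular result is claimed here.

```python
import re
import string
import timeit


def expand_case(pattern):
    # Hypothetical helper: wrap each ASCII letter in a two-letter class,
    # e.g. 'a' -> '[aA]'. Only safe for plain literal patterns
    # (no backslash escapes, no existing [...] sets).
    return "".join(
        "[%s%s]" % (c.lower(), c.upper()) if c in string.ascii_letters else c
        for c in pattern
    )


pattern = "error code"  # made-up literal pattern
text = "INFO all good\n" * 1000 + "Error Code 42\n"  # made-up sample input

flagged = re.compile(pattern, re.I)
expanded = re.compile(expand_case(pattern))

# Sanity check: both versions must find the same text before timing them.
assert flagged.search(text).group() == expanded.search(text).group()

# Time repeated searches over the same text with each compiled pattern.
t_flagged = timeit.timeit(lambda: flagged.search(text), number=20)
t_expanded = timeit.timeit(lambda: expanded.search(text), number=20)
print("re.I:     %.6f s" % t_flagged)
print("expanded: %.6f s" % t_expanded)
```

Checking that both patterns return the same match before timing them guards against measuring a rewritten pattern that no longer means the same thing.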

Paul McGuire · Mar 30, 2006

Can't tell you why re.I is slow, but perhaps this expression will make your
RE transform a little plainer (no need to create that dictionary of uppers
and lowers).

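s = <regex string>
# Wrap each alphabetic character in a two-letter class, e.g. 'a' -> '[aA]';
# leave every other character unchanged.
makeReAlphaCharLowerOrUpper = lambda c: "[%s%s]" % (c.lower(), c.upper()) if c.isalpha() else c
s_optimized = "".join(makeReAlphaCharLowerOrUpper(k) for k in s)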

or

s_optimized = "".join( map( makeReAlphaCharLowerOrUpper, s ) )

Just curious, but what happens if your RE contains something like this
spelling check error finder:
"[^c]ei"
(looking for violations of "i before e except after c")

Can []'s nest in an RE?

-- Paul
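
On the nesting question: in Python's re module, sets do not nest, so a '[' inside a set is just a literal character. The sketch below shows what the character-by-character rewrite does to the "[^c]ei" pattern and to a backslash escape. The helper name naive_expand and the sample strings are made up for illustration.

```python
import re


def naive_expand(pattern):
    # Same idea as the rewrite above: every alphabetic character becomes
    # a two-letter class, with no awareness of regex syntax.
    return "".join(
        "[%s%s]" % (c.lower(), c.upper()) if c.isalpha() else c
        for c in pattern
    )


original = "[^c]ei"
rewritten = naive_expand(original)
print(rewritten)  # [^[cC]][eE][iI]

# With re.I the original pattern finds "hei" in "their" ...
print(re.search(original, "their", re.I))
# ... but the rewritten one does not: "[^[cC]" is a negated set holding
# '[', 'c' and 'C', and the "]" after it is a literal bracket.
print(re.search(rewritten, "their"))

# Escapes break as well: \d turns into an escaped '[' followed by "dD]+".
digits = naive_expand(r"\d+")
print(digits)  # \[dD]+
print(re.search(digits, "123"))
```

Both rewritten patterns still compile, so the breakage is silent: the pattern simply stops matching what it used to match.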
