John said:
The OP said it works OK, which I took to mean that the OP was OK with
the extra whitespace, which can be easily stripped off. Close enough!
As I said, the whitespace difference [between what the OP said his
regex did and what it actually does] is not the problem. The problem
is that the OP's "works OK" included (item) in the 2nd group, whereas
yours includes (item) in the 3rd group.
Ugh, right again!
That just shows what happens when I try to post while debugging!
The OP didn't say whether search() or match() was being used. With the ^
it doesn't matter.
It *does* matter. re.search() is suboptimal; after failing at the
first position, it's not smart enough to give up if the pattern has a
front anchor.
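To separate the two questions (result vs. cost), here is a minimal sketch. The sample strings are made up for illustration and are not the OP's data. Without re.MULTILINE, a pattern that starts with ^ should give the same answer from match() and search(); only the work done on a failing input differs. With re.MULTILINE the answers themselves can differ.

```python
import re

rx = re.compile(r'^frobozz')

# Made-up sample inputs, not the OP's data.
samples = ['frobozz magic', 'xxfrobozz', 100 * 'x']

def same_outcome(text):
    """True if match() and search() agree on whether and where rx matches."""
    m, s = rx.match(text), rx.search(text)
    if m is None or s is None:
        return m is None and s is None
    return m.span() == s.span()

results = [same_outcome(t) for t in samples]
print(results)

# Caveat: with re.MULTILINE, ^ also matches after a newline, so search()
# can succeed where match() (which only tries position 0) fails.
rx_ml = re.compile(r'^frobozz', re.MULTILINE)
multiline_differs = (rx_ml.match('x\nfrobozz') is None
                     and rx_ml.search('x\nfrobozz') is not None)
print(multiline_differs)
```

So the choice between the two is about speed here, not about what gets matched.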
[win32, 2.6.1]
C:\junk>\python26\python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=100*'x'" "assert not rx.match(txt)"
1000000 loops, best of 3: 1.17 usec per loop
C:\junk>\python26\python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=1000*'x'" "assert not rx.match(txt)"
1000000 loops, best of 3: 1.17 usec per loop
C:\junk>\python26\python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=100*'x'" "assert not rx.search(txt)"
100000 loops, best of 3: 4.37 usec per loop
C:\junk>\python26\python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=1000*'x'" "assert not rx.search(txt)"
10000 loops, best of 3: 34.1 usec per loop
Corresponding figures for 3.0 are 1.02, 1.02, 3.99, and 32.9 usec per loop.
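For anyone who would rather run this from a script than paste four command lines, here is a rough equivalent using the timeit module. It is only a sketch: the helper name time_call and the loop count are my own choices, and the absolute numbers will depend on the machine and the Python version. On a version where search() gives up early for front-anchored patterns, the gap between the two methods should mostly disappear.

```python
import re
import timeit

rx = re.compile(r'^frobozz')

def time_call(method_name, length, number=20000):
    """Return microseconds per call of rx.<method_name> on a string of x's.

    time_call is a hypothetical helper for this sketch, not part of re.
    """
    txt = length * 'x'
    method = getattr(rx, method_name)
    assert method(txt) is None  # the pattern must fail on this input
    # Best of 3 runs, like the timeit command line reports.
    best = min(timeit.repeat(lambda: method(txt), number=number, repeat=3))
    return best / number * 1e6

timings = {}
for name in ('match', 'search'):
    for length in (100, 1000):
        usec = time_call(name, length)
        timings[(name, length)] = usec
        print('%-6s len=%4d: %.2f usec per loop' % (name, length, usec))
```

The lambda adds a small constant overhead to every figure, so compare the rows with each other rather than with the command-line numbers.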
On my PC the numbers for Python 2.6 are:
C:\Python26>python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=100*'x'" "assert not rx.match(txt)"
1000000 loops, best of 3: 1.02 usec per loop
C:\Python26>python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=1000*'x'" "assert not rx.match(txt)"
1000000 loops, best of 3: 1.04 usec per loop
C:\Python26>python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=100*'x'" "assert not rx.search(txt)"
100000 loops, best of 3: 3.69 usec per loop
C:\Python26>python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=1000*'x'" "assert not rx.search(txt)"
10000 loops, best of 3: 28.6 usec per loop
I'm currently working on the re module and I've fixed that problem:
C:\Python27>python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=100*'x'" "assert not rx.match(txt)"
1000000 loops, best of 3: 1.28 usec per loop
C:\Python27>python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=1000*'x'" "assert not rx.match(txt)"
1000000 loops, best of 3: 1.23 usec per loop
C:\Python27>python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=100*'x'" "assert not rx.search(txt)"
1000000 loops, best of 3: 1.21 usec per loop
C:\Python27>python -mtimeit -s"import re;rx=re.compile('^frobozz');txt=1000*'x'" "assert not rx.search(txt)"
1000000 loops, best of 3: 1.21 usec per loop
Hmm. Needs more tweaking...