Some more odd behaviour from the Regexp library

David Veerasingam

Can anyone explain why it won't give me my captured group?

In [1]: a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
In [2]: import re
In [3]: b = re.search(r'exit: (.*?)', a)
In [4]: b.group(0)
Out[4]: 'exit: '

In [5]: b.group(1)
Out[5]: ''

In [6]: b.group(2)
IndexError: no such group
 
Doug Schwarz

David Veerasingam said:
Can anyone explain why it won't give me my captured group?

In [1]: a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
In [2]: import re
In [3]: b = re.search(r'exit: (.*?)', a)
In [4]: b.group(0)
Out[4]: 'exit: '

In [5]: b.group(1)
Out[5]: ''

In [6]: b.group(2)
IndexError: no such group

The ? tells (.*?) to match as little as possible and that is nothing.
If you change it to (.*) it should do what you want.
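
A minimal sketch of the difference, using the string from the question (the names lazy and greedy are only illustrative):

```python
import re

a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'

# Lazy quantifier with nothing after it in the pattern:
# the shortest string that satisfies (.*?) is the empty string.
lazy = re.search(r'exit: (.*?)', a)

# Greedy quantifier: takes as much as it can, here the rest of the line.
greedy = re.search(r'exit: (.*)', a)

print(repr(lazy.group(1)))    # ''
print(repr(greedy.group(1)))  # 'gkdfjgfjdfsgdjglkghdfgkd'
```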
 
Mike Meyer

David Veerasingam said:
Can anyone explain why it won't give me my captured group?

In [1]: a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'
In [2]: import re
In [3]: b = re.search(r'exit: (.*?)', a)
In [4]: b.group(0)
Out[4]: 'exit: '

In [5]: b.group(1)
Out[5]: ''

In [6]: b.group(2)
IndexError: no such group

It is giving you your captured group. While the * operator matches as
long a string as possible, the *? operator matches as *short* a string
as possible. Since '' matches .*?, that's all it's ever going to
capture. So b.group(1) is '', which is what it's giving you.

which I suspect is what you actually want.

Of course, being the founder of SPARE, I have to point out that
a.split(': ') will get you the same two strings as the re I used
above.
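
A rough sketch of the split route on the string from the question. The partition variant is an addition that is not part of the thread; it may be safer if the value can itself contain ': '.

```python
a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'

# Plain split on the separator gives both halves as a list.
parts = a.split(': ')
print(parts)  # ['exit', 'gkdfjgfjdfsgdjglkghdfgkd']

# partition splits on the first separator only, so a value that
# happens to contain ': ' stays in one piece.
key, sep, value = a.partition(': ')
print(key, value)
```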


<mike
 
Steve Holden

Mike said:
Of course, being the founder of SPARE, I have to point out that
a.split(': ') will get you the same two strings as the re I used
above.

Let me guess: the Society for the Prevention of Abuse of Regular
Expressions?

regards
Steve
 
David Veerasingam

Thanks for all your replies.

I guess I've always used .*? as sort of an idiom for a non-greedy
match, but I guess it only works if I specify the end point (which I
didn't in the above case).

e.g. re.search(r'exit: (.*?)$', a)
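
A small sketch of why the end point matters. The second string, with its ';' delimiters, is a made-up input used only for illustration:

```python
import re

a = 'exit: gkdfjgfjdfsgdjglkghdfgkd'

# With $ after the group, the lazy match has to grow until the end of the string.
anchored = re.search(r'exit: (.*?)$', a)
print(anchored.group(1))  # gkdfjgfjdfsgdjglkghdfgkd

# Hypothetical input with an explicit delimiter after each field.
line = 'exit: 42; signal: 9;'

# Lazy: stops at the first ';' it can reach.
lazy = re.search(r'exit: (.*?);', line)
# Greedy: runs on to the last ';' in the string.
greedy = re.search(r'exit: (.*);', line)

print(lazy.group(1))    # 42
print(greedy.group(1))  # 42; signal: 9
```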

Thanks for pointing that out!

David
 
