regular expression - matches

abcd

How can I determine whether a given character sequence matches my regex
completely?

In Java, for example, I can do:
Pattern.compile(regex).matcher(input).matches()

This returns true or false depending on whether the input matches the
regex completely.

Is there an equivalent of matches() in Python?
 
Tim Chase

abcd said:
how can i determine if a given character sequence matches my regex,
completely?

in java for example I can do,
Pattern.compile(regex).matcher(input).matches()

this returns True/False whether or not input matches the regex
completely.

is there a matches in python?

Help on function match in module sre:

match(pattern, string, flags=0)
Try to apply the pattern at the start of the string, returning
a match object, or None if no match was found.


For more info, see

http://docs.python.org/lib/module-re.html
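
A minimal sketch of how that return value can be used as a yes/no test. It assumes Python 3 syntax, and the pattern and inputs are made up for illustration:

```python
import re

# Hypothetical pattern and inputs, chosen only for illustration.
pattern = r"\d+"

m = re.match(pattern, "123abc")   # the pattern is applied at the start of the string
print(m is not None)              # True
print(m.group())                  # 123

m2 = re.match(pattern, "abc123")  # the digits are not at the start
print(m2 is None)                 # True
```

Note that the first call succeeds even though "abc" is left over, so a match object alone does not say the whole string matched.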

-tkc
 
abcd

Yeah, I saw that... I guess I was trusting that my regex was accurate :)
I was getting a match object when I shouldn't have, but I found that
it must be the regex.
 
Simon Forman

abcd said:
how can i determine if a given character sequence matches my regex,
completely?

in java for example I can do,
Pattern.compile(regex).matcher(input).matches()

this returns True/False whether or not input matches the regex
completely.

is there a matches in python?

Yes. It's called match and it's in the re module
(http://docs.python.org/lib/module-re.html)

Python's re.match() matches from the start of the string, so if you
want to ensure that the whole string matches completely you'll probably
want to end your re pattern with the "$" character (depending on what
the rest of your pattern matches.)
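
A small sketch of the effect of the trailing "$", assuming Python 3 and a made-up input string:

```python
import re

s = "abcde"  # hypothetical input

# Without an end anchor, match() succeeds on the "ab" prefix.
prefix_ok = re.match(r"ab", s) is not None
print(prefix_ok)        # True

# With "$", the pattern must also reach the end of the string.
anchored_short = re.match(r"ab$", s) is not None
print(anchored_short)   # False

anchored_full = re.match(r"abcde$", s) is not None
print(anchored_full)    # True
```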

HTH,
~Simon
 
John Salerno

Simon said:
Python's re.match() matches from the start of the string, so if you
want to ensure that the whole string matches completely you'll probably
want to end your re pattern with the "$" character (depending on what
the rest of your pattern matches.)

Is that necessary? I was thinking that match() was used to match the
full RE and string, and if they weren't the same, they wouldn't match
(meaning a begin/end of string character wasn't necessary). That's wrong?
 
Steve Holden

John said:
Is that necessary? I was thinking that match() was used to match the
full RE and string, and if they weren't the same, they wouldn't match
(meaning a begin/end of string character wasn't necessary). That's wrong?

That's wrong. In this context, a match just means the engine got to the
end of the pattern. However, if you don't want to add the "$" to the end
of your patterns, you could instead check that

m.end() == len(s)

where m is the match object and s is the subject string.
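
A sketch of that end-position check, using the match object's end() method, which returns the index just past the matched text. The helper name matches_to_end is hypothetical, and Python 3 is assumed:

```python
import re

def matches_to_end(pattern, s):
    """Return True if the match found at the start of s also ends at len(s)."""
    m = re.match(pattern, s)
    return m is not None and m.end() == len(s)

print(matches_to_end(r"ab", "abcde"))    # False: the match stops after "ab"
print(matches_to_end(r"ab.*", "abcde"))  # True
print(matches_to_end(r"a|ab", "ab"))     # False: the "a" alternative is tried first and wins
```

The last case is worth keeping in mind: this check only inspects the one match the engine happened to find, so it can report False for a pattern that could have matched the whole string through a different alternative.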

regards
Steve
 
Simon Forman

John said:
Is that necessary? I was thinking that match() was used to match the
full RE and string, and if they weren't the same, they wouldn't match
(meaning a begin/end of string character wasn't necessary). That's wrong?

My understanding, from the docs and from dim memories of using
re.match() long ago, is that it will match on less than the full input
string if the re pattern allows it (for instance, if the pattern
*doesn't* end in '.*' or something similar.)

I'd test this, though, before trusting it.

What the heck, I'll do that now:
None


Yup! That's the case.
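
A sketch of the kind of check described above (not the original session), assuming Python 3 and a made-up input. It shows how much of the string each pattern consumed:

```python
import re

s = "abcde"  # hypothetical input

m_short = re.match(r"ab", s)
print(m_short.group(), m_short.span())  # ab (0, 2)

m_long = re.match(r"ab.*", s)
print(m_long.group(), m_long.span())    # abcde (0, 5)
```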

Peace,
~Simon
 
John Salerno

Simon said:
My understanding, from the docs and from dim memories of using
re.match() long ago, is that it will match on less than the full input
string if the re pattern allows it (for instance, if the pattern
*doesn't* end in '.*' or something similar.)

I'd test this, though, before trusting it.

What the heck, I'll do that now:

None


Yup! That's the case.

Peace,
~Simon

Thanks guys!
 
John Machin

(1) Every regex library's match() starts matching from the beginning of
the string (unless of course there's an arg for an explicit starting
position) -- where else would it start?

(2) This has absolutely zero relevance to the "match whole string or
not" question.

*NO* ... if you want to ensure that the whole string matches completely,
you need to end your pattern with "\Z", *not* "$".

Perusal of the manual would seem to be indicated :)
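
A short sketch of the difference, assuming Python 3 and a made-up input with a trailing newline. It also uses re.fullmatch, which was added in Python 3.4 and so was not an option when this thread was written:

```python
import re

s = "abc\n"  # hypothetical input with a trailing newline

dollar = re.match(r"abc$", s)     # "$" also matches just before a final newline
slash_z = re.match(r"abc\Z", s)   # "\Z" matches only at the very end of the string
full = re.fullmatch(r"abc", s)    # the whole string must match (Python 3.4+)

print(dollar is not None)    # True
print(slash_z is not None)   # False
print(full is not None)      # False
print(re.fullmatch(r"abc", "abc") is not None)  # True
```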

Yes. If the default were to match the whole string, then a metacharacter
would be required to signal "*don't* match the whole string" ...
functionality which is quite useful.

Simon said:
My understanding, from the docs and from dim memories of using
re.match() long ago, is that it will match on less than the full input
string if the re pattern allows it (for instance, if the pattern
*doesn't* end in '.*' or something similar.)

Ending a pattern with '.*' or something similar is typically a mistake
and does nothing but waste CPU cycles:

C:\junk>python -mtimeit -s"import re;s='a'+80*'z';m=re.compile('a').match" "m(s)"
1000000 loops, best of 3: 1.12 usec per loop

C:\junk>python -mtimeit -s"import re;s='a'+8000*'z';m=re.compile('a').match" "m(s)"
100000 loops, best of 3: 1.15 usec per loop

C:\junk>python -mtimeit -s"import re;s='a'+80*'z';m=re.compile('a.*').match" "m(s)"
100000 loops, best of 3: 1.39 usec per loop

C:\junk>python -mtimeit -s"import re;s='a'+8000*'z';m=re.compile('a.*').match" "m(s)"
10000 loops, best of 3: 24.2 usec per loop

The regex engine can't optimise it away because '.' means by default
"any character except a newline", so it has to trundle all the way to
the end just in case there's a newline lurking somewhere.

Oh and just in case you were wondering:

C:\junk>python -mtimeit -s"import re;s='a'+8000*'z';m=re.compile('a.*',re.DOTALL).match" "m(s)"
1000000 loops, best of 3: 1.18 usec per loop

In this case, logic says the '.*' will match anything, so it can stop
immediately.
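
For anyone who wants to repeat the comparison from a script rather than the command line, here is a rough sketch using the timeit module. It assumes Python 3, and the call count is arbitrary. Absolute numbers, and possibly the relative ones, will differ between Python versions and machines, so no expected output is shown:

```python
import re
import timeit

s = "a" + 8000 * "z"  # same shape of subject string as in the runs above

candidates = {
    "a": re.compile("a"),
    "a.*": re.compile("a.*"),
    "a.* (DOTALL)": re.compile("a.*", re.DOTALL),
}

timings = {}
for label, rx in candidates.items():
    # Time 10,000 calls of match() on the same string for each pattern.
    timings[label] = timeit.timeit(lambda rx=rx: rx.match(s), number=10_000)
    print(f"{label:14s} {timings[label]:.4f} s for 10,000 calls")
```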

Simon said:
I'd test this, though, before trusting it.

What the heck, I'll do that now:

??? What's wrong with _.group() ???

HTH,
John
 
John Machin

On 22/07/2006 9:25 AM, John Machin wrote:

Apologies if this appears twice ... post to the newsgroup hasn't shown
up; trying the mailing-list.
 
