How to match where the search started?

Florian Kaufmann

From the documentation:

7.2.4. Regular Expression Objects, search(string[, pos[, endpos]])
.... the '^' pattern character matches at the real beginning of the
string and at positions just after a newline, but not necessarily at
the index where the search is to start....

But I'd like to do just that. In Emacs regexps, I think the closest
equivalent would be \=. Then I could do something like this and also
find directly adjacent matches:

import re

# \= is the Emacs anchor, used here hypothetically; Python's re has no such anchor
reo = re.compile(r'(\=|...)...')
while True:
    mo = reo.search(text, pos)
    if not mo:
        break
    ...

Flo
 
Florian Kaufmann

The thing is that the (\=|...) group is not really part of the match.
I think this gives a better idea of what I want:

import re

reo = re.compile(r'(\=|.)...')  # \= is hypothetical: "start of the search"
while True:
    mo = reo.search(text, pos)
    if not mo:
        break
    if text[mo.start()] == '\\':
        pass  # a pseudo match: continue after the backslash
    else:
        pass  # a real match: continue after the match
 
Thomas Jollans

You could prefix your regexp with r'(.*?)' to create a match of stuff that is
between the start of search and the start of the first thing you're interested
in. (untested...)
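
A minimal sketch of that suggestion, assuming the thing of interest is a run of digits. The pattern, the sample strings, and the name `find_with_gap` are made up for illustration. The lazy group captures whatever lies between the search start and the first hit, so an empty group means the hit begins exactly at `pos`. Using `match()` and `re.DOTALL` is a choice made here, not part of the suggestion: it pins the lazy group to `pos` and lets the gap span newlines.

```python
import re

# Hypothetical target: a run of digits. Group 1 lazily captures the gap
# between the search start and the target; group 2 is the target itself.
reo = re.compile(r'(.*?)(\d+)', re.DOTALL)

def find_with_gap(text, pos=0):
    """Return (gap, hit, end) for the first hit at or after pos, or None."""
    mo = reo.match(text, pos)  # match() forces group 1 to begin at pos
    if not mo:
        return None
    return mo.group(1), mo.group(2), mo.end()
```

Calling `find_with_gap(text, end)` again with the returned `end` continues after the previous hit, and the caller can test `gap == ''` to detect directly adjacent matches.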
 
MRAB

If you want to anchor the regex at the start position 'pos' then use
the 'match' method instead.
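
A short illustration of the difference, using a made-up two-line string and an arbitrary `pos`. It only relies on the `pos` argument of the compiled pattern's `search()` and `match()` methods from the standard `re` module.

```python
import re

text = "ab\ncd"
pos = 1  # hypothetical search start, in the middle of the first line

# '^' is not tied to pos: without re.MULTILINE it only matches at index 0.
caret = re.compile(r'^b').search(text, pos)

# search() scans forward from pos and may skip over characters.
scanned = re.compile(r'c').search(text, pos)

# match() succeeds only if the pattern matches exactly at pos.
anchored = re.compile(r'b').match(text, pos)
too_late = re.compile(r'c').match(text, pos)
```

Here `caret` and `too_late` are `None`, while `scanned` and `anchored` are match objects, so `match()` behaves like an anchor at `pos` for the whole pattern.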
 
Florian Kaufmann

The wicked problem is that matching at position 'pos' is not a
requirement, it's an option. Look again at my 2nd example, the
r'(\=|.)...' part, which (of course wrongly) assumes that \= means
'match at the beginning of the search'. Before the match I am really
interested in, there is either the start of the search OR any
character.
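
One possible workaround with only the standard `re` module, sketched under assumptions: the wanted item is taken to be a run of digits and the allowed preceding character a comma. Both patterns and the name `find_all` are hypothetical stand-ins for the elided parts of the pattern above. The idea is to try `match()` at `pos` first (the "start of the search" alternative) and fall back to `search()` with the prefixed pattern.

```python
import re

core = re.compile(r'\d+')          # hypothetical: the part we really want
prefixed = re.compile(r',(\d+)')   # hypothetical: the same part after a comma

def find_all(text, pos=0):
    """Collect items that begin at the search start or follow a comma."""
    hits = []
    while True:
        mo = core.match(text, pos)           # item directly at the search start?
        if mo:
            hits.append(mo.group())
        else:
            mo = prefixed.search(text, pos)  # otherwise require the prefix
            if not mo:
                break
            hits.append(mo.group(1))
        pos = mo.end()                       # continue right after the match
    return hits
```

For the input `"12,34;56,78"` this skips `56`, because it neither starts at a search position nor follows a comma. The loop terminates because every successful match is non-empty and moves `pos` forward.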
 
MRAB

An alternative is to use the 'regex' module, available from PyPI:

http://pypi.python.org/pypi/regex

It has \G, which is the anchor for the start position.
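
A sketch of how `\G` might be used in the loop from the original question. It assumes the third-party `regex` module is installed; the pattern (digits at the search start or after a comma) and the name `find_fields` are made up for illustration.

```python
try:
    import regex  # third-party module from PyPI, not part of the standard library
except ImportError:
    regex = None

def find_fields(text, pos=0):
    """Collect digit runs that sit at the search start or right after a comma."""
    if regex is None:
        raise RuntimeError("the third-party 'regex' module is not installed")
    # Hypothetical pattern: \G anchors at the position where search() starts.
    reo = regex.compile(r'(?:\G|,)(\d+)')
    hits = []
    while True:
        mo = reo.search(text, pos)
        if not mo:
            break
        hits.append(mo.group(1))
        pos = mo.end()  # continue right after the match
    return hits
```

Because `\G` is re-evaluated against the new `pos` on every call to `search()`, a single pattern covers both the "start of the search" case and the "preceded by a character" case.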
 
