Negative regular expressions (searching for "i" not inside command)

Bart Kastermans

I have a file in which I am searching for the letter "i" (actually
a bit more general than that, arbitrary regular expressions could
occur) as long as it does not occur inside an expression that matches
\\.+?\b (something started by a backslash and including the word that
follows).

More concrete example, I have the string "\sin(i)" and I want to match
the argument, but not the i in \sin.

Can this be achieved by combining the regular expressions? I do not
know the right terminology involved, therefore my searching on the
Internet has not led to any results.

I can achieve something like this by searching for all i and then
throwing away those i that are inside such expressions. I am now just
wondering if these two steps can be combined into one.

Best,
Bart
 
Guilherme Polo

> I have a file in which I am searching for the letter "i" (actually
> a bit more general than that, arbitrary regular expressions could
> occur) as long as it does not occur inside an expression that matches
> \\.+?\b (something started by a backslash and including the word that
> follows).
>
> More concrete example, I have the string "\sin(i)" and I want to match
> the argument, but not the i in \sin.
>
> Can this be achieved by combining the regular expressions? I do not
> know the right terminology involved, therefore my searching on the
> Internet has not led to any results.

Try searching again with the "lookahead" term, or "negative lookahead".
 
castironpi

> Try searching again with the "lookahead" term, or "negative lookahead".

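No dice: excluding what comes before the "i" needs a lookbehind, and the
re documentation says of those: "This is called a positive lookbehind
assertion. ...The contained pattern must only match strings of some fixed
length." The same restriction applies to negative lookbehind, so a
variable-length pattern like \\.+?\b cannot go inside one.
'finditer' could make the 'combined into one' option more attractive
though.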
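
For what it's worth, both points are easy to try out. The following is only a minimal sketch, assuming the command pattern \\.+?\b and the target "i" from the original post; the helper name matches_outside is made up for illustration and is not part of the re module.

```python
import re

COMMAND = re.compile(r'\\.+?\b')   # a backslash plus the word that follows

def matches_outside(target, text, exclude=COMMAND):
    """Return the matches of `target` that overlap no match of `exclude`."""
    excluded = [m.span() for m in exclude.finditer(text)]
    found = []
    for m in re.finditer(target, text):
        # keep the match only if it overlaps none of the excluded spans
        if not any(m.start() < end and start < m.end() for start, end in excluded):
            found.append(m)
    return found

# The standard re module rejects a variable-width lookbehind at compile time.
try:
    re.compile(r'(?<!\\.+?)i')
    lookbehind_ok = True
except re.error:
    lookbehind_ok = False

hits = matches_outside(r'i', r'\sin(i)')
print(lookbehind_ok, [m.span() for m in hits])   # False [(5, 6)]
```

This is still two searches, but the match objects keep their positions in the original text.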
 
Terry Reedy

Bart said:
> I have a file in which I am searching for the letter "i" (actually
> a bit more general than that, arbitrary regular expressions could
> occur) as long as it does not occur inside an expression that matches
> \\.+?\b (something started by a backslash and including the word that
> follows).

You should either make sure that the opposite, a match of \\.+?\b inside
a match of your target re, cannot occur, or consider what you want to
happen if it can.

> More concrete example, I have the string "\sin(i)" and I want to match
> the argument, but not the i in \sin.
>
> Can this be achieved by combining the regular expressions? I do not
> know the right terminology involved, therefore my searching on the
> Internet has not led to any results.
>
> I can achieve something like this by searching for all i and then
> throwing away those i that are inside such expressions.

If you do not need the original position in the text of each match, and
you are not concerned about target matches encompassing splitter
matches, you could switch the order of searching.

for fragment in re.split(r'\\.+?\b', text):
    ...  # search each fragment for the target
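
A fuller version of that loop might look like the sketch below. It assumes the target is the plain pattern i; the sample text extends the "\sin(i)" example with an invented second term. Note that re.split takes the pattern first and the string second, and that any offsets found inside a fragment are relative to that fragment, not to the original text.

```python
import re

text = r'\sin(i) + \pi*i'      # illustrative input, not from the original post
target = re.compile(r'i')

# Split the text at every command; the remaining fragments contain no commands.
fragments = re.split(r'\\.+?\b', text)

count = 0
for fragment in fragments:
    # positions here are relative to the fragment, not to `text`
    count += len(target.findall(fragment))

print(fragments, count)        # ['', '(i) + ', '*i'] 2
```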

Bart said:
> I am now just wondering if these two steps can be combined into one.

Perhaps find \\.+?\b or target, with only the latter captured, but I
will leave that to someone else.

tjr
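
One way the last suggestion might look, offered as a sketch rather than a definitive answer: put the command pattern first in an alternation so that it consumes whole commands, and wrap only the target in a group. Matches in which the group is None were commands and are skipped. The target i and the sample text beyond "\sin(i)" are assumptions made for illustration.

```python
import re

# Commands are matched first and swallowed; only the target is captured.
pattern = re.compile(r'\\.+?\b|(i)')

text = r'\sin(i) + \pi*i'      # illustrative input

# Group 1 is None whenever the command branch matched, so filter on it.
positions = [m.span(1) for m in pattern.finditer(text) if m.group(1) is not None]

print(positions)               # [(5, 6), (14, 15)]
```

Unlike the split approach, this keeps the positions in the original text. If the target is itself a regular expression with groups, a named group for the target would probably be easier to keep track of than a number.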
 
