Help with regular expression?

Linda

Hi,

I'm trying to match all strings in my code that aren't in println()
statements.

What I have tried most recently:
[^\Qsystem.out.println\E]\"([^\"])+?\"

Help?

-L
 
Tilman Bohn

> Hi,
>
> I'm trying to match all strings in my code that aren't in println()
> statements.

I take it you mean all String literals, not all strings.

> What I have tried most recently:
> [^\Qsystem.out.println\E]\"([^\"])+?\"

The first bracketed expression doesn't do what you think it does.
Character classes don't work for character sequences like that, and the
\Q...\E escaping doesn't change that. (Your bracket really means `any
character but s, y, t, e, m, o, u, p, r, i, n, l, or a dot'.) Also, if it
did, the match would include not only your literal, but anything leading
up to it. Plus, the class System is spelled with a capital S. You really
want to be using a negative look-behind assertion, which in your case
would look as follows (including the opening parenthesis, since that is
what directly precedes the literal):

(?<!System\.out\.println\()

(A positive look-behind assertion would be (?<=foo), just for
comparison's sake.)
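Here is a minimal sketch of the two kinds of look-behind in Java, for illustration only. The class and method names are made up, and the literal pattern is deliberately naive: it ignores escaped quotes. The guard includes the opening parenthesis, \(, because that is the character directly before the literal. It also assumes at most one literal per line, since after a skipped println() literal the closing quote could otherwise be mistaken for an opening one.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LookBehindDemo {
    // Simplified literal (assumption): a quote, any non-quote characters, a quote.
    static final String LITERAL = "\"[^\"]*\"";

    // Negative look-behind: the literal must NOT directly follow System.out.println(
    static final Pattern NOT_IN_PRINTLN =
        Pattern.compile("(?<!System\\.out\\.println\\()" + LITERAL);

    // Positive look-behind, for comparison: the literal MUST directly follow it.
    static final Pattern IN_PRINTLN =
        Pattern.compile("(?<=System\\.out\\.println\\()" + LITERAL);

    // Collects every match of the pattern in one line of source text.
    static List<String> matches(Pattern pattern, String line) {
        List<String> out = new ArrayList<>();
        Matcher m = pattern.matcher(line);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        String call = "System.out.println(\"hello\");";
        String decl = "String s = \"world\";";
        System.out.println(matches(NOT_IN_PRINTLN, call)); // []
        System.out.println(matches(NOT_IN_PRINTLN, decl)); // ["world"]
        System.out.println(matches(IN_PRINTLN, call));     // ["hello"]
    }
}
```

Because the assertion is zero-width, the match contains only the literal itself, not the text leading up to it.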

Note that this will break if someone has aliased System.out and then
calls println() on the alias. Also be careful to allow arbitrary amounts
of white-space. This will get pretty involved once you want to correctly
exclude println() calls spanning several lines, for which you probably
have only two alternatives: actually parse a good deal of Java syntax, or
read a whole file at a time and match it as one multi-line expression.

Both of these will get fairly complicated, but if you can live with
the occasional false positive, the simple look-behind and some additional
white-space should get you a good deal closer to the solution.
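A rough sketch of the whole-file variant with some white-space tolerance might look like the following. Everything here is an assumption for illustration: the names are invented, and the limit of 20 white-space characters is arbitrary (Java's look-behind needs an obvious maximum length, so \s* is not an option). To keep closing quotes from being mistaken for opening ones, it first finds every literal and then applies the look-behind as a separate zero-width check at each literal's start. Escaped quotes, comments, and aliased System.out are still not handled.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class WholeFileLiterals {
    // Simplified literal (assumption): no escaped quotes inside.
    static final Pattern LITERAL = Pattern.compile("\"[^\"]*\"");

    // Zero-width guard: succeeds only where the preceding text is NOT
    // "System.out.println", optional white-space, "(", optional white-space.
    // \s also matches line breaks, so calls spanning lines are covered.
    static final Pattern NOT_AFTER_PRINTLN =
        Pattern.compile("(?<!System\\.out\\.println\\s{0,20}\\(\\s{0,20})");

    static List<String> literalsOutsidePrintln(String source) {
        List<String> found = new ArrayList<>();
        Matcher literal = LITERAL.matcher(source);
        Matcher guard = NOT_AFTER_PRINTLN.matcher(source);
        // Transparent bounds let the look-behind see text before the region start.
        guard.useTransparentBounds(true);
        while (literal.find()) {
            guard.region(literal.start(), source.length());
            if (guard.lookingAt()) {
                found.add(literal.group());
            }
        }
        return found;
    }

    public static void main(String[] args) {
        // Stand-in for the contents of a whole source file read into one String.
        String source = "String a = \"one\";\n"
                      + "System.out.println(\n    \"two\");\n"
                      + "String b = \"three\";\n";
        System.out.println(literalsOutsidePrintln(source)); // ["one", "three"]
    }
}
```

On Java 11 or later, the source string could come from java.nio.file.Files.readString(path) instead of the hard-coded sample.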

The second part of your proposed solution won't handle escaped double
quotes (\") within the literal; the match ends at the first one. But don't
just skip over every \", because in the sequence \\" the backslash is
itself escaped, so the quote does terminate the literal.
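One common way to express this, sketched here under the assumption that only ordinary double-quoted literals matter (no char literals, comments, or text blocks), is to treat a backslash plus the following character as one unit. The class and method names are invented for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class EscapedLiterals {
    // Naive version: stops at the first quote, escaped or not.
    static final Pattern NAIVE = Pattern.compile("\"[^\"]*\"");

    // Escape-aware version: between the quotes, accept either an ordinary
    // character (not a quote, not a backslash) or a backslash followed by
    // any character. This way \" stays inside and \\" ends the literal.
    static final Pattern ESCAPE_AWARE =
        Pattern.compile("\"(?:[^\"\\\\]|\\\\.)*\"");

    static List<String> matches(Pattern pattern, String source) {
        List<String> out = new ArrayList<>();
        Matcher m = pattern.matcher(source);
        while (m.find()) {
            out.add(m.group());
        }
        return out;
    }

    public static void main(String[] args) {
        // The text being scanned is:  s = "say \"hi\""; t = "back\\";
        String source = "s = \"say \\\"hi\\\"\"; t = \"back\\\\\";";
        System.out.println(matches(ESCAPE_AWARE, source)); // ["say \"hi\"", "back\\"]
        System.out.println(matches(NAIVE, source).get(0)); // "say \"
    }
}
```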

This problem is very similar to the `Regexp and Pattern.class' thread
currently going on here. Deciding how to correctly match variously escaped
characters, but not escaped escape sequences... ;-) Of course this type of
problem is one of the original reasons for regular expressions, because
such classes of sequences are `typical' languages produced by regular
(type 3) grammars. (And of course look-behind assertions can technically
never be a part of _regular_ expressions, but that's another story...)
 
