Hal Vaughan
I'm trying to clean up some comments in web pages. I'm using regexes to do
a lot of the work, but I've run into a problem. Toward the end of the
process, I'm trying to replace any remaining HTML tags with an empty
string, as in no spaces, nothing, just "". If I replace the HTML tags with
a space or other characters it works, but it won't work with an empty
string. (I also tried at mindprod.com, one of the first places for Java
info, but the site is down.)
Here's a snippet to explain what I'm doing:
// sDesc is the string with the text I'm working on
// (Pattern and Matcher come from java.util.regex)
String sTag = "<.*?>";
Pattern pTag = Pattern.compile(sTag);
Matcher lineMatch = pTag.matcher(sDesc);
sDesc = lineMatch.replaceAll("");
If I use " " in that last line, it works fine, but whenever I use "", the
HTML tags are NOT replaced.
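For anyone who wants to try this, a minimal self-contained version of the snippet might look like the sketch below. The class name StripTagsDemo, the helper stripTags, and the sample input are made up for illustration and are not from my real code; only the pattern and the replaceAll call are the same as above.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical wrapper class; the name is an assumption for this sketch.
class StripTagsDemo {
    // Same reluctant pattern as in the snippet: "<", as few characters as possible, ">".
    private static final Pattern TAG = Pattern.compile("<.*?>");

    // Replaces every match of the tag pattern with the given replacement text.
    static String replaceTags(String sDesc, String replacement) {
        Matcher lineMatch = TAG.matcher(sDesc);
        return lineMatch.replaceAll(replacement);
    }

    // Convenience form of the case in question: replace with the empty string "".
    static String stripTags(String sDesc) {
        return replaceTags(sDesc, "");
    }

    public static void main(String[] args) {
        // Made-up sample input, not taken from the real pages.
        String sDesc = "<p>Some <b>bold</b> text</p>";
        // Brackets make leading and trailing whitespace visible in the output.
        System.out.println("empty: [" + stripTags(sDesc) + "]");
        System.out.println("space: [" + replaceTags(sDesc, " ") + "]");
    }
}
```

Running both replacements side by side on the same input should show whether the empty string really behaves differently from a single space, or whether the difference comes from somewhere else in the processing.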
I know it deeply offends people if any code is posted that isn't ready to be
compiled and run as is, but I think this is more about how regexes work
than a specific piece of code. I've searched for "empty string" in
connection with regex replacement (and using different terms), but I
haven't found anything about this. In most cases, I find something talking
about accidentally matching empty strings. I would also think there's a
better term than empty string to apply to this. Is there?
Why is it that a replace with a space works but with an empty string it
doesn't?
Thanks!
Hal