complex regex

carlbernardi

Hi,

I am new to the java.util.regex package, which I am using to detect each
occurrence of the JavaScript tag in an HTML file and delete it. I tried
the following code to find tags such as the ones below, but instead it
matches everything from the first occurrence of "<" to the last
occurrence of ">", which is not what I am looking for.

<script>
<script src="script.js">
</script>

String mat = "<html><script><p><font></script>";
String pat = "<*[\\x00-\\x7f]*jscript*[a-z0-9]*>";
Pattern pattern = Pattern.compile(pat);
Matcher matcher = pattern.matcher(mat);
while (matcher.find()) {
    System.out.println("Match: " + matcher.group()
            + " Start:" + matcher.start() + " End:" + matcher.end());
}

output:
Match: <html><script><p><font><script> Start:0 End:39

I would be looking for output like:
Match: <script> Start:6 End:18
Match: <script> Start:27 End:18

Appreciate any input,

Carl
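
The behaviour described above is what a greedy quantifier does: a class like [\x00-\x7f]* also matches ">" and "<", so it runs to the end of the input and backtracks only as far as the last ">". A minimal sketch of one way around it, assuming the goal is to match only opening and closing script tags: use the negated class [^>]* so a match cannot cross a tag boundary. The class name ScriptTagFinder and the helper findScriptTags are hypothetical, not from the original post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class ScriptTagFinder {
    // "</?script" matches "<script" or "</script"; \b stops "<scripts" from matching;
    // [^>]* consumes attributes but can never run past the tag's own '>'.
    static final Pattern SCRIPT_TAG =
            Pattern.compile("</?script\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Returns one "tag Start:x End:y" entry per script tag found in html.
    static List<String> findScriptTags(String html) {
        List<String> found = new ArrayList<>();
        Matcher m = SCRIPT_TAG.matcher(html);
        while (m.find()) {
            found.add(m.group() + " Start:" + m.start() + " End:" + m.end());
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<html><script><p><font></script>";
        for (String line : findScriptTags(html)) {
            System.out.println("Match: " + line);
        }
    }
}
```

For the test string in the sketch this reports two matches, "<script>" at 6 to 14 and "</script>" at 23 to 32.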
 
carlbernardi

Funny, I think I found my answer. This way seemed to do the trick. Is
it possible to do the same thing with just Matcher.replaceAll()?


String mat = "(<html><script><p><font><script>";
String pat = "<[^>]*>";
StringBuffer sb = new StringBuffer(mat);   // untouched copy, indexed by the matcher's offsets
StringBuffer sb2 = new StringBuffer(mat);  // working copy that the tags are deleted from
Pattern pattern = Pattern.compile(pat);
Matcher matcher = pattern.matcher(mat);
int start, end = 0;
int newStart = 0;  // number of characters deleted from sb2 so far
while (matcher.find()) {
    start = matcher.start();
    end = matcher.end();
    System.out.println("old string --- " + sb.substring(start, end));
    if (sb.substring(start, end).indexOf("script") > -1) {
        // Shift the offsets left by the amount already deleted.
        System.out.println("new string --- "
                + sb2.delete(start - newStart, end - newStart).toString());
        newStart = sb.length() - sb2.length();
    }
    System.out.println(start + " " + end + " " + newStart);
}
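
On the replaceAll() question: a minimal sketch, assuming only the script tags themselves should be removed (not the text between them), and that a tag-specific pattern is acceptable in place of the indexOf("script") check. The class name ScriptTagStripper and the helper stripScriptTags are hypothetical. Unlike the loop above, this pattern does not remove other tags that merely contain the word "script" in an attribute.

```java
import java.util.regex.Pattern;

public class ScriptTagStripper {
    // Matches an opening or closing script tag, with or without attributes.
    static final Pattern SCRIPT_TAG =
            Pattern.compile("</?script\\b[^>]*>", Pattern.CASE_INSENSITIVE);

    // Replaces every script tag with the empty string in a single pass.
    static String stripScriptTags(String html) {
        return SCRIPT_TAG.matcher(html).replaceAll("");
    }

    public static void main(String[] args) {
        String html = "(<html><script><p><font><script>";
        System.out.println(stripScriptTags(html)); // prints (<html><p><font>
    }
}
```

Matcher.replaceAll() does the offset bookkeeping internally, so the two StringBuffer copies and the newStart counter are not needed.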
 
