Kyle Heck
I'm writing a web crawler, and in that crawler I want to remove all
scripts in the pages I crawl.
I should be able to do a simple gsub!(/<!--.*-->/,"") right? Well, I do
that, and unfortunately it doesn't remove some scripts. Take Google, for
instance: it removes the first script but not the second. I'm really
confused. Google's page has two scripts, so <!-- appears twice and so does
-->, which means the regexp should never fail to match.
Any insight on the issue would be great!
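
Here is a minimal sketch of what I think I'm seeing. The HTML string is made up for illustration (it is not Google's actual markup), and I'm assuming the second script's comment spans several lines while the first sits on one line. If that assumption holds, it may matter that "." does not match a newline in Ruby unless the /m flag is given:

```ruby
# Hypothetical page snippet: the first comment is on one line,
# the second spans several lines (an assumption, not Google's real HTML).
html = "<script><!-- one(); --></script>\n" \
       "<script><!--\ntwo();\n--></script>\n"

# The pattern from my crawler: without /m, "." stops at newlines,
# so a comment that spans lines cannot match.
single_line = html.gsub(/<!--.*-->/, "")

# Possible alternative: /m lets "." match newlines, and the non-greedy
# ".*?" stops at the first "-->" so each comment is removed separately.
multi_line = html.gsub(/<!--.*?-->/m, "")

puts single_line
puts multi_line
```

With this input the first gsub leaves the multi-line comment in place, while the second removes both comments.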
Thanks,
Kyle Heck