regexp strip html

Une bévue

I have a regexp that strips HTML tags:

/<[^>]*>/

However, all the text between <script> and </script> is preserved, so I then
tried:

def stripHTML
  # self.gsub(/<\S[^><]*>/, '')
  self.gsub(/\A.*<body [^>]*>(.*)<\/body>\s*\Z/, '\1').gsub(/<[^>]*>/, '')
end

without success: the various JavaScript functions are still kept.

What's my error here?
 
Paul Battley

On 26/03/06, Une bévue <pere.noel@laponie.com.invalid> wrote:
> i have a regexp able to strip html :
>
> /<[^>]*>/
>
> however, between <script and </script> all the text is preserved, then
...
> what's my error here ?

Look at it this way: you have '<script>Javascript</script>'. You
remove everything between angle brackets. You still have 'Javascript',
because that's not actually inside <...>.

The simplest solution is probably to do something like this before
stripping out the remaining tags:

gsub(/<script.*?<\/script>/im, '')

Paul.
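
For illustration, here is a minimal sketch that combines the two steps: remove whole script elements first, then strip the remaining tags with the regexp from the question. The method name strip_html and the sample page are assumptions made for this example, not something from the thread, and a regexp-based approach like this is only a rough tool, not a real HTML parser.

```ruby
# Hypothetical helper: strip_html is an illustrative name, not from the thread.
def strip_html(html)
  # Remove each script element together with the code inside it.
  # /m lets "." match newlines, /i also matches <SCRIPT>.
  without_scripts = html.gsub(/<script\b.*?<\/script\s*>/im, '')
  # Then remove the remaining tags, keeping the text between them.
  without_scripts.gsub(/<[^>]*>/, '')
end

# Made-up sample input for the example.
page = <<~HTML
  <body onload="init()">
  <script type="text/javascript">
  function init() { alert('hi'); }
  </script>
  <p>Hello <b>world</b></p>
  </body>
HTML

puts strip_html(page).strip
```

The order matters: once the tags are gone, nothing marks where the JavaScript starts and ends, so the script elements have to be removed while their tags are still present.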
 
