Remove all HTML tags with Perl


jjliu

Could someone tell me how to remove all HTML tags (and anything inside the
tags) with Perl? Some people suggested that I use HTML::TagFilter, but I could
not find a Windows version. Thanks very much for your help.

JJL
 

jjliu

Thanks. What I wanted is to remove the head tag and anything inside it. Could
you help me out?
 

Kris Wempa

Gunnar Hjalmarsson said:
Sure.

s/.*//s;

That will remove ALL characters. He really needs something along the lines
of:

s/\<[^\<]+\>//;

This only works if the entire tag is within the same string. If the tag
spans multiple lines, the lines will need to be concatenated into one string.
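
A minimal sketch of the approach described above, assuming the document is first joined into a single string. The name strip_tags and the sample input are hypothetical. Note that this variant uses the character class [^>] rather than [^<] and adds /g so that every tag is removed, not only the first; it still assumes that no > appears inside an attribute value.

```perl
use strict;
use warnings;

# Hypothetical helper: remove every <...> run from a string.
# Assumption: no '>' occurs inside an attribute value.
sub strip_tags {
    my ($html) = @_;
    # [^>]+ also matches newlines, so a tag that spans several lines
    # is removed once the input is held in a single string.
    $html =~ s/<[^>]+>//g;
    return $html;
}

# Join the lines first (an inline sample stands in for a file here).
my @lines = ("<p>Hello,\n", "<a\n", "  href=\"x.html\">world</a></p>\n");
my $text  = strip_tags(join '', @lines);
print $text;
```

Joining the lines matters because the second tag in the sample starts on one line and ends on the next; applied line by line, the substitution would never see a complete tag.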
 

Eric J. Roode


That will remove ALL characters.

Gunnar knows that. :)

He really needs something along the
lines of:

s/\<[^\<]+\>//;

Why all the backslashes?

This only works if the entire tag is within the same string. If the
tag spans multiple lines, the lines will need to be concatenated into one
string.

It also doesn't work if anything within the tag or its attributes contains
a > symbol. Example:

<img src="mathexpression.gif" alt="5 is > 4" />
<input type="submit" onclick="if (count > 1) true else false" />
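
One way to see the problem, and one possible workaround, is sketched below. The helper name strip_tags_quoted is hypothetical, and the pattern is only an illustration: it consumes quoted attribute values as a unit so that a > inside quotes does not end the tag early. It makes no attempt to handle comments, script bodies, or a stray < in ordinary text, which is why a parsing module is usually the safer choice.

```perl
use strict;
use warnings;

# Hypothetical helper: like a plain <[^>]+> substitution, except that a
# quoted attribute value is matched as a whole, so a '>' inside quotes
# does not terminate the tag.
sub strip_tags_quoted {
    my ($html) = @_;
    $html =~ s/<(?:"[^"]*"|'[^']*'|[^'">])*>//g;
    return $html;
}

my $img = '<img src="mathexpression.gif" alt="5 is > 4" />after';

# The simple pattern stops at the first '>', which is inside the alt text,
# and leaves the rest of the tag behind.
(my $naive = $img) =~ s/<[^>]+>//g;
print "naive:  $naive\n";

# The quote-aware pattern removes the whole tag.
print "quoted: ", strip_tags_quoted($img), "\n";
```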


 

Gunnar Hjalmarsson

jjliu said:
Thanks. What I wanted is to remove the head tag and anything inside it.
Could you help me out?

Only the head tag? Well, in that case a regexp similar to what Kris
suggested might be sufficient. But please note that you'd normally be
better off using a module when dealing with HTML code, and even though I
have never used the one you mentioned, it appears to be a good suggestion.

Some people suggested that I use HTML::TagFilter, but I could not
find a Windows version.

What do you mean by Windows version? What makes you think that
HTML::TagFilter doesn't work on Windows?
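
For the narrower task of removing only the head element and its contents, a regexp along these lines might be enough. This is a sketch, not a tested recipe for arbitrary pages: the name strip_head and the sample page are hypothetical, and it assumes the document has a single, well-formed <head>...</head> pair held in one string.

```perl
use strict;
use warnings;

# Hypothetical helper: drop the first <head>...</head> element, tags included.
sub strip_head {
    my ($html) = @_;
    # /i  : HTML tag names are case-insensitive
    # /s  : let . match newlines, since the head usually spans many lines
    # .*? : non-greedy, so the match stops at the first </head>
    # \b  : keeps <header> from being mistaken for <head>
    $html =~ s/<head\b[^>]*>.*?<\/head\s*>//is;
    return $html;
}

my $page = <<'HTML';
<html>
<head>
<title>Example</title>
</head>
<body><p>Kept.</p></body>
</html>
HTML

print strip_head($page);
```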
 
