Validating user HTML input

  • Thread starter Peter Morris [Air Software Ltd]

Peter Morris [Air Software Ltd]

Hi all

I want to allow users to enter HTML, but I want to ensure that

A) The HTML entered only contains a subset of HTML tags (a, img, div, etc.)
and not certain other elements (html, body, script, etc.).

B) The HTML is syntactically correct, so a <li> would have a corresponding
</li>, a <td> would have a </td>, etc.

I'm pretty confident that .NET already has something to do this, but I have
no clue what. Can anyone help?

Thanks


--
Pete
====
Read or write articles on just about anything
http://www.HowToDoThings.com

My blog
http://blogs.slcdug.org/petermorris/
 

Greg Burns

Regular expressions can do this sort of thing. That's where I would be
investigating.

Greg
 

Martin Honnen

Peter Morris [Air Software Ltd] wrote:

I want to allow users to enter HTML, but I want to ensure that
B) The HTML is syntactically correct, so a <li> would have a corresponding
</li>, a <td> would have a </td>, etc.

According to
<http://www.w3.org/TR/html4/struct/lists.html>
the closing </li> is optional, and according to
<http://www.w3.org/TR/html4/struct/tables.html>
the closing </td> is optional too, so an HTML syntax checker would be
wrong to complain about missing closing </li> or </td> tags.

As for HTML syntax checking with .NET, perhaps Tidy can help with that:
<http://users.rcn.com/creitzel/tidy.html#dotnet>
I am not sure, however, whether it will help if you only want to allow a
subset of HTML, but maybe you can write a custom DTD and have Tidy use that.
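
To illustrate the point about optional end tags, here is a minimal sketch of a tag allowlist check that tolerates a missing </li> or </td>. It is written in Python with the standard library's html.parser rather than .NET, purely to show the idea. The ALLOWED, VOID and OPTIONAL_END sets and the check() helper are assumptions made for this example; they are not part of Tidy or of any .NET API, and the nesting rules are far simpler than the real HTML 4 DTD.

```python
from html.parser import HTMLParser

# Hypothetical allowlist for this example; adjust to the subset you want.
ALLOWED = {"a", "img", "div", "p", "ul", "ol", "li", "table", "tr", "td", "b", "i"}
VOID = {"img", "br", "hr"}               # elements that never take an end tag
OPTIONAL_END = {"li", "td", "tr", "p"}   # end tag may be omitted in HTML 4


class SubsetChecker(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.stack = []    # currently open elements
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            self.errors.append(f"tag not allowed: <{tag}>")
            return
        # A new <li> (or <td>, ...) implicitly closes an open sibling of the same kind.
        if tag in OPTIONAL_END and self.stack and self.stack[-1] == tag:
            self.stack.pop()
        if tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in ALLOWED or tag in VOID:
            return  # disallowed tags were already reported at their start tag
        # A parent's end tag implicitly closes children whose end tag is optional.
        while self.stack and self.stack[-1] != tag and self.stack[-1] in OPTIONAL_END:
            self.stack.pop()
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append(f"unexpected </{tag}>")


def check(html):
    """Return a list of problems found; an empty list means the input passed."""
    checker = SubsetChecker()
    checker.feed(html)
    checker.close()
    for tag in checker.stack:
        if tag not in OPTIONAL_END:
            checker.errors.append(f"missing </{tag}>")
    return checker.errors
```

With this sketch, check("<ul><li>one<li>two</ul>") reports no problems, while an unclosed <a> inside a <div> or any <script> element is reported. Because the parser does the tokenising, comments, quoting and character references are handled before the checks run.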
 

Peter Morris [Air Software Ltd]

Hi

It seems a bit too complicated for RegEx to me (or for me in RegEx). Not
only would I want to prevent <script> inserts and validate the input, but I
would also want to prevent JavaScript being inserted as a click event
handler or as an attribute value on some HTML element.

What do you think?
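
The attribute concern above can be sketched roughly as follows: walk every start tag with a real parser, reject any on* event handler attribute, and reject href/src values whose URL scheme is not on a short safe list. This is Python with the standard library's html.parser, not .NET, and is only an illustration. URL_ATTRS, SAFE_SCHEMES, url_scheme() and find_unsafe_attributes() are names invented for this example, and the check is not a complete defence against script injection (style attributes, for instance, are not examined).

```python
from html.parser import HTMLParser

URL_ATTRS = {"href", "src"}                      # attributes that carry a URL
SAFE_SCHEMES = {"", "http", "https", "mailto"}   # "" means a relative URL


def url_scheme(value):
    """Return the lower-cased scheme of a URL, or "" if it has none."""
    # Drop spaces and control characters, which browsers may ignore in a scheme.
    cleaned = "".join(ch for ch in value if ch > " ").lower()
    head, sep, _ = cleaned.partition(":")
    # A colon after "/", "?" or "#" belongs to the path or query, not a scheme.
    if sep and not any(c in head for c in "/?#"):
        return head
    return ""


class AttributeChecker(HTMLParser):
    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.errors = []

    def handle_starttag(self, tag, attrs):
        # The parser lower-cases attribute names and unescapes their values.
        for name, value in attrs:
            if name.startswith("on"):
                self.errors.append(f"event handler not allowed: {name} on <{tag}>")
            elif name in URL_ATTRS and url_scheme(value or "") not in SAFE_SCHEMES:
                self.errors.append(f"unsafe URL in {name} on <{tag}>")


def find_unsafe_attributes(html):
    """Return a list of problems found; an empty list means nothing was flagged."""
    checker = AttributeChecker()
    checker.feed(html)
    checker.close()
    return checker.errors
```

For example, find_unsafe_attributes('<a href="#" onclick="steal()">x</a>') flags the onclick attribute, and a href of " JaVaScript:alert(1)" is flagged despite the leading space and mixed case, which is the kind of variation that makes a pure regular-expression filter hard to get right.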


--
Pete
====
Audio compression components, DIB graphics controls, FastStrings
http://www.droopyeyes.com

Read or write articles on just about anything
http://www.HowToDoThings.com

My blog
http://blogs.slcdug.org/petermorris/
 

Peter Blum

Hi Peter,

I wrote a commercial product that specifically addresses this. Visual Input
Security (http://www.peterblum.com/vise/home.aspx) provides new validators
that handle SQL injection and Script injection attacks. The validators look
at all inputs: visible controls, hidden fields, query strings, and cookies.
They look for illegal tags, tags you just want to avoid, and embedded
javascript in legal tags. They log errors and notify you via email. They can
block access to a page after the hacker makes several attempts.

--- Peter Blum
www.PeterBlum.com
Email: (e-mail address removed)
Creator of "Professional Validation And More" at
http://www.peterblum.com/vam/home.aspx
 
