Validating user HTML input

Discussion in 'ASP .Net' started by Peter Morris [Air Software Ltd], Jan 2, 2005.

  1. Hi all

    I want to allow users to enter HTML, but I want to ensure that

    A) The HTML entered only contains a subset of HTML tags (a, img, div, etc.)
    and not certain other elements (html, body, script, etc.).

    B) The HTML is syntactically correct, so a <li> would have a corresponding
    </li>, a <td> would have a </td>, etc.

    I'm pretty confident that .net already has something to do this, but I have
    no clue what. Can anyone help?

    Thanks


    --
    Pete
    ====
    Read or write articles on just about anything
    http://www.HowToDoThings.com

    My blog
    http://blogs.slcdug.org/petermorris/
     
    Peter Morris [Air Software Ltd], Jan 2, 2005
    #1

  2. Greg Burns

    Regular expressions can do this sort of thing. That's where I would be
    investigating.
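
A rough sketch of what the regular-expression route might look like for requirement A only. It is written in Python for brevity rather than in a .NET language, and the allow-list and the function name are assumptions made for illustration; the thread names only a, img and div as examples of permitted tags. It collects tag names and does not check nesting or attributes.

```python
import re

# Hypothetical allow-list; adjust to whatever subset you want to permit.
ALLOWED_TAGS = {"a", "img", "div", "p", "ul", "ol", "li", "table", "tr", "td"}

# Captures the name of an opening or closing tag, e.g. "<div" or "</li".
TAG_NAME = re.compile(r"</?([a-zA-Z][a-zA-Z0-9]*)")


def disallowed_tags(html):
    """Return the sorted list of tag names in `html` that are not allowed."""
    found = {name.lower() for name in TAG_NAME.findall(html)}
    return sorted(found - ALLOWED_TAGS)
```

The pattern is deliberately conservative: it also flags tag-like text inside attribute values, and it says nothing about whether the tags are balanced.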

    Greg
     
    Greg Burns, Jan 2, 2005
    #2

  3. Peter Morris [Air Software Ltd] wrote:


    > I want to allow users to enter HTML, but I want to ensure that


    > B) The HTML is syntactically correct, so a <li> would have a corresponding
    > </li>, a <td> would have a </td>, etc.


    According to
    <http://www.w3.org/TR/html4/struct/lists.html>
    the closing </li> is optional, and according to
    <http://www.w3.org/TR/html4/struct/tables.html>
    the closing </td> is optional too, so an HTML syntax checker would be
    wrong to complain about missing closing </li> or </td> tags.

    As for HTML syntax checking with .NET, perhaps Tidy can help with that:
    <http://users.rcn.com/creitzel/tidy.html#dotnet>
    I am not sure, however, whether it will help if you only want to allow a
    subset of HTML, but maybe you can write a custom DTD and have Tidy use that.
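
To illustrate the point about optional end tags, here is a minimal sketch of a parser-based check that enforces an allow-list and tag balance while tolerating a missing </li> or </td>. It uses Python's standard html.parser rather than Tidy or any .NET API, and the tag sets, class name and function name are all assumptions made for the example.

```python
from html.parser import HTMLParser

# Hypothetical sets; the thread does not define the permitted subset.
ALLOWED = {"a", "img", "br", "div", "p", "ul", "ol", "li", "table", "tr", "td"}
VOID = {"img", "br"}                    # elements that never have an end tag
OPTIONAL_END = {"li", "td", "tr", "p"}  # HTML 4 allows these end tags to be omitted


class SubsetChecker(HTMLParser):
    def __init__(self):
        super().__init__()
        self.stack = []   # currently open elements
        self.errors = []

    def handle_starttag(self, tag, attrs):
        if tag not in ALLOWED:
            self.errors.append("disallowed tag: " + tag)
        elif tag not in VOID:
            self.stack.append(tag)

    def handle_endtag(self, tag):
        if tag not in ALLOWED or tag in VOID:
            return  # disallowed tags are reported once, at their start tag
        # Implicitly close open elements whose end tag may be omitted.
        while self.stack and self.stack[-1] != tag and self.stack[-1] in OPTIONAL_END:
            self.stack.pop()
        if self.stack and self.stack[-1] == tag:
            self.stack.pop()
        else:
            self.errors.append("unexpected end tag: " + tag)


def check(html):
    """Return a list of problems; an empty list means the fragment passed."""
    checker = SubsetChecker()
    checker.feed(html)
    checker.close()
    unclosed = [t for t in checker.stack if t not in OPTIONAL_END]
    return checker.errors + ["unclosed tag: " + t for t in unclosed]
```

With these sets, `check("<ul><li>one<li>two</ul>")` returns an empty list, while a fragment that leaves an `<a>` open inside a closed `<div>` is reported. This sketch does not look at attributes at all.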


    --

    Martin Honnen
    http://JavaScript.FAQTs.com/
     
    Martin Honnen, Jan 2, 2005
    #3
  4. Hi

    It seems a bit too complicated for RegEx to me (or for me in RegEx). Not
    only would I want to prevent <script> inserts, validate the input, etc., but
    I would also want to prevent JavaScript being inserted as a click event
    handler, or as the value of some other attribute on an HTML element.

    What do you think?
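
The attribute concern above could be screened for along these lines. This is only a sketch in Python using the standard html.parser, not a .NET API; the names ScriptAttrFinder, script_attributes and URL_ATTRS are made up for the example, and it covers just two vectors: on* event handler attributes and javascript: URLs in href or src.

```python
from html.parser import HTMLParser

# Assumed list of URL-carrying attributes; a real filter would cover more.
URL_ATTRS = {"href", "src"}


class ScriptAttrFinder(HTMLParser):
    def __init__(self):
        super().__init__()
        self.findings = []  # (tag, attribute) pairs that look script-bearing

    def handle_starttag(self, tag, attrs):
        # html.parser lowercases tag and attribute names for us.
        for name, value in attrs:
            if name.startswith("on"):
                self.findings.append((tag, name))
            elif name in URL_ATTRS and value is not None:
                # Strip whitespace and control characters before comparing,
                # so "java\tscript:" or a leading space does not slip through.
                compact = "".join(ch for ch in value if ch > " ").lower()
                if compact.startswith("javascript:"):
                    self.findings.append((tag, name))


def script_attributes(html):
    """Return (tag, attribute) pairs that appear to carry script."""
    finder = ScriptAttrFinder()
    finder.feed(html)
    finder.close()
    return finder.findings
```

A stricter design would invert the test and accept only attribute names and URL schemes from an explicit allow-list, since a deny-list like this one is easy to leave incomplete.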


    --
    Pete
    ====
    Audio compression components, DIB graphics controls, FastStrings
    http://www.droopyeyes.com

    Read or write articles on just about anything
    http://www.HowToDoThings.com

    My blog
    http://blogs.slcdug.org/petermorris/
     
    Peter Morris [Air Software Ltd], Jan 2, 2005
    #4
  5. Peter Blum

    Hi Peter,

    I wrote a commercial product that specifically addresses this. Visual Input
    Security (http://www.peterblum.com/vise/home.aspx) provides new validators
    that handle SQL injection and Script injection attacks. The validators look
    at all inputs: visible controls, hidden fields, query strings, and cookies.
    They look for illegal tags, tags you just want to avoid, and embedded
    javascript in legal tags. They log errors and notify you via email. They can
    block access to a page after the hacker makes several attempts.

    --- Peter Blum
    www.PeterBlum.com
    Creator of "Professional Validation And More" at
    http://www.peterblum.com/vam/home.aspx
     
    Peter Blum, Jan 3, 2005
    #5
  6. Peter Morris [Air Software Ltd], Jan 9, 2005
    #6