How can I deactivate paste in a rich text edit box?

Discussion in 'Javascript' started by Seth Russell, Sep 21, 2005.

  1. Seth Russell

    Seth Russell Guest

    I'm running Kevin Roth's RTE box and I want to deactivate the ability
    to paste inside the box. People sometimes paste outrageous things in
    there that might break my site. How can I deactivate the ability to
    paste?

    see: http://www.kevinroth.com/rte/demo.htm

    Thanks for your help
    Seth Russell
     
    Seth Russell, Sep 21, 2005
    #1

  2. "Seth Russell" <> writes:

    > I'm running Kevin Roth's rte box


    I don't know what it is, but it probably doesn't work in my browser
    anyway ... checking ... well, at least I can write HTML in it.

    > and I want to deactivate the ability to paste inside the box. People
    > sometimes paste outrageous things in there that might break my site.
    > How can I deactivate the ability to paste?


    That's probably not the best way to solve the problem. Pasting is
    a useful operation, and disabling it will be guaranteed to annoy
    some users eventually. Also remember, anything that can be pasted,
    can also be written manually, so if someone wants to break your
    site, they still can (or if need be, they'll fake an HTTP POST
    of the bad content).

    If your application has a problem with malformed input, it should
    scan for exactly that, on the server, before using the input for
    anything else.

    That is a general principle in client/server programming on the
    internet ... don't trust the client. The responsibility for preventing
    site breakage should lie in a place that you can trust, which means
    the server.

    /L
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
     
    Lasse Reichstein Nielsen, Sep 21, 2005
    #2

  3. Seth Russell

    Seth Russell Guest

    >I don't know what it is, but it probably doesn't work in my browser
    >anyway ... checking ... well, at least I can write HTML in it.


    Not in my version of it; I suppressed the "look at html" check box.
    Did the WYSIWYG not work in your browser? Which browser is that?

    > Also remember, anything that can be pasted,
    > can also be written manually,


    Not really, you can't write HTML (in my version)

    > If your application has a problem with malformed input, it should
    > scan for exactly that, on the server, before using the input for
    > anything else.


    Yes, yes ... care to point me to a routine in PHP that does that?
    It needs to
    * disallow all scripts
    * disallow broken HTML - this is going out on an Atom/RSS feed and
    needs to be perfect XHTML

    Seth Russell
     
    Seth Russell, Sep 21, 2005
    #3
  4. Seth Russell

    Seth Russell Guest

    PS: What I really want it to do is to strip all HTML just from the
    paste input. It should function exactly like the box here at
    Google Groups. I want just what you get if you select all on a web
    page, go to WordPad, and paste.

    Seth Russell
     
    Seth Russell, Sep 21, 2005
    #4
  5. Seth Russell

    Guest

    Seth Russell wrote:
    >
    > Not really, you can't write HTML (in my version)


    right. You're sending the user a program - a javascript program - and
    saying, "please run this and send me the results." Then when you get
    the results, you just assume that they are correct? Why? Because the
    user was nice and ran your javascript program?

    See, the thing is, a person can create their own little web page with a
    form in it that submits to *your* page. Do you understand? The kind
    of person that you're worried about, the kind of person who'd cut and
    paste HTML, is certainly the kind of person who is technically capable
    of this simple task.

    You *have* to check the input. You have to. It's not optional. It's
    not a nice thing that you'll do later, after you get the rest of the
    application working. You have to do it now. Checking the input is
    more important than the user interface. It's more important than that
    rich-text edit box. Whatever it is that you're developing, it will
    NEVER be secure until you check and correct the input.

    I'm sorry, but this is web programming 101. It's really something that
    you need to understand before you even get started.

    > Yes, yes ... care to point me to a routine in php that does that.


    When you say, "yes, yes" it kind of sounds like you're blowing the guy
    off. He gave you good advice. You need to listen to it. Stop
    whatever you're doing and fix the input on the server side.

    For starters, you could remove all less-than signs.
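
As a rough illustration of that starting point, here is a minimal JavaScript sketch. The function names are made up for this example and are not part of the RTE script; on a PHP server the built-in `htmlspecialchars` does a comparable job. The first function drops every less-than sign outright; the second keeps the characters but escapes them, so the text survives without being able to open a tag.

```javascript
// Hypothetical helper names; not part of the RTE script discussed above.

// Crude version of the suggestion: drop every "<" so no tag can start.
function stripLessThan(text) {
  return text.replace(/</g, "");
}

// Gentler variant: keep the characters but escape them as entities.
// "&" is replaced first so the entities added afterwards are not re-escaped.
function escapeText(text) {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}
```

Either one throws away all rich text markup, so it only suits fields that are meant to be plain text (titles, for instance).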
     
    , Sep 21, 2005
    #5
  6. Seth Russell

    Seth Russell Guest

    > You *have* to check the input. You have to. It's not optional. It's
    > not a nice thing that you'll do later, after you get the rest of the
    > application working. You have to do it now. Checking the input is
    > more important than the user interface. It's more important than that
    > rich-text edit box. Whatever it is that you're developing, it will
    > NEVER be secure until you check and correct the input.


    Ok, I got it. I guess I suspected this all along and just needed
    somebody with experience to tell me. Thanks.

    Sorry if it sounded like I was blowing Nielsen off; I really do need
    to find a good sanitizer. The problem is finding a good one and
    finding the correct point in the program to execute it. Obviously I
    cannot apply the same sanitizing to the output of the RTE box that is
    submitted to me as I do to the input from the paste; otherwise I would
    lose all the rich text markup.

    The problem is that I'm pretty OK with PHP, but JavaScript is a foreign
    language that I am just now learning. How can I preprocess the data
    coming into the RTE box from the client's clipboard? And where is there
    a good checking routine for the final output from the RTE box?

    Thanks for your help ...

    Seth Russell
     
    Seth Russell, Sep 21, 2005
    #6
  7. "Seth Russell" <> writes:

    >>I don't know what it is, but it probably doesn't work in my browser
    >>anyway ... checking ... well, at least I can write HTML in it.

    >
    > Not in my version of it, i suppressed the "look at html" check box.
    > Did the wysiwyg not work in your browser? Which browser is that?


    Opera. It doesn't have formatted text input functionality. I don't
    know if any browser except IE and Mozilla-based ones have such a
    proprietary feature.

    > Yes, yes ... care to point me to a routine in php that does that.
    > Needs to
    > * disallow all scripts
    > * disallow broken html - this is going out on a atom \ Rss feed and
    > needs to be perfect XHTML


    I'd go the safer way and choose what to allow, not what to deny.
    Any text formatting tags should be retained (b, i, u, em, strong,
    br, perhaps even p). No attributes should be allowed (no event
    handlers or style attributes[1], and the rest doesn't really matter
    then). If any of these elements are not closed, it's not a big deal,
    but you could count starts and ends and add missing ends.

    So in Javascript, I would do something like:
    ---
    // list of allowed tag names
    var allowed = ['b','i','u','em','strong','br'];
    // RegExp matching any text followed by a tag or the end of the string
    // ([\s\S] also matches newlines; "/" must be escaped inside a literal)
    var tagRE = /([\s\S]*?)(<(\/?)(\w+)\b[^>]*>|$)/g;
    // RegExp matching allowed tag names
    var validRE = new RegExp("^("+allowed.join("|")+")$");

    // replace all non-allowed tags and make sure all allowed tags are closed
    function sanitize(html) {
      // stack of open tags
      var open = [];
      // for each tag, replace with ...
      return html.replace(tagRE, function(_, before, tag, end, name) {
        var result = [], top;
        // escape < and & in non-tag text.
        before = before.replace(/&/g,"&amp;").replace(/</g,"&lt;");
        if (name) { // contains a tag - not end of string
          name = name.toLowerCase();
          if (validRE.test(name)) { // allowed tag
            if (name == "br") { // empty element: emit self-closed, never push
              return end ? before : before+"<br />";
            }
            if (!end) { // allowed start tag
              open.push(name);
              return before+"<"+name+">";
            } else { // allowed end tag
              // drop an end tag that matches no open tag
              if (open.lastIndexOf(name) < 0) { return before; }
              result.push(before);
              // close everything opened after the matching start tag
              while ((top = open.pop())) {
                result.push("</",top,">");
                if (top == name) { break; }
              }
              return result.join("");
            }
          } else { // unallowed tags.
            return before;
          }
        } else { // end of string: close whatever is still open
          result.push(before);
          while (open.length > 0) {
            result.push("</",open.pop(),">");
          }
          return result.join("");
        }
      });
    }
    ---
    I.e., pick out tags and in-between text, escape all "<" and "&" in text,
    remove all unallowed tags, remove all attributes from allowed tags,
    and close all open tags correctly (remove incorrect closing tags).

    While this might not give exactly what an author intended for some
    invalid HTML, he really has only himself to blame :)

    I have no idea how to convert this to PHP, but a competent PHP'er will
    probably know how.
    /L

    [1] Yes, style attributes can be dangerous too (works in, at least, IE):
    <b style="background-image:
    url(javascript:document.location.href='http://mysexsite.example.com/')">
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
     
    Lasse Reichstein Nielsen, Sep 21, 2005
    #7
  8. Seth Russell

    Guest

    Lasse Reichstein Nielsen wrote:
    > So in Javascript, I would do something like:


    My question is, what good is javascript in this situation? He still
    has to check the input on the server side, before he puts it in his
    database. He's got to do it in php.
     
    , Sep 21, 2005
    #8
  9. "" <> writes:

    > Lasse Reichstein Nielsen wrote:
    >> So in Javascript, I would do something like:

    >
    > My question is, what good is javascript in this situation?


    It's a functional description of an algorithm in a language that is
    on-topic for this newsgroup. It might even be used on the client side
    to preview what the final result will be, for non-malicious users.

    > He still has to check the input on the server side, before he puts
    > it in his database. He's got to do it in php.


    Agree completely that it has to be used server side, in whatever
    language the server side uses (which could be Javascript, but the
    original poster appears to use PHP).

    It's easier to translate an existing function into a new language than
    to write one from scratch, and with a javascript version (as opposed
    to a pseudocode description), you can even test that the translation
    gives the same results.

    /L
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
     
    Lasse Reichstein Nielsen, Sep 21, 2005
    #9
  10. Seth Russell

    Guest

    I understand.
     
    , Sep 21, 2005
    #10
  11. Seth Russell

    Seth Russell Guest

    > It's easier to translate an existing function into a new language than
    > to write one from scratch, and with a javascript version (as opposed
    > to a pseudocode description), you can even test that the translation
    > gives the same results.


    Yes, I can definitely translate from the JavaScript to PHP. Thanks for
    the code :)

    The problem I still have is that I want to send the client a checker
    (probably exactly the one you have given me above), but I don't know
    where to install it in the JavaScript
    (http://fastblogit.com/add/richtext.js) that is running the RTE box
    so that it intervenes between the client's clipboard and the box when
    they paste. That same routine won't be applied to the output at the
    server, because then it would eliminate all the nice rich text editing,
    right?

    Seth Russell
     
    Seth Russell, Sep 21, 2005
    #11
  12. Seth Russell

    Seth Russell Guest

    Just so I'm not misunderstood: I know I need to check the XHTML coming
    back on the server side and disallow everything that was not allowed
    from the client's paste or generated by the RTE box logic itself. So
    there are two different checks: (1) between the client's clipboard and
    the RTE box, for which I can use the routine above almost verbatim (I
    just don't know where the intervention point is); and (2) back at the
    server, sanitize everything not coming from the gadgets in the RTE box
    or the allowable HTML to be pasted.

    Hmmm ... does that make sense ?

    Seth
     
    Seth Russell, Sep 21, 2005
    #12
  13. "Seth Russell" <> writes:

    > Just so I'm not misunderstood: I know I need to check on the server side
    > the XHTML coming back and disallow everything that was not allowed from
    > the client's paste or generated by the RTE box logic itself. So
    > there are 2 different checks (1) between the client's clipboard and the
    > RTE box -


    ....

    > Hmmm ... does that make sense ?


    Somewhat. Why do you need to prevent the user from pasting "bad" HTML?
    If the server removes the badness anyway, there is no problem in the end.

    You might wait until submission time, and remove extraneous HTML tags
    before submitting, but again, a normal user won't need it, because he
    only submits "nice" HTML, and the malicious malefactor will disable
    the javascript anyway.

    The *only* reason to do anything on the client side is to help the
    normal user. Since he isn't pasting bad HTML anyway, there is no need
    to do anything. Even if he manages to paste bad HTML, the server will
    remove it and (hopefully) display his result to him.

    /L
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
     
    Lasse Reichstein Nielsen, Sep 21, 2005
    #13
  14. "Seth Russell" <> writes:

    > The problem i still have is that i want to send the client a checker
    > (probably exactly the one you have given me above) but i don't know
    > where to install it in the javascript


    Why? Or, more precisely: What problem are you trying to solve by that?

    Remember that in non IE/Mozilla browsers, the textarea will be just
    that, a plain HTML textarea.

    If you really want to do validation of the field, either do it at
    submission time, or in an "onchange" handler on the textarea.
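
A minimal sketch of wiring up both of those hooks, assuming `form` and `field` are the DOM elements for the form and its textarea and `clean` is whatever sanitizing function you settle on. None of these names come from richtext.js; they are placeholders for illustration.

```javascript
// Assumptions: `form` is the <form> element, `field` is its <textarea>,
// and `clean` is a string -> string sanitizing function of your choice.
function attachCleaner(form, field, clean) {
  function run() {
    var cleaned = clean(field.value);
    // Only write back when something changed, so an untouched value is left alone.
    if (cleaned !== field.value) {
      field.value = cleaned;
    }
  }
  field.addEventListener("change", run); // fires when an edited field loses focus
  form.addEventListener("submit", run);  // last chance before the data is sent
  return run;
}
```

This only tidies things up for well-behaved users; anyone can skip the script, so the server still has to run its own check on whatever arrives.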

    Also remember to check all fields, including both title fields,
    since they can also contain malicious HTML.

    > (http://fastblogit.com/add/richtext.js) that is running the RTE box
    > such that it will intervene between the client's paste of their
    > clipboard.


    Shouldn't be possible. Pasting is just a way of adding a lot of text
    without typing it one character at a time, but

    > That same routine won't be applied on the output at the
    > server because then it would eliminate all the nice rich text editing,
    > right ?


    The idea was to remove bad HTML from the input, which means that it
    never gets any further.

    I have now checked the site, and can see that more formatting is
    allowed than what my script would let through. The colors are set
    using spans with style attributes, so that should be allowed too ...
    and then you can't throw away all attributes, or even all style
    attributes, so a more precise filtering is needed.

    What you need to remove is then, at least:
    Scripts:
    * all script elements.
    * all intrinsic event handlers (any attribute starting with "on" should do)
    * all script urls (any url starting with a protocol not http or ftp,
    both in links and image elements, and in style attributes)
    * any iframe or object element (could embed another page with scripts).
    Malicious HTML:
    * any opening or closing comment
    * any closing tag not matching an opening tag (throw in a </table> and see :).
    * any starting tag not closed (especially those with CDATA content,
    i.e., script, style and textarea)

    With those gone, I'm fairly sure there is no scripting left, and the
    HTML can be contained in its div (adding </table> or </div> could otherwise
    mess up the layout).
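
Two items from that list (event-handler attributes and script URLs) can be sketched as follows. This is an illustration under assumptions, not a complete filter: the helper names are hypothetical, `attrs` is assumed to be a plain object of attribute names and already entity-decoded values produced by whatever parser you use, `https:` is added to the allowed protocols alongside the `http` and `ftp` mentioned above, and style attributes are not inspected at all.

```javascript
// Hypothetical helpers; `attrs` is assumed to hold already-decoded attribute values.

// Protocols allowed in links and images (https: is an addition to the list above).
var SAFE_PROTOCOLS = ["http:", "https:", "ftp:"];

function isSafeUrl(url) {
  // Remove whitespace and control characters, which could otherwise hide a scheme.
  var compact = String(url).replace(/[\u0000-\u0020]/g, "");
  var match = /^([a-z][a-z0-9+.\-]*:)/i.exec(compact);
  if (!match) {
    return true; // no scheme at all: a relative URL
  }
  return SAFE_PROTOCOLS.indexOf(match[1].toLowerCase()) !== -1;
}

function filterAttributes(attrs) {
  var kept = {};
  for (var name in attrs) {
    if (!Object.prototype.hasOwnProperty.call(attrs, name)) { continue; }
    var lower = name.toLowerCase();
    // Intrinsic event handlers: anything starting with "on".
    if (lower.indexOf("on") === 0) { continue; }
    // Script URLs in links and images.
    if ((lower === "href" || lower === "src") && !isSafeUrl(attrs[name])) { continue; }
    kept[lower] = attrs[name];
  }
  return kept;
}
```

The same logic would have to live on the server (in PHP, in this thread's case) to count as protection; a client-side copy is only a preview.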

    I have done some testing (as anonymous user aaaaej) which messes things
    up quite badly (I think I deleted them now, or maybe the Wizzard did :).

    /L
    --
    Lasse Reichstein Nielsen -
    DHTML Death Colors: <URL:http://www.infimum.dk/HTML/rasterTriangleDOM.html>
    'Faith without judgement merely degrades the spirit divine.'
     
    Lasse Reichstein Nielsen, Sep 21, 2005
    #14
  15. Seth Russell

    Donius Guest

    > The *only* reason to do anything on the client side is to help the
    > normal user. Since he isn't pasting bad HTML anyway, there is no need
    > to do anything. Even if he manages to paste bad HTML, the server will
    > remove it and (hopefully) display his result to him.

    I disagree here! I run some CMSes for some customers who are, let's
    see... not technically savvy. I wish everyone could be on the up and
    up, but they aren't always. One thing they constantly did that
    fouled up the system, before we instituted some scripting (both server
    and client side), was paste content from MS Word. This looks like HTML.
    Smells like HTML. But it breaks our editor and the resulting web pages
    like there's no tomorrow.

    So, Mr. Russell, to tell you what we did: on the JS end, we tested with
    a regex on pasting and submission for basic valid XHTML (google for
    something that will work, I did!), and then did something similar on the
    back end of our server before submission.

    Hope that helps!

    -Brendan
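
For the "on pasting" hook, one possible shape in current browsers is the `paste` event, whose `clipboardData.getData("text/plain")` returns the plain-text flavour of the clipboard. The sketch below is an assumption about how this could be done today; it is not what the 2005 RTE script did and not necessarily what the setup described above does. `editor` stands for the editable element and `insertText` is a hypothetical callback that places a string at the caret in your editor.

```javascript
// Assumptions: `editor` is the editable DOM element; `insertText` is a
// hypothetical callback that inserts a string at the caret.
function forcePlainTextPaste(editor, insertText) {
  editor.addEventListener("paste", function (event) {
    var data = event.clipboardData;
    if (!data) { return; }                   // no clipboard access: leave the default paste alone
    event.preventDefault();                  // cancel the rich paste
    insertText(data.getData("text/plain"));  // insert only the plain-text flavour
  });
}
```

This gives roughly the "paste into WordPad" behaviour asked for earlier in the thread, including for Word content, but it remains a convenience for honest users and does not replace the server-side check.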
     
    Donius, Sep 22, 2005
    #15
