Paste as plain text from Word

Discussion in 'Javascript' started by Flyzone, Apr 22, 2008.

  1. Flyzone

    Flyzone Guest

    Hello, i'm trying to paste text copied from Word into an input box.
    The text is saved into an Oracle db and then used as text in another
    javascript.
    The problem is that the saved text (encoded and decoded in the
    db to avoid sql injection) has some special chars that block the
    javascript execution (i think they are unicode chars).

    So i would like to detect and delete these chars with a javascript
    function (i can't disable copy and paste), because if i paste the text
    copied from Word into Windows Notepad, then copy it from Notepad and
    paste it into my form, i don't have the problem.

    Until now i used:
    re = /\$|,|`|\'|\||\\|\!|\./g;
    return str.replace(re, "");

    But it is impossible to list all chars; is there any function that
    detects whether a char is unicode or plain text, or a "paste special"?

    Thanks in advance
    Flyzone, Apr 22, 2008
    #1

  2. Erwin Moller

    Erwin Moller Guest

    Flyzone schreef:
    > Hello, i'm trying to paste copied text from word into an input box.
    > This text is saved into a oracle db and then used as text in another
    > javascript.
    > The problem is that using the saved text (encoded and decoded in the
    > db to avoid sql injection) have some special char that block the
    > javascript execution (i think is unicode char).
    >
    > So i would like to detect and delete this char with a javascript
    > function (i can't disable copy and paste), cause if i paste the text
    > copied in word into window notepad and then i copy from notepad ad i
    > paste again in my form i don't have problem.
    >
    > Until now i used:
    > re = /\$|,|`|\'|\||\\|\!|\./g;
    > return str.replace(re, "");
    >
    > But is impossible to put all chars; does exist any function that
    > detect if a char is unicode or plain text, or a paste special?
    >
    > Thanks in advance


    Hi,

    Wouldn't it be easier to support unicode instead of trying to strip the
    content?

    Regards,
    Erwin Moller
    Erwin Moller, Apr 22, 2008
    #2

  3. Flyzone

    Flyzone Guest

    On 22 Apr, 11:24, Erwin Moller wrote:
    > Flyzone schreef:
    >
    > Wouldn't it be easier to support unicode instead of trying to strip the
    > content?


    Maybe, but it is not so clean to put some 'microsoft word' trash in
    the db...
    But what does supporting the charset mean? I have the string urlencoded
    in the db, and i try

    <?php
    print "document.object.value='".urldecode($string)."';";
    ?>

    which in javascript becomes

    document.object.value='string_plus_chartrash
    other trash
    ';

    As far as i know, the urldecode function in php is different from the
    one in javascript, so i'd need to write a urldecode function for
    javascript... that is no easier than cleaning the string before the
    db insert.
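
    On the decoding worry above: a minimal sketch, assuming the stored value came from PHP's urlencode() applied to UTF-8 text. The helper name phpUrlDecode is made up for illustration. It relies on the built-in decodeURIComponent(), which throws a URIError when the percent-encoded bytes are not valid UTF-8.

```javascript
// Hypothetical helper (not from the thread): decode a string that was
// produced by PHP's urlencode() on UTF-8 text.
function phpUrlDecode(encoded) {
  // PHP's urlencode() writes a space as "+", which decodeURIComponent()
  // leaves untouched, so turn every "+" into "%20" first. A literal plus
  // sign arrives as "%2B" and is therefore not affected.
  return decodeURIComponent(encoded.replace(/\+/g, "%20"));
}

var decoded = phpUrlDecode("Word+text%3A+%E2%80%9Cquoted%E2%80%9D%0Asecond+line");
// JSON.stringify makes the embedded newline visible as \n in the log.
console.log(JSON.stringify(decoded));
```

    The decoded value still contains the newline and the curly quotes, so decoding alone does not make the text safe to drop into a quoted JavaScript string.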
    Flyzone, Apr 22, 2008
    #3
  4. Erwin Moller

    Erwin Moller Guest

    Flyzone schreef:
    > On 22 Apr, 11:24, Erwin Moller wrote:
    >> Flyzone schreef:
    >>
    >> Wouldn't it be easier to support unicode instead of trying to strip the
    >> content?

    >
    > Maybe, but is not so clean to put in the db some 'microsoft word'
    > trash...
    > However supporting the charset means? I have the string in db
    > urlencoded, i try
    >
    > <?php
    > print "document.object.value='".urldecode($string)."';";
    > ?>
    >
    > that in javascript became
    >
    > document.object.value='string_plus_chartrash
    > other trash
    > ';
    >
    > For what i know urldecode function in php is different from that in
    > javascript, so i'll need to write a urldecode function for
    > javascript....not more easy than clean the string before the db
    > insert..


    Agree, I see your point.
    Maybe somebody else can help. I have little experience with unicode and
    JavaScript.

    Regards,
    Erwin Moller
    Erwin Moller, Apr 22, 2008
    #4
  5. Flyzone wrote:
    > Hello, i'm trying to paste copied text from word into an input box.
    > This text is saved into a oracle db and then used as text in another
    > javascript.
    > The problem is that using the saved text (encoded and decoded in the
    > db to avoid sql injection)


    I daresay that is a wrong and therefore potentially dangerous solution. An
    SQL injection attack *cannot* be prevented by storing the data encoded in
    the database; prevention has to take place before the data is stored,
    when the arguments are passed to the (server-side) database
    modification feature, by properly escaping the delimiter characters in the
    query string that could be exploited in injection code. I find it hard to
    believe that the Oracle API for your programming language does not offer
    something like PHP does for MySQL with mysql_real_escape_string().

    http://php.net/mysql_real_escape_string

    > have some special char that block the javascript execution (i think
    > is unicode char).


    What would "unicode char" be? See below.

    > So i would like to detect and delete this char with a javascript
    > function (i can't disable copy and paste), cause if i paste the text
    > copied in word into window notepad and then i copy from notepad ad i
    > paste again in my form i don't have problem.


    ISTM your problem is that you are trying to fix the issues that have arisen
    because you have been implementing a wrong and potentially dangerous
    solution. And that you are apparently unable to spell the English pronoun
    `I' properly.

    > Until now i used:
    > re = /\$|,|`|\'|\||\\|\!|\./g;


    Eeek.

    var re = /[!$',.\\`|]/g;

    There is no variable required anyway, you can use the RegExp literal as
    argument as it is:

    > return str.replace(re, "");


    return str.replace(/[!$',.\\`|]/g, "");

    > But is impossible to put all chars;


    It is possible: /[\u0000-\uffff]/g

    Depending on what you mean by "all", it may be possible to specify other
    character ranges. But I think either would be unnecessary overkill here.

    http://developer.mozilla.org/en/doc...Exp#Special_characters_in_regular_expressions
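
    As an illustration of a narrower range than "all", here is a sketch that first maps some characters Word typically inserts to ASCII look-alikes and then removes whatever is still outside printable ASCII. The thread never names the offending characters, so the mapping table and the names WORD_REPLACEMENTS and cleanWordText are assumptions.

```javascript
// Assumed list of typical Word "smart" characters; illustrative only.
var WORD_REPLACEMENTS = {
  "\u2018": "'",   // left single quotation mark
  "\u2019": "'",   // right single quotation mark
  "\u201C": '"',   // left double quotation mark
  "\u201D": '"',   // right double quotation mark
  "\u2013": "-",   // en dash
  "\u2014": "-",   // em dash
  "\u2026": "...", // horizontal ellipsis
  "\u00A0": " "    // no-break space
};

function cleanWordText(str) {
  // Step 1: replace the known characters with their ASCII look-alikes.
  var mapped = str.replace(
    /[\u2018\u2019\u201C\u201D\u2013\u2014\u2026\u00A0]/g,
    function (ch) { return WORD_REPLACEMENTS[ch]; }
  );
  // Step 2: drop everything that is not tab, LF, CR or printable ASCII.
  return mapped.replace(/[^\t\n\r\x20-\x7E]/g, "");
}

console.log(cleanWordText("\u201CHello\u201D \u2013 it\u2019s fine\u2026 caf\u00E9"));
// prints: "Hello" - it's fine... caf
```

    Note that step 2 also deletes legitimate letters such as the accented e in the sample, which is one reason to prefer escaping over stripping.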

    > does exist any function that detect if a char is unicode or plain text,
    > or a paste special?


    No. Firstly, it is a misconception to believe that Unicode characters
    are not plain text, or that there are "paste specials".

    Secondly, all strings are stored as sequences of 16-bit UTF-16 code units
    from JavaScript 1.3, ECMAScript Edition 3 forward, and so they must represent
    characters in the Unicode character set. Whether some of those characters
    are also part of other character sets, most notably that supported by the
    7-bit US-ASCII encoding, is irrelevant.

    http://en.wikipedia.org/wiki/Unicode

    Thirdly, ECMAScript implementations are Unicode-safe from Edition 3 forward.
    That includes identifiers. So the only probable cause is that your decoded
    string contains characters that are interpreted as control characters, like
    newline, in the generated code or in eval(). In that case you should either
    not use eval() or escape those characters, e.g. replace "\n" with "\\n".

    Like I said, you should forego the idea of encoding all the information in
    your database completely (unless there are further security requirements to
    consider) and properly escape your storing query string instead.
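
    The newline escaping suggested above can be sketched as follows. In this thread the escaping would actually have to happen in the server-side PHP before the value is printed into the page; JavaScript is used here only to show the transformation. The helper name escapeForJsString is hypothetical.

```javascript
// Hypothetical helper: escape text so that it can be placed between quotes
// in generated JavaScript source without ending the literal early.
function escapeForJsString(str) {
  return str
    .replace(/\\/g, "\\\\")        // backslash first, so later escapes survive
    .replace(/'/g, "\\'")          // single quote
    .replace(/"/g, '\\"')          // double quote
    .replace(/\n/g, "\\n")         // line feed
    .replace(/\r/g, "\\r")         // carriage return
    .replace(/\u2028/g, "\\u2028") // line separator
    .replace(/\u2029/g, "\\u2029"); // paragraph separator
}

var raw = "line one\nline 'two' \\ end";
var source = "'" + escapeForJsString(raw) + "'";
console.log(source);
// prints: 'line one\nline \'two\' \\ end'
```

    The resulting literal sits on one line, so an assignment such as document.object.value=... built from it is no longer broken by the pasted line break.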


    HTH

    PointedEars
    --
    Anyone who slaps a 'this page is best viewed with Browser X' label on
    a Web page appears to be yearning for the bad old days, before the Web,
    when you had very little chance of reading a document written on another
    computer, another word processor, or another network. -- Tim Berners-Lee
    Thomas 'PointedEars' Lahn, Apr 22, 2008
    #5
  6. Flyzone

    Flyzone Guest

    On 22 Apr, 22:25, Thomas 'PointedEars' Lahn wrote:
    > query string that could be exploited in injection code. I find it hard to
    > believe that the Oracle API for your programming language does not offer
    > something like PHP does for MySQL with mysql_real_escape_string().


    I do an escape and then urlencode, and the server is internal to us,
    off the internet and without critical data.

    > And that you are apparently unable to spell the English pronoun `I' properly.


    Ahm yes, I know I know.... sorry :p

    > There is no variable required anyway, you can use the RegExp literal as
    > argument as it is:


    It was required, I just didn't say so: I would like to have different
    vars for different uses, like just chars, just numbers, just special
    chars :)
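
    One way to keep separate expressions for separate uses, as a sketch only: the names PATTERNS and strip are illustrative, and the thread does not show the poster's actual variables.

```javascript
// Illustrative names; one global pattern per kind of character.
var PATTERNS = {
  letters: /[A-Za-z]/g,
  digits: /[0-9]/g,
  // The same characters as the expression in the first post,
  // including the backslash.
  special: /[!$',.\\`|]/g
};

// Remove every character of the given kind from the string.
function strip(str, kind) {
  return str.replace(PATTERNS[kind], "");
}

console.log(strip("a1!b2.c\\", "special")); // prints: a1b2c
console.log(strip("a1b2", "digits"));       // prints: ab
```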

    > It is possible: /[\u0000-\uffff]/g


    Gh you solved my problem, your reply is really appreciated. I was wrong
    from the beginning, searching for a "paste special" instead of thinking
    about a blur handler and a function that uses a more suitable regexp.

    > So it would only be probable that your encoded
    > string contains characters that are interpreted as control characters like
    > newline in eval()


    That was the problem with javascript, but some unicode chars are
    still disliked.

    > escape your storing query string instead.


    Yes, I'll do that, but I still need to show the users what they are
    writing into the form and what will be deleted.
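
    A sketch of how such a preview could work, assuming the "keep only printable ASCII" rule is the one being applied. The function name previewCleanup is hypothetical, and wiring it to the form field's blur event is left out.

```javascript
// Hypothetical helper: report both the cleaned text and the characters that
// would be removed. The pattern must carry the g flag so that match()
// returns every occurrence.
function previewCleanup(str, pattern) {
  var removed = str.match(pattern) || []; // null when nothing matches
  return { cleaned: str.replace(pattern, ""), removed: removed };
}

// Format a character as a U+XXXX code point label for display.
function describe(ch) {
  var hex = ch.charCodeAt(0).toString(16).toUpperCase();
  while (hex.length < 4) { hex = "0" + hex; }
  return "U+" + hex;
}

var result = previewCleanup("caf\u00E9 \u201Cmenu\u201D", /[^\t\n\r\x20-\x7E]/g);
console.log(result.cleaned);                         // prints: caf menu
console.log(result.removed.map(describe).join(" ")); // prints: U+00E9 U+201C U+201D
```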

    Thank you again
    Flyzone, Apr 29, 2008
    #6
  7. Flyzone wrote:
    > On 22 Apr, 22:25, Thomas 'PointedEars' Lahn wrote:
    >> query string that could be exploited in injection code. I find it hard to
    >> believe that the Oracle API for your programming language does not offer
    >> something like PHP does for MySQL with mysql_real_escape_string().

    >
    > I do an escape and then urlencode, and the server is internal for us,
    > out the internet and without critical data.


    Nevertheless, the approach of escaping everything obviously requires a
    larger database and less efficient storage and retrieval methods than
    necessary.

    >> So it would only be probable that your encoded
    >> string contains characters that are interpreted as control characters like
    >> newline in eval()

    >
    > That was the problem with javascript, but some unicode chars are
    > however disliked.


    Since Usenet works in both directions, it would be appropriate if you named
    those characters so that others here can benefit from that knowledge.

    >> escape your storing query string instead.

    >
    > Yes I'll do, but however I need to show at the users what they are
    > writing into the form and what will be deleted.


    I don't follow.


    PointedEars
    Thomas 'PointedEars' Lahn, Apr 29, 2008
    #7
