Javascript and special characters

Discussion in 'Javascript' started by Doc, Mar 27, 2006.

  1. Doc

    Doc Guest

    Hello!

    I'm experiencing a little problem counting the number of characters in
    a textarea on an HTML page.

    This is the content type of my HTML document
    content="text/html; charset=iso-8859-1"

    I have a textarea that I want to limit to 400 characters, but the
    end user can enter special characters (like €, √, ∞, ...). I have
    to limit it because I have a limit in my database and I don't want my
    webapp to hang...

    I tried counting the characters with a JavaScript function, but it
    doesn't work with these special characters, as '√' is stored in the DB
    as '&#8730;'. My character count is not right (well, for the DB...) and I
    can get an error when saving.

    Can somebody help me? Can I do it some other way (always checking the
    input on the JSP (HTML) page)?

    Thanks
     
    Doc, Mar 27, 2006
    #1

  2. AndrewTK

    AndrewTK Guest

    If I understand, your problem is:
    - the user enters "mes vacances cet été", for example
    - "é" will become &#(something);
    - you want to count "é" as probably "&#233;", length = 6 (if the DB
      generates a 3-digit number)

    Solution (amongst many):

    on submit, pass the *textarea* to this function:

    function esc(the_textarea) { the_textarea.value = escape(the_textarea.value); }

    this will convert the message to HTTP-url-encoded text and be stored
    as-is on the server. count that text. if you are using continuous
    counting on the page via a JS function count(text), use
    count(escape(text) ) - this will count the text in its escaped form,
    without displaying this to the textarea before the user sends

    when displaying the text in a page, call unescape(text) on your text to
    convert it back to the original text

    if the text is returned by the DB as a page, you'll need to include a
    script in that page to find the text and convert it back. Otherwise, PHP
    can decode URL-encoded text with, I think, urldecode() :)
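
    A minimal sketch of the counting idea above, assuming the server stores
    the escape()d text unchanged. The helper name escapedLength is made up
    for illustration, and escape() is a legacy function, so treat this as an
    approximation rather than a definitive implementation:

```javascript
// Hypothetical helper: length of the text in its escape()d form,
// i.e. the form that would be stored under the assumption above.
function escapedLength(text) {
  return escape(text).length;
}

// "mes vacances cet été", with é written as \u00e9 to avoid
// depending on the encoding of this source file.
const sample = "mes vacances cet \u00e9t\u00e9";

const typedLength = sample.length;           // 20 characters as typed
const storedLength = escapedLength(sample);  // 30: each space -> %20, each é -> %E9

console.log(typedLength, storedLength);
```

    Characters above U+00FF come out of escape() in a %uXXXX form (the
    square root sign becomes %u221A), which is not standard URL encoding
    and may not be understood by server-side URL decoders.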
     
    AndrewTK, Mar 27, 2006
    #2

  3. RobG

    RobG Guest

    AndrewTK wrote:
    > If I understand, your problem is:
    > - the user enters "mes vacances cet été", for example
    > - "é" will become &#(something);
    > - you want to count "é" as probably "&#233;", length = 6 (if the DB
    >   generates a 3-digit number)
    >
    > Solution (amongst many):
    >
    > on submit, pass the *textarea* to this function:
    >
    > function esc(the_textarea) { the_textarea.value = escape(the_textarea.value); }


    If that is what is required (and I'm not sure it is), use
    encodeURIComponent() to count characters as that pretty much emulates what
    will be done to the textarea value when the form is submitted.

    But ultimately what is stored in the DB is up to the server, not the client.
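
    A rough sketch of the encodeURIComponent() approach, with made-up helper
    names (encodedLength, fitsLimit). It is only an approximation of a real
    form submission: encodeURIComponent() always encodes as UTF-8 and writes
    spaces as %20, whereas a submitted form uses the page's encoding and
    usually writes spaces as "+".

```javascript
// Hypothetical helper: length of the text once percent-encoded as UTF-8.
// encodeURIComponent() throws a URIError on a lone surrogate, so input
// containing broken surrogate pairs would need handling before this call.
function encodedLength(text) {
  return encodeURIComponent(text).length;
}

// Hypothetical limit check; the thread mentions a limit of 400.
function fitsLimit(text, limit) {
  return encodedLength(text) <= limit;
}

console.log(encodedLength("\u00e9"));  // é  -> "%C3%A9"    -> 6
console.log(encodedLength("\u221a"));  // √  -> "%E2%88%9A" -> 9
console.log(fitsLimit("a b", 400));    // "a%20b" has length 5 -> true
```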

    [...]


    --
    Rob
     
    RobG, Mar 27, 2006
    #3
  4. Thomas 'PointedEars' Lahn

    Doc wrote:

    > I'm experiencing a little problem counting the number of characters in
    > a textarea on a html page.
    >
    > This is the content type of my HTML document
    > content="text/html; charset=iso-8859-1"


    Most certainly it is not. The content type (here better: encoding,
    referring only to the `charset' label) is specified by the HTTP header
    Content-Type which takes precedence over any declaration with the meta
    element.

    > I tried counting the characters with a javascript function, but it
    > doesn't work with these special characters as '√' is stored in the
    > DB as '&#8730;'


    Information should be stored independently of the output medium (I
    suggest storing "sqrt(...)" instead), and you do not have to use
    the character reference if you declare the correct encoding.


    PointedEars
     
    Thomas 'PointedEars' Lahn, Mar 27, 2006
    #4
  5. Doc

    Doc Guest

    Thanks Andrew, Rob and PointedEars!

    But the escape() or encodeURIComponent() functions convert my
    characters to a '%..' format, and I can't use that as the data is
    accessed by other means (extracted to XML format, ...), and I can't do
    the unescape() or decodeURIComponent().

    The 'é' is stored as 'é'; it is only some special characters that are
    stored in a '&#nnnn;' format, and the aim is to be able to store all
    these special characters (such as square root or infinity or ...). The
    characters stored in this format don't need any conversion when
    rendering the pages or creating the XML document. I'd just like to find
    a function to convert the 'special' characters to this format so I can
    count them as they are to be stored in the DB.

    Is my encoding wrong? I don't understand the difference between the
    HTTP header and the content type, I thought it was the same thing...
    Well I thought <meta http-equiv="Content-Type" content="text/html;
    charset=iso-8859-1"> was the HTTP header and defined the encoding...
    Can you explain what you meant PointedEars please?

    Thanks for your help!
     
    Doc, Mar 28, 2006
    #5
  6. Thomas 'PointedEars' Lahn

    Doc wrote:

    > But the escape() or encodeURIComponent() functions convert my
    > characters to a '%..' format, and I can't use that as the data is
    > accessed by other means (extracted to XML format, ...), and I can't
    > do the unescape() or decodeURIComponent().


    Then don't. Where is the problem?

    > The 'é' is stored as 'é', it is only some special characters that are
    > stored in a '&#nnnn;' format, and the aim is to be able to store all
    > these special characters (such as square root or infinity or ...).


    Use a Unicode Transformation Format instead of US-ASCII, ISO-8859-x, or
    Windows-125x.
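
    If the data is stored as UTF-8, no character references are needed, but
    a column limit may then be counted in bytes rather than characters. A
    small sketch of both counts, assuming a current browser or Node.js where
    TextEncoder is available; the function names are made up:

```javascript
// Hypothetical helper: number of bytes the text occupies in UTF-8.
function utf8ByteLength(text) {
  return new TextEncoder().encode(text).length;
}

// Hypothetical helper: number of Unicode code points, which can differ
// from text.length for characters outside the Basic Multilingual Plane.
function codePointLength(text) {
  return Array.from(text).length;
}

console.log(utf8ByteLength("\u00e9"));   // é -> 2 bytes
console.log(utf8ByteLength("\u221a"));   // √ -> 3 bytes
console.log(utf8ByteLength("\u20ac"));   // € -> 3 bytes
console.log(codePointLength("\u221a"));  // 1 character
```

    Whether the database limit applies to bytes or to characters depends on
    the database and column type, so the server should still enforce it.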

    > The characters stored in this format don't need any conversion when
    > rendering the pages or creating the XML document.


    As I said, dependencies on the output should be avoided when storing
    information in a database. For example, do not store "&#8730;"; store
    "√" instead.

    > I'd just like to find a function to convert the 'special' characters to
    > this format so I can count them as they are to be stored in the DB.


    This should be done server-side, not client-side. Are you using server-side
    J(ava)Script?
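
    If a client-side estimate is wanted in addition to the server-side
    check, the conversion Doc asks about could be sketched as below. It
    assumes that characters up to U+00FF are stored as-is and everything
    else as a decimal numeric character reference; the real rule is decided
    by the browser and the server (browsers commonly treat iso-8859-1 as
    windows-1252, so a character like € may be handled differently). The
    names toStoredForm and storedLength are made up.

```javascript
// Hypothetical conversion: keep characters up to U+00FF unchanged and
// replace all others with a decimal numeric character reference.
function toStoredForm(text) {
  let out = "";
  for (const ch of text) {            // iterates by code point, not code unit
    const codePoint = ch.codePointAt(0);
    out += codePoint > 0xff ? "&#" + codePoint + ";" : ch;
  }
  return out;
}

// Hypothetical helper: length of the text as it would be stored
// under the assumption above.
function storedLength(text) {
  return toStoredForm(text).length;
}

console.log(toStoredForm("\u221a"));     // √ (U+221A) -> "&#8730;"
console.log(storedLength("a\u221ab"));   // "a&#8730;b" -> 9
console.log(storedLength("\u00e9"));     // é stays as é -> 1
```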

    > Is my encoding wrong?


    Maybe. Note that the encoding declared specifies primarily how the content
    is encoded, not what characters are allowed to be displayed. The HTML
    Document Character Set used for character references (like &#8730;) is the
    Universal Character Set (ISO/IEC 10646), which is character-by-character
    equivalent to Unicode 3.0.

    > I don't understand the difference between the HTTP header and the content
    > type, I thought it was the same thing... Well I thought <meta
    > http-equiv="Content-Type" content="text/html; charset=iso-8859-1"> was the
    > HTTP header and defined the encoding...


    No, it is not. And it does not, unless the `charset' label is missing
    from the Content-Type HTTP header (which is recommended against anyway).
    Interpretation of meta[http-equiv] elements is not mandatory as per
    HTML 4.01, so this is not an interoperable approach.
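
    The precedence described above can be illustrated with a deliberately
    simplified sketch. The function names are made up, and real browsers
    consider more inputs (byte order marks, user overrides, sniffing), so
    this only models the header-before-meta rule:

```javascript
// Extract the charset parameter from a Content-Type value, or null
// when the value is missing or carries no charset.
function charsetOf(contentType) {
  const match = /charset\s*=\s*"?([^\s;"]+)/i.exec(contentType || "");
  return match ? match[1].toLowerCase() : null;
}

// Hypothetical model: the HTTP header's charset wins; the meta element's
// content attribute is consulted only when the header declares none.
function effectiveCharset(httpContentType, metaContent) {
  return charsetOf(httpContentType) || charsetOf(metaContent);
}

const meta = "text/html; charset=iso-8859-1";
console.log(effectiveCharset("text/html; charset=UTF-8", meta)); // "utf-8"
console.log(effectiveCharset("text/html", meta));                // "iso-8859-1"
```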

    > Can you explain what you meant PointedEars please?


    <URL:http://en.wikipedia.org/wiki/HTTP>

    Please take heed of

    <URL:http://jibbering.com/faq/faq_notes/pots1.html>
    <URL:http://www.safalra.com/special/googlegroupsreply/>

    with your next posting.


    PointedEars
     
    Thomas 'PointedEars' Lahn, Mar 29, 2006
    #6
