ASP converts Unicode Chars to HTML entities?

Discussion in 'ASP General' started by Beat Richli, Sep 5, 2005.

  1. Beat Richli

    Beat Richli Guest

    Hello

    I have the following problem with ASP (using Interdev, Win2003 Server): if a
    special character is entered in a textbox, ASP or the client browser (IE 6)
    seems to convert this character into an HTML entity.
    E.g. characters from this site:
    http://unicode.e-workers.de/kyrillisch.php

    come back as e.g. &#1051 . I'm not sure where exactly this happens. It
    doesn't happen on ASP.NET sites, though. The top of those documents looks
    like this:

    <!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
    <html lang="de" >
    <head>
    <meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf8">
    <title>Beorda AG - Account Detail</title>
    </head>
    ......


    Does anybody know how to avoid this? Basically I'll need a UTF-8 postback, I
    guess. Or I convert the entities to Unicode before storing the values in the
    database.

    Thanks for your hints

    Beat
     
    Beat Richli, Sep 5, 2005
    #1

  2. Beat Richli wrote:


    > I have the following problem with ASP (using Interdev, Win2003 Server): if a
    > special character is entered in a textbox, ASP or the client browser (IE 6)
    > seems to convert this character into an HTML entity.
    > E.g. characters from this site:
    > http://unicode.e-workers.de/kyrillisch.php
    >
    > come back as e.g. &#1051 . I'm not sure where exactly this happens.


    Browsers tend to do that if the encoding is not properly declared and
    has to be guessed, or if the encoding is properly declared but the
    characters the user enters are not representable in it. See
    <http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html>
    If, for instance, your HTML document is encoded as ISO-8859-1 and a
    user enters the character "Л" in a form, browsers pass it on as
    %26%231051%3B. ASP decodes %26 as the character '&', %23 as the
    character '#', leaves the digits 1051 unchanged, and decodes %3B as
    the character ';', so the string
    '&#1051;'
    ends up in your ASP Request.Form or Request.QueryString.
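
As a rough illustration of the decoding step described above, here is a minimal Python sketch (Python is used only for demonstration; it is not what ASP runs internally). It percent-decodes the submitted value and then, as one possible workaround before storing the value, converts the character reference back into the real character. The variable names are made up for this example.

```python
import html
from urllib.parse import unquote

# What the browser sends when 'Л' is not representable in the form's encoding
submitted = "%26%231051%3B"

# Percent-decoding gives a literal HTML character reference, not the letter
decoded = unquote(submitted, encoding="iso-8859-1")
print(decoded)

# One possible clean-up before storing: resolve the reference to the character
restored = html.unescape(decoded)
print(restored)
```

Note that resolving references on the server cannot tell a reference the browser generated from one the user typed literally, so fixing the page encoding is the more robust route.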

    So one way to make sure the browser submits a properly encoded
    character instead of an HTML character reference is to author the
    HTML documents in UTF-8 and declare that properly, at least with a
    <meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
    in the head of the document or, even better, by configuring the HTTP
    server to send the HTTP response header
    Content-Type: text/html; charset=UTF-8
    That way the browser will encode, for instance, an entered 'Л' as
    '%D0%9B'.
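
The two outcomes can be compared side by side with a small Python sketch. It only simulates the browser behaviour described in this post, under the assumption that the fallback is a decimal character reference; it does not call any browser or ASP API, and the variable names are hypothetical.

```python
from urllib.parse import quote

char = "\u041b"  # 'Л', CYRILLIC CAPITAL LETTER EL

# Page declared as UTF-8: the character is sent as its two UTF-8 bytes
utf8_form = quote(char, encoding="utf-8")
print(utf8_form)

# Page declared as ISO-8859-1: the character has no byte in that encoding
try:
    char.encode("iso-8859-1")
    representable = True
except UnicodeEncodeError:
    representable = False
print(representable)

# Simulated fallback: percent-encode a decimal character reference instead
fallback = quote("&#{};".format(ord(char)), safe="")
print(fallback)
```

The first value is what you want to see arriving at the server; the last one is the symptom from the original question.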

    ASP pages can also be authored in UTF-8 by saving them in that encoding
    and declaring the corresponding code page 65001, e.g.
    <%@ Language="VBScript" CodePage="65001" %>

    --

    Martin Honnen --- MVP XML
    http://JavaScript.FAQTs.com/
     
    Martin Honnen, Sep 7, 2005
    #2

  3. Beat Richli

    Beat Richli Guest

    Thanks a lot, Martin. I will check the site again using this information.

    Regards
    Beat
     
    Beat Richli, Sep 7, 2005
    #3
