ASP converts Unicode Chars to HTML entities?

Beat Richli

Hello

I have the following problem with ASP (using Interdev, Win2003 Server): if a
special character is entered in a textbox, ASP or the client browser (IE 6)
seems to convert this character into an HTML entity.
E.g. characters on this site:
http://unicode.e-workers.de/kyrillisch.php

come back as e.g. &#1051; . I'm not sure where exactly this happens. It
doesn't happen on ASP.NET sites, though. The top of those documents looks
like this:

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html lang="de" >
<head>
<meta HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=utf8">
<title>Beorda AG - Account Detail</title>
</head>
......


Does anybody know how to avoid this? Basically I'll need a UTF-8 postback, I
guess. Or I convert the entities to Unicode before storing the values in the
database.

thanks for your hints

beat
 
Martin Honnen

Beat Richli wrote:

I have the following problem with ASP (using Interdev, Win2003 Server): if a
special character is entered in a textbox, ASP or the client browser (IE 6)
seems to convert this character into an HTML entity.
E.g. characters on this site:
http://unicode.e-workers.de/kyrillisch.php

come back as e.g. &#1051; . I'm not sure where exactly this happens.

Browsers tend to do that when the encoding is not properly declared and
has to be guessed, or when an encoding is properly declared but the
characters the user enters are not representable in it. See
<http://ppewww.ph.gla.ac.uk/~flavell/charset/form-i18n.html>
If, for instance, your HTML document is encoded as ISO-8859-1 and a user
enters the character "Л" in a form, browsers do indeed submit it as
%26%231051%3B. ASP then decodes %26 as the character '&', %23 as the
character '#', leaves the unencoded digits 1051 as they are, and decodes
%3B as the character ';', so you end up with the string
'&#1051;'
in your ASP Request.Form or Request.QueryString.
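
To make the decoding steps above concrete, here is a minimal sketch in Python rather than ASP (Python is only used for illustration, and the helper name can_encode is made up for this example). It checks that "Л" is not representable in ISO-8859-1, percent-decodes the submitted value the way the server would, and then turns the character reference back into the character, which is roughly what converting the entities before storing them would involve:

```python
import html
from urllib.parse import unquote


def can_encode(text, encoding):
    """Return True if every character of text is representable in encoding."""
    try:
        text.encode(encoding)
        return True
    except UnicodeEncodeError:
        return False


# "Л" is U+041B, i.e. decimal 1051; ISO-8859-1 has no slot for it.
latin1_ok = can_encode("\u041b", "iso-8859-1")

# The value the browser submits when it falls back to a character reference.
submitted = "%26%231051%3B"

# Percent-decoding gives the reference as literal text, not the character.
decoded = unquote(submitted)

# Resolving the numeric character reference recovers the original character.
restored = html.unescape(decoded)

print(latin1_ok, decoded, restored)
```

Note that resolving references after the fact cannot tell a browser-generated "&#1051;" apart from one the user actually typed, so fixing the page encoding is the cleaner route.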

Thus one way to make sure the browser submits a properly encoded
character, and not an encoded HTML character reference, is to author the
HTML documents in UTF-8 and declare that properly, at least with a
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8">
in the head of the document or, even better, by configuring the HTTP
server to send the HTTP response header
Content-Type: text/html; charset=UTF-8
That way the browser will, for instance, encode the entered 'Л' as
'%D0%9B'.
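
As a quick check of that value, again sketched in Python purely for illustration (not ASP): percent-encoding the UTF-8 bytes of 'Л' produces the two escaped bytes, and decoding them as UTF-8 gives the character back.

```python
from urllib.parse import quote, unquote

char = "\u041b"  # Л, CYRILLIC CAPITAL LETTER EL

# UTF-8 represents U+041B as the two bytes 0xD0 0x9B.
utf8_bytes = char.encode("utf-8")

# Form submission from a UTF-8 page percent-encodes those bytes.
encoded = quote(char, encoding="utf-8")

# Decoding with the same encoding round-trips to the original character.
decoded = unquote(encoded, encoding="utf-8")

print(utf8_bytes, encoded, decoded)
```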

ASP pages can also be authored in UTF-8 by saving them in that encoding
and indicating the corresponding code page 65001, e.g.
<%@ Language="VBScript" CodePage="65001" %>
 
Beat Richli

Thanks a lot, Martin. I will check the site again using this information.

greets
beat
 
