Incorrect HttpWebResponse.CharacterSet

Martin Honnen

Leon_Amirreza said:
My Page contains this:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


I used this code to retrieve the page:

HttpWebResponse webresp = (HttpWebResponse)(webrq.GetResponse());

The values of some properties of webresp are as follows:

ContentType = "text/html"
ContentEncoding = ""
CharacterSet = "ISO-8859-1"

Is that a public URL? If so, post it so that we can simply check the HTTP
headers. Any HTTP Content-Type response header is more important than
that meta element, in particular for the .NET HttpWebResponse, which will
hardly have any code to look at HTML meta elements. A browser might look
at the meta element, but a generic HTTP API like .NET's
HttpWebRequest/HttpWebResponse will not.
 
Martin Honnen

Leon_Amirreza said:
I used the debugger to list the

HttpWebResponse.Headers

It is as follows:

{Accept-Ranges: bytes
Content-Length: 8609
Content-Type: text/html
Date: Sun, 24 Sep 2006 17:32:55 GMT
ETag: "3a51249e792c61:900"
Last-Modified: Sun, 18 Jun 2006 14:55:48 GMT
Server: Microsoft-IIS/5.1
X-Powered-By: ASP.NET

That means the server does not say anything about the encoding/charset
of the response. I am not sure whether the CharacterSet property takes
on some default; the documentation does not say:
<http://msdn.microsoft.com/library/d...mNetHttpWebResponseClassCharacterSetTopic.asp>

As said, HttpWebResponse probably does not have any support for
checking the meta element of an HTML document it receives. If that
text/html document is on your server, then you might want to have the
server send a charset parameter in the Content-Type header.
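
If the server cannot be changed, the client can do the lookup itself. The following is only a minimal sketch of that idea, not anything HttpWebResponse provides: the names CharsetResolver and Resolve, the 1024-byte scan limit, and the regular expressions are all assumptions made for illustration, and a regex is a simplification of real HTML parsing. It prefers a charset parameter from the Content-Type header, then looks for one in a meta element near the start of the body, and otherwise returns a caller-supplied fallback.

```csharp
using System;
using System.Text;
using System.Text.RegularExpressions;

// Hypothetical helper, not part of the .NET class library.
static class CharsetResolver
{
    // Matches charset=NAME in a Content-Type value or inside a meta element.
    static readonly Regex CharsetPattern = new Regex(
        @"charset\s*=\s*[""']?([A-Za-z0-9_\-]+)", RegexOptions.IgnoreCase);

    // Returns the charset name from the HTTP header if present, else from a
    // meta element in the first bytes of the body, else the fallback.
    public static string Resolve(string contentTypeHeader, byte[] body, string fallback)
    {
        Match m = CharsetPattern.Match(contentTypeHeader ?? "");
        if (m.Success) return m.Groups[1].Value;

        // ASCII is enough to read the markup of a meta element.
        int count = Math.Min(body.Length, 1024); // assumed scan limit
        string head = Encoding.ASCII.GetString(body, 0, count);

        foreach (Match meta in Regex.Matches(head, @"<meta[^>]*>", RegexOptions.IgnoreCase))
        {
            m = CharsetPattern.Match(meta.Value);
            if (m.Success) return m.Groups[1].Value;
        }
        return fallback;
    }
}
```

With the variables from the original post, the name returned by CharsetResolver.Resolve(webresp.ContentType, data, "ISO-8859-1") could be passed to Encoding.GetEncoding in place of webresp.CharacterSet. The header is checked first because, as said above, it takes precedence over the meta element.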
 
Leon_Amirreza

My Page contains this:

<meta http-equiv="Content-Type" content="text/html; charset=utf-8">


I used this code to retrieve the page:

HttpWebResponse webresp = (HttpWebResponse)(webrq.GetResponse());

The values of some properties of webresp are as follows:

ContentType = "text/html"
ContentEncoding = ""
CharacterSet = "ISO-8859-1"

Can anyone say why CharacterSet is ISO-8859-1 instead of utf-8?
How can I fix this?

Info:
The page is served by IIS (on WinXP SP2).
IE shows the page correctly, but with

textBoxHTMLSource.Text =
Encoding.GetEncoding(webresp.CharacterSet).GetString(data, 0, len);

the textbox shows the characters incorrectly (data is received through
webresp.GetResponseStream()).
 
Leon_Amirreza

No, unfortunately it's not a public URL; it's on my WinXP machine.
I used the debugger to list the

HttpWebResponse.Headers

It is as follows:

{Accept-Ranges: bytes
Content-Length: 8609
Content-Type: text/html
Date: Sun, 24 Sep 2006 17:32:55 GMT
ETag: "3a51249e792c61:900"
Last-Modified: Sun, 18 Jun 2006 14:55:48 GMT
Server: Microsoft-IIS/5.1
X-Powered-By: ASP.NET

}
 
Joerg Jooss

Thus wrote Leon_Amirreza,
I used an ASP.NET line of code to have the server send a charset in
the Content-Type, but according to the IIS documentation these two lines
of code are equivalent:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html;
CHARSET=windows-1251">

<% Response.Charset = "windows-1252" %>

That's plain wrong. First of all, Windows-1251 and Windows-1252 are different
encodings (that's a silly error in the IIS docs).

But more importantly, HTTP doesn't care about any HTML content, so specifying
a META tag works for clients that parse HTML (i.e. web browsers), but not
for HTTP intermediaries that process only HTTP headers. Setting a META tag
has no effect on HTTP headers in ASP.NET.

Next, all but forget about using the HttpResponse.Charset property -- that's
ASP programming.

The proper way of specifying a character encoding in ASP.NET is to either
1) use the default response encoding specified in your web.config's <globalization /> element, UTF-8 by default
2) declare a ResponseEncoding in your page directive: <%@ Page Language="C#" ResponseEncoding="ISO-8859-1" %>
3) declare a CodePage in your page directive: <%@ Page Language="C#" CodePage="28591" %>
4) programmatically set the HttpResponse.ContentEncoding property: Response.ContentEncoding = Encoding.GetEncoding(28591);

The only difference between 2) and 3) is that 2) relies on character encoding
(i.e. IANA) names, whereas 3) uses numeric Win32 codepage identifiers as
per http://msdn.microsoft.com/library/default.asp?url=/library/en-us/intl/unicode_81rn.asp

All of the above implicitly set HttpResponse.Charset to its appropriate
value (with one caveat -- see below). The reverse is *not* true:
setting HttpResponse.Charset has no effect on the actual encoding applied
to the output stream. It's just a string that ends up in the Content-Type
header. Thus, you should never set Charset in your code (but see below**).
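
To illustrate why a charset label that disagrees with the actual bytes causes trouble, here is a small console sketch. It does not use ASP.NET at all; CharsetLabelDemo and DecodeAs are hypothetical names, and the sample string is made up. It encodes text as UTF-8 and then decodes it the way a client would, once trusting a correct label and once trusting a wrong one.

```csharp
using System;
using System.Text;

// Hypothetical demo class; stands in for "server writes bytes, client reads label".
static class CharsetLabelDemo
{
    // Decodes the body using the charset name a client read from Content-Type.
    public static string DecodeAs(byte[] body, string charsetLabel)
    {
        return Encoding.GetEncoding(charsetLabel).GetString(body);
    }

    static void Main()
    {
        // The bytes on the wire are UTF-8, whatever the label claims.
        byte[] body = Encoding.UTF8.GetBytes("caf\u00E9");

        // A truthful label round-trips; a wrong label yields garbled text.
        Console.WriteLine(DecodeAs(body, "utf-8") == "caf\u00E9");      // True
        Console.WriteLine(DecodeAs(body, "ISO-8859-1") == "caf\u00E9"); // False
    }
}
```

The second call turns the two UTF-8 bytes of the accented character into two separate ISO-8859-1 characters, which is the kind of garbling seen when the label and the encoding differ.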

Moral of the story: ASP.NET sets the appropriate Content-Type header that
includes a charset attribute, unless you break stuff ;-)

For static content (i.e. content that is served directly by IIS) none of the
above applies. See http://www.w3.org/International/O-HTTP-charset for how to
set the appropriate Content-Type in this case.

**
When to set Charset: There's one esoteric case I'm aware of when you *do*
want to set HttpResponse.Charset in your code, and that's when you're using
UTF-16BE as response encoding. In this case, ASP.NET uses an invalid IANA
name, which is not recognized by all browsers. It sets "unicodeFFFE" instead
of "utf-16be".

Cheers,
 
Leon_Amirreza

I used an ASP.NET line of code to have the server send a charset in the
Content-Type, but according to the IIS documentation these two lines of code
are equivalent:

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=windows-1251">

<% Response.Charset = "windows-1252" %>

Title : Setting the Code Page for String Conversions

url: http://localhost/iishelp/iis/htm/asp/eadg6e7n.htm

But using the meta tag does not seem to send a charset in the Content-Type
HTTP header!
Is this a problem in my web server configuration, or are these two lines of
code not actually the same?!
 
