trying to send 8 bit chars under IIS6

Mark J. McGinty

Greets,

Part of the content of one of our web pages uses wingdings and Chr(239)
through Chr(242) (which are little arrow outlines, though that's not really
important.)

It worked just fine in Windows 2000 Server, but now under Server 2003 it
seems that characters above 127 get converted somehow, and our code no
longer produces the desired effect.

Does anyone know how to make it send our content without modification, or
how to encode it in a way that it makes it out to the browser with the
intended character value (as opposed to some thoroughly useless conversion
to a 7 bit value)?

tia,
Mark
 
David Wang [Msft]

IIS6 itself does not do any such conversion, so I am not certain the issue
has to do with 8bit characters.

Page Frameworks like ASP or the web browser (based on
Content-Encoding/Language hints from the response) can do such conversion,
but those are parameters you need to control if you wish your page to be
consistently interpreted.

Can you describe what codepage your ASP page is configured to be interpreted
as, and whether you send any additional response headers that may affect how
the browser interprets your response?

Because in order for the little arrow outlines to display properly, these
two things have to happen:
1. The response entity body must contain characters whose character code is
239-242
2. The browser must choose a code page (based on response headers) which
selects a font which maps little arrow outlines to character codes 239-242

You must ensure those two things happen; neither IIS nor the browser can
make it automagically happen.

--
//David
IIS
http://blogs.msdn.com/David.Wang
This posting is provided "AS IS" with no warranties, and confers no rights.
//
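
For readers following along, here is a minimal Python sketch of why the first condition is sensitive to the code page used for output. It assumes the server's ANSI code page is Windows-1252 (the thread does not say which one was in use), so that character code 239 is U+00EF. The variable names are only for illustration.

```python
# Assumption: the ANSI code page is Windows-1252, so character code 239
# corresponds to U+00EF (the letter i with a diaeresis).
ch = bytes([239]).decode("cp1252")

# The same character produces different bytes depending on the output encoding.
as_ansi = ch.encode("cp1252")                     # a single byte, 0xEF
as_utf8 = ch.encode("utf-8")                      # two bytes, 0xC3 0xAF
as_ascii = ch.encode("ascii", errors="replace")   # not representable, becomes "?"

for label, data in [("cp1252", as_ansi), ("utf-8", as_utf8), ("ascii", as_ascii)]:
    print(label, data.hex(" "))
```

Only the first encoding puts the single byte value 239 on the wire. If anything between the script and the socket re-encodes the string with a different code page, a font such as Wingdings will be handed a different character code than the one the script wrote.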
 
Mark J. McGinty

David Wang said:
IIS6 itself does not do any such conversion, so I am not certain the issue
has to do with 8bit characters.

Page Frameworks like ASP or the web browser (based on
Content-Encoding/Language hints from the response) can do such conversion,
but those are parameters you need to control if you wish your page to be
consistently interpreted.

Thanks for your reply.

It is definitely a server side issue, I looked at the response using
Ethereal, before it reaches the browser. I also tried sending characters
with values over 128 using the default font, as a reality check. Characters
that use the 8th bit are being transformed somewhere in between the VBS/ASP
script -- which does a Response.Write(Chr(239)) -- and the requesting
client's TCP socket.

Can you describe what codepage your ASP page is configured to be interpreted
as, and whether you send any additional response headers that may affect how
the browser interprets your response?

It was implicit, we also tried explicitly setting it to ANSI and to UTF-8.
(Yes I saved the script source files as UTF-8 after adding the @CodePage
directive.) We also added the DTD that VS7 generates for new HTML
documents. The content type is ASP's default, HTML Document according to
IE. Ethereal confirms it, the type is text/html.

Because in order for the little arrow outlines to display properly, these
two things have to happen:
1. The response entity body must contain characters whose character code is
239-242

Therein lies a big part of the problem, because those character values are
not being sent as written to the response context.

2. The browser must choose a code page (based on response headers) which
selects a font which maps little arrow outlines to character codes 239-242

We set the font via CSS for just the elements that display these characters.
We surely do not want the entire page to use wingdings (it would be
extremely difficult to read that way.) Wingdings is installed by default on
all Windows machines since Windows 95 iirc.

Further, as I stated, the value of the characters in question has been
altered by the time the content makes it to the browser. The page *is*
displaying wingdings characters where we expect it to, they just are not the
correct wingdings characters... because their value has been altered or
transformed in some strange way.

You must ensure those two things happen; neither IIS nor the browser can
make it automagically happen.

I surely didn't expect that from either of them, and as I stated, this same
code worked perfectly in Win2K/IIS5. It's not something we set out on a
lark to do, and wistfully hoped it would miraculously happen; we have
working code already deployed on the next-to-latest major release of IIS. So I
suspect it has something to do with IIS6 MIME type handling "enhancements".

In practice I've all but decided to just say the hell with it, and replace
the wingding characters with images. Yes it will add a few KB to the size
of the content, and a number of extra requests will be generated by the page
as it renders (even cached images generate an HTTP request per instance of
the image, delivering them inline as individual characters incurs much less
client request overhead) but nobody with broadband will ever know the
difference.

Even so I'd love to get to the bottom of this, because I saw a loosely
related issue (involving XML) in another NG. If it alters any XML, it won't
be so easy to work around.


-Mark
 
Mark Schupp

Create the simplest page that you can which reproduces the problem and post
the entire code here.
 
Mark J. McGinty

Mark Schupp said:
Create the simplest page that you can which reproduces the problem and
post the entire code here.

Strangely, when I tried to do that, the problem disappeared. Worse, I
noticed that the same characters were being displayed correctly in another
frame of the same frameset!

I did some more research and found an explanation (though that it's
implemented like this is a mind-blower to me):

--------------
Literal strings in a script are still encoded by using @CodePage (if
present) or the AspCodePage metabase property value (if set), or the system
ANSI code page. If you set Response.CodePage or Session.CodePage explicitly,
do so before sending nonliteral strings to the client. If you use literal
and nonliteral strings in the same page, make sure the code page of
@CodePage matches the code page of Response.CodePage,
--------------

The above is an excerpt from this page:

http://msdn.microsoft.com/library/d...html/268f1db1-9a36-4591-956b-d7269aeadcb0.asp

We haven't been setting any codepage. I had thought we tried to set it as a
possible solution using @codepage and saving the source as utf-8, but I see
I didn't even come close, there are included files, and I did not imagine
I'd *need* to set it elsewhere to make it effective.

So to cause my pages to be output as utf-8 I need all of this:

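<%@ CodePage=65001 Language="VBScript" %>
<%
' Encode dynamic (nonliteral) output as UTF-8 and name that charset in the
' Content-Type response header.
Response.CodePage = 65001
Response.CharSet = "utf-8"
%>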
[...]
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; CHARSET=utf-8">

AND the source file must be saved as utf-8, and any included files should be
saved as utf-8 also? That's straight nuts!

Also the doc contradicts itself, in one spot it says there can be only one
codepage (which seems reasonable) but in another it says, "...or the literal
strings are encoded differently from the nonliteral strings and display
incorrectly." If there was only one codepage, all output would be encoded
accordingly.

Unreal! And this is supposed to be an improvement over IIS5? It did not
seem to suffer from the same potential for ambiguity.

What's more, ANSI is perfectly capable of displaying 8 bit characters. So
it seems to me that IIS6 goes to way too much trouble trying to ascertain
the codepage, when all we really wanted was ANSI by default. In the end,
without having any explicit codepage set, it makes wrong assumptions and
incorrectly encodes some characters.

Another irony: I tried saving the source as Unicode, and it bitterly
complained... so all strings in VBS are Unicode, yet the script engine is
unable to read source files saved as Unicode? So how could Unicode be
native? Does it read ANSI source files and convert to Unicode?
Riiight.... Add one to the reasons I gotta call BS on the "Unicode Native to
VBS" myth.

So in the end I said to hell with it, and substituted some chars in webdings
that look close enough, and fit in 7 bits. It works.


Thanks for the reply,
Mark
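
The literal-versus-nonliteral mismatch described in the quoted documentation can be illustrated with a short Python sketch. It is only a model of the symptom, not of how ASP works internally. It assumes a source file saved as Windows-1252 that contains a literal character 239, while dynamic strings are encoded as UTF-8 (code page 65001). All names are hypothetical.

```python
# Assumption: the literal sits in the source file as one Windows-1252 byte,
# while the dynamic string is encoded with the response code page (UTF-8).
literal_in_file = "\u00ef".encode("cp1252")   # b"\xef"
dynamic_output = "\u00ef".encode("utf-8")     # b"\xc3\xaf"

# Both end up in the same response body.
body = literal_in_file + b" " + dynamic_output

# A client that trusts "charset=utf-8" cannot decode the literal byte.
seen_as_utf8 = body.decode("utf-8", errors="replace")

# A client that assumes Windows-1252 garbles the dynamic string instead.
seen_as_cp1252 = body.decode("cp1252")

print(repr(seen_as_utf8))
print(repr(seen_as_cp1252))
```

Whichever charset the client picks, one of the two copies of the character comes out wrong, which is why the documentation asks for @CodePage, Response.CodePage, and the saved file encoding to agree.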
 
