? in XMLHttpRequest.ResponseText

Elizabeth

I'm fetching some HTML files with XMLHttpRequest and dumping the
responseText into block elements; works fine except that single and double
quotes are being displayed as question marks (inside of a black diamond in
Firefox).

What's going on ? What is the workaround ? I've tried this:

divElement.innerHTML = x.responseText.replace(/\?/g, "'")

but it does nothing ... even if it did work it would not be distinguishing "
from '

Thanks for any help ...

-E-
 
TheBagbournes

Elizabeth said:
I'm fetching some HTML files with XMLHttpRequest and dumping the
responseText into block elements; works fine except that single and double
quotes are being displayed as question marks (inside of a black diamond in
Firefox).

It sounds like the server is telling the browser the wrong character set
in its response headers.

If you are in control of the server (ie, you're writing the
servlet/script which produces the text on the server), ensure that the
"Content-Type" header has the appropriate charset attribute.

e.g.

Content-Type: text/html; charset=UTF-8

or whatever the char set is.
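
On the script side, the header the server actually sent can be read with x.getResponseHeader("Content-Type"). A minimal sketch of pulling the charset out of such a value, assuming the usual "type; charset=name" form; charsetOf is a hypothetical helper name used only for illustration:

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type value.
// Returns null when no well-formed charset parameter is present.
function charsetOf(contentType) {
  var match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(contentType || "");
  return match ? match[1].toLowerCase() : null;
}

console.log(charsetOf("text/html; charset=UTF-8")); // utf-8
console.log(charsetOf("text/html"));                // null
```

If this reports a charset that differs from how the file was really saved, the decoded text will contain wrong or replacement characters.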
 
Elizabeth

It sounds like the server is telling the browser the wrong character set
in its response headers.

If you are in control of the server (ie, you're writing the
servlet/script which produces the text on the server), ensure that the
"Content-Type" header has the appropriate charset attribute.

The Content-Type charset header is UTF-8 ... that doesn't appear to be the
problem; I'm still baffled by it. Thanks for replying and please let me
know if you have any other ideas/suggestions ..

-E-
 
VK

Elizabeth said:
The Content-Type charset header is UTF-8 ... that doesn't appear to be the
problem; I'm still baffled by it. Thanks for replying and please let me
know if you have any other ideas/suggestions ..

My wild guess is that you're retrieving a page with "typography" quotes
in UTF-8 Embedded or Microsoft Embedded form.

Open the file and check that all quotes in the text are escaped
properly: &quot; and &#39;

If you see any "typography" quotes (that is, the left quote differs from
the right one) or some strange entities for characters like “ or ” then I
bet you've found the reason.

Not a JavaScript problem though.
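
A rough way to see the effect, assuming the file was saved by the word processor as Windows-1252 while being decoded as UTF-8 (the thread does not confirm the actual encoding). The byte values below are sample data, not taken from the real file:

```javascript
// "Hi" wrapped in curly double quotes, as Windows-1252 bytes:
// 0x93 and 0x94 are the opening and closing quotes in that encoding.
var cp1252Bytes = new Uint8Array([0x93, 0x48, 0x69, 0x94]);

// Decoded as UTF-8, the lone bytes 0x93 and 0x94 are invalid sequences,
// so each one is replaced with U+FFFD (the replacement character).
var asUtf8 = new TextDecoder("utf-8").decode(cp1252Bytes);

// The same quotes correctly encoded as UTF-8 take three bytes each
// and survive the round trip.
var utf8Bytes = new TextEncoder().encode("\u201CHi\u201D");
var roundTrip = new TextDecoder("utf-8").decode(utf8Bytes);

console.log(asUtf8 === "\uFFFDHi\uFFFD");    // true
console.log(utf8Bytes.length);               // 8
console.log(roundTrip === "\u201CHi\u201D"); // true
```

So the fix is on the file or server side: either re-save the file as UTF-8 or send a charset that matches how it was saved.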
 
Elizabeth

VK said:
My wild guess is that you're retrieving a page with "typography" quotes
in UTF-8 Embedded or Microsoft Embedded form.

Thanks. Your wild guess is correct. I guess someone had given me that text
from a word processor file. But why does it display correctly when I just
navigate to the HTML file in the browser and NOT when I GET it via
XMLHttpRequest and assign it to <block>.innerHTML ?

-E-
 
Elizabeth

Maybe because the HTML is parsed when you assign it with innerHTML.

maybe, but what piece of code is doing the parsing ? and why can't it
understand the "typography quotes" ?
 
VK

Elizabeth said:
text from a word processor file. But why does it display correctly when I
just navigate to the HTML file in the browser and NOT when I GET it via

For a number of reasons which would take a page to lay out, plus I
might switch on profanity - as it happens every time I need to
explain the Unicode.org production.

To save your ears :) I'll say only the basics. UTF-8 is not something to
display - it is an encoding to deliver multi-byte chars as single-byte
sequences to be transformed back into Unicode characters on the
recipient side. With direct server <=> browser stream UTF-8 is being
parsed by browser HTML parser which is aware of Unicode-16 as well as
of Unicode-24 and other Unicode.org sick fantasies. With any ajaxoid
this stream is being parsed by JavaScript Unicode-16 parser. That
leaves the hell's doors wide open, and not only for quotes.

As a side note: you still can have typography quotes in your script by
using Unicode-16 \u201C and \u201D escape sequences.
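
A small sketch of those escapes in use. The script source stays pure ASCII, so the result does not depend on how the .js file itself is encoded; the variable name is only for illustration:

```javascript
// Curly double quotes written as escape sequences in an ASCII-only source.
var quoted = "\u201CHello\u201D";

console.log(quoted.length);                      // 7
console.log(quoted.charCodeAt(0).toString(16));  // 201c
console.log(quoted.charCodeAt(6).toString(16));  // 201d
```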
 
VK

VK said:
As a side note: you still can have typography quotes in your script by
using Unicode-16 \u201C and \u201D escape sequences.

Also: the "?" sign you see on the page is not the question mark, so there
is no use looking for it with a RegExp. It's the Unicode replacement
character (U+FFFD), which is used in lieu of any unrecognised character.
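
A minimal sketch of the difference, using a made-up sample string in place of the real responseText:

```javascript
// Assumed sample: what responseText might hold after curly quotes were mis-decoded.
var text = "\uFFFDHi\uFFFD";

// U+FFFD is not the ASCII question mark (U+003F), so /\?/ matches nothing.
var unchanged = text.replace(/\?/g, "'");

// Matching U+FFFD directly does work, but the original character is already
// lost: opening quotes, closing quotes and any other bad byte all look alike.
var patched = text.replace(/\uFFFD/g, "'");

console.log(unchanged === text); // true
console.log(patched);            // 'Hi'
```

Because the information is gone by the time the script sees the string, patching it this way is only cosmetic; fixing the encoding at the source is the real cure.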
 
Thomas 'PointedEars' Lahn

VK said:
To save your ears :) I'll say only the basics. UTF-8 is not something to
display - it is an encoding to deliver multi-byte chars as single-byte
sequences to be transformed back into Unicode characters on the
recipient side.

Nonsense. The Unicode characters U+0000 to U+007F are encoded with one
UTF-8 code unit, therefore one byte.
[snipped further nonsense]
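
The byte counts can be checked with TextEncoder, which always encodes to UTF-8. A quick sketch; utf8Length is a hypothetical helper name used only here:

```javascript
var encoder = new TextEncoder(); // TextEncoder only produces UTF-8

// Hypothetical helper for this sketch: number of UTF-8 bytes in a string.
function utf8Length(s) {
  return encoder.encode(s).length;
}

console.log(utf8Length("A"));      // 1  (U+0041)
console.log(utf8Length("\u007F")); // 1  (U+007F, the last one-byte code point)
console.log(utf8Length("\u0080")); // 2
console.log(utf8Length("\u201C")); // 3  (left double quotation mark)
```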

Read <http://unicode.org/faq/>. NOW.


PointedEars
 
Elizabeth

VK said:
For a number of reasons which would take a page to lay out, plus I
might switch on profanity - as it happens every time I need to
explain the Unicode.org production.

To save your ears :) I'll say only the basics. UTF-8 is not something to
display - it is an encoding to deliver multi-byte chars as single-byte
sequences to be transformed back into Unicode characters on the
recipient side. With direct server <=> browser stream UTF-8 is being
parsed by browser HTML parser which is aware of Unicode-16 as well as
of Unicode-24 and other Unicode.org sick fantasies. With any ajaxoid
this stream is being parsed by JavaScript Unicode-16 parser. That
leaves the hell's doors wide open, and not only for quotes.


huh ... guess I don't really understand why a string assignment needs to be
parsed ... or do you consider "parse" to be a synonym for "decode" ? I
guess that might be a form of parsing ... hadn't really thought about it ...

you don't need to spare the profanity on my account; I'm not politically
correct ... it keeps you from learning things

-E-
 
