Strange diacritics behaviour (IE issue)

Tom de Neef

My app is browser based. It starts with an HTML page displaying a family
structure. Click on a name and that person becomes the focus of the
regenerated page. The regeneration is done with JavaScript.
A name in the original page may contain a diacritic (u-umlaut, e-grave,
etc.). They appear in the text as ANSI characters, not as &uuml; codes. They
come out OK.
But two clicks further, the diacritics have changed into little squares.
I have experimented and simplified the code as follows (still showing this
behaviour):
Original html:

<HTML>
<HEAD><SCRIPT type="text/javascript" SRC="GGshowAll.js"></SCRIPT></HEAD>
<BODY>
<SCRIPT type="text/javascript">
document.write('<HTML><BODY>'+ShowNearbyFamily(2070)+'</BODY></HTML>')
</SCRIPT>
</BODY></HTML>

Function ShowNearbyFamily(index) is defined in the external 'js' file. The
OnClick in the file's code which must result in the redo of the page calls
it again via Refocus(index):

function Refocus(index)
{ var txt =
'<HTML>'+
'<HEAD><SCRIPT type="text/javascript" SRC="GGshowAll.js"></SCRIPT></HEAD>'+
'<BODY>'+
ShowNearbyFamily(index)+
'</BODY></HTML>'
document.write(txt)
document.close()
}

A first click is in the original html. Its result is still OK. The next
click is in a page produced by the script. Its result is without the correct
diacritics in IE. It is OK in FF.
The (IE) source of the generated page shows a ? for any diacritic.
I do not understand this. In all cases the output is generated by the same
function (ShowNearbyFamily). It just takes the data from a list. I have no
idea where to start looking for a solution. I tried extras like charset
definitions in the meta tags and document type declarations before the HTML tag.
Any ideas will be very much appreciated.
Tom
 
VK

My app is browser based. It starts with a html page displaying a family
structure. Click on a name and that person will be the focus in the
re-generated page. The re-generation is with JavaScript.
A name in the original page may contain a diacritic (u - umlaut, e - grave,
etc. They appear in the text as ANSI characters, not as &uuml; codes.) They
come out OK.
But two clicks further, the diacritics have changed into little squares.
I have experimented and simplified the code as follows (still showing this
behaviour):
Original html:

<HTML>
<HEAD><SCRIPT type="text/javascript" SRC="GGshowAll.js"></SCRIPT></HEAD>
<BODY>
<SCRIPT type="text/javascript">
document.write('<HTML><BODY>'+ShowNearbyFamily(2070)+'</BODY></HTML>')
</SCRIPT>
</BODY></HTML>

OK, but who provides the Content-Type for script-generated pages?
The server cannot do it, so it is up to you:

document.write('<html>'.concat(
'<meta http-equiv="Content-Type" content="text/html;',
'charset=iso-8859-1">', theRestOfContent));
 
Thomas 'PointedEars' Lahn

Tom said:
My app is browser based. It starts with a html page displaying a family
structure. Click on a name and that person will be the focus in the
re-generated page. The re-generation is with JavaScript.
A name in the original page may contain a diacritic (u - umlaut, e - grave,
etc. They appear in the text as ANSI characters, not as &uuml; codes.)

Despite continued Microsoft misnaming, there is no such thing as an "ANSI
character", as there is also no "ANSI character set". I have explained that
here before, search the newsgroup or the Web.
They come out OK.
But two clicks further, the diacritics have changed into little squares.
I have experimented and simplified the code as follows (still showing this
behaviour):
Original html:

<HTML>
<HEAD><SCRIPT type="text/javascript" SRC="GGshowAll.js"></SCRIPT></HEAD>
<BODY>
<SCRIPT type="text/javascript">
document.write('<HTML><BODY>'+ShowNearbyFamily(2070)+'</BODY></HTML>')
</SCRIPT>
</BODY></HTML>

Both your generating and your generated markup would be invalid, so strange
behaviour should come as a surprise.

http://validator.w3.org/

Besides:
[...]
function Refocus(index)
{ var txt =
'<HTML>'+
'<HEAD><SCRIPT type="text/javascript" SRC="GGshowAll.js"></SCRIPT></HEAD>'+
'<BODY>'+
ShowNearbyFamily(index)+
'</BODY></HTML>'
document.write(txt)
document.close()
}

this generates a temporary (invalid) document that lacks all information
about character encoding, while replacing the existing one (in viewport and
memory). With this approach, which is not recommended, there is only one
thing you can try: generate the following element.

<meta http-equiv="Content-Type" content="text/html; charset=...">

Where the ellipsis has to be replaced with the designation of the correct
encoding.
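
As a rough sketch of that suggestion, the markup string could be assembled by a small helper that puts the meta element first in the head. The helper name buildPage and its charset parameter are hypothetical and not part of the original code; the value 'iso-8859-1' in the usage comment is only the one suggested earlier in this thread and may not match the actual encoding of the data.

```javascript
// Hypothetical helper: builds the replacement document as a string,
// declaring the character encoding before anything else in the head.
// Both parameter names are assumptions made for this illustration.
function buildPage(bodyHtml, charset) {
  return '<HTML><HEAD>' +
    '<meta http-equiv="Content-Type" content="text/html; charset=' +
    charset + '">' +
    // The slash is escaped so the string is also safe inside an inline script.
    '<SCRIPT type="text/javascript" SRC="GGshowAll.js"><\/SCRIPT>' +
    '</HEAD><BODY>' + bodyHtml + '</BODY></HTML>';
}

// Possible use inside Refocus(index), in a browser:
//   document.write(buildPage(ShowNearbyFamily(index), 'iso-8859-1'));
//   document.close();
```

Whether the browser honours a meta declaration in a document produced by document.write() is a separate question; this only shows where the element would go.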


PointedEars
 
Thomas 'PointedEars' Lahn

Thomas said:
Both your generating and your generated markup would be invalid, so strange
behaviour should come as a surprise.

Should *not*. (This is what happens when one rephrases a sentence and loses
track :))


PointedEars
 
Tom de Neef

Thomas 'PointedEars' Lahn said:
Both your generating and your generated markup would be invalid, so strange
behaviour should not come as a surprise.

[...]
function Refocus(index)
{ var txt =
'<HTML>'+
'<HEAD><SCRIPT type="text/javascript" SRC="GGshowAll.js"></SCRIPT></HEAD>'+
'<BODY>'+
ShowNearbyFamily(index)+
'</BODY></HTML>'
document.write(txt)
document.close()
}

this generates a temporary (invalid) document that lacks all information
about character encoding, while replacing the existing one (in viewport and
memory). With this not recommended approach, there is only one thing you can
try: generate the following element.

<meta http-equiv="Content-Type" content="text/html; charset=...">

Where the ellipsis has to be replaced with the designation of the correct
encoding.

Thank you.
In the full code I had a <!DOCTYPE ... transitional..> before the <HTML> tag
and a content-type META tag defining the charset in the <HEAD>. I left these
out to get to the simplest code still showing the ugly behaviour. By doing
so I confused the issue, for which I am sorry. But even with the charset
defined (I tried several charsets) the behaviour stays the same.

Without further suggestions, the way out seems to be to use a conversion
table to change all characters with diacritics into their corresponding HTML
code (like "&Egrave;").

Tom
 
Thomas 'PointedEars' Lahn

Tom said:
Without further suggestions the way out seems to be to use a conversion
table to change all diacritics chars into their corresponding html code
(like "&Egrave;" ).

It is certainly better to convert the characters to their numeric character
references (NCRs) than to the corresponding character entity references,
since the latter may not even exist or may not be fully supported. Such a
method has been posted here, by others and me, several times before.
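
For readers who cannot find those earlier postings, here is a minimal sketch of such a conversion. It is not the method posted before; the function name toNumericCharRefs is made up for this example. It replaces every non-ASCII character with a decimal NCR and combines surrogate pairs into a single code point, so no lookup table is needed.

```javascript
// Hypothetical helper: converts every non-ASCII character in a string
// to a decimal numeric character reference such as &#252;
function toNumericCharRefs(text) {
  return String(text).replace(
    // Match a surrogate pair first, otherwise any single non-ASCII code unit.
    /[\uD800-\uDBFF][\uDC00-\uDFFF]|[^\x00-\x7F]/g,
    function (ch) {
      var code = ch.charCodeAt(0);
      if (ch.length === 2) {
        // Combine high and low surrogate into one code point.
        code = (code - 0xD800) * 0x400 +
               (ch.charCodeAt(1) - 0xDC00) + 0x10000;
      }
      return '&#' + code + ';';
    }
  );
}

// Example: the u-umlaut is written as an escape so the source stays ASCII.
var encoded = toNumericCharRefs('M\u00FCller'); // 'M&#252;ller'
```

The result contains only ASCII characters, so it could be passed to document.write() without depending on how the generated document's encoding is detected. ASCII characters, including markup such as < and &, are passed through unchanged.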


HTH

PointedEars
 
