innerHTML = responseText generates invalid chars

rabbitrun

When I assign responseText to innerHTML, Firefox sometimes displays ? as a
placeholder for characters it cannot show, depending on the content. This
happens especially with single quotes in text that was written on a Windows
machine rather than a Unix machine.

Has anyone else come across this? What can I do to make the characters
display properly?
 
Doug Gunnoe

Can you post an example of the code giving you trouble?
 
Martin Honnen

Make sure the server sends an HTTP Content-Type response header with a
charset parameter, e.g.
Content-Type: text/plain; charset=Windows-1252

Windows-1252 is only an example; obviously you need to declare the encoding
that your text is actually encoded with.
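
One way to check what the server actually declares is to read the Content-Type header of the Ajax response before assigning the text to innerHTML. The following is a minimal sketch; the helper name `charsetFromContentType` is hypothetical, and in a browser the header string would come from `xhr.getResponseHeader("Content-Type")`.

```javascript
// Hypothetical helper: extract the charset parameter from a Content-Type
// header value. Returns null when no charset parameter is present.
function charsetFromContentType(headerValue) {
  if (!headerValue) {
    return null;
  }
  // Match ";charset=value", allowing optional whitespace and quotes.
  var match = /;\s*charset\s*=\s*"?([^";\s]+)"?/i.exec(headerValue);
  return match ? match[1] : null;
}

// In a browser the value would come from the XMLHttpRequest object, e.g.
//   charsetFromContentType(xhr.getResponseHeader("Content-Type"))
console.log(charsetFromContentType("text/plain; charset=Windows-1252"));
console.log(charsetFromContentType("text/html"));
```

If the helper returns null for the included files, the server is not declaring an encoding and the browser has to guess.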
 
Bart Van der Donck

Adding 'charset=' is always a good idea, but I would avoid the
proprietary Windows-1252, especially because the original poster
surmises it is a Windows/UNIX issue.

I suggest declaring the character set in the following places:
1. HTTP header of the program that outputs the data:
Content-Type: text/html; charset=ISO-8859-1
2. Inside the output file itself:
<?xml version="1.0" encoding="ISO-8859-1"?>
3. HTTP header of the receiving HTML-file:
Content-Type: text/html; charset=ISO-8859-1
4. As <meta> in the header of the receiving HTML-file:
<meta http-equiv="Content-Type"
content="text/html; charset=ISO-8859-1">

The original poster mentions that it "is especially true for single quotes
from text content". Since the single quote is an ASCII-safe character, it
makes me think that it has something to do with the DTD of the XML. '&apos;'
is one of the five predefined entities in XML, besides '&quot;', '&amp;',
'&lt;' and '&gt;'.
 
rabbitrun

Actually the single quote is some sort of backward single quote.
There are also other characters not displaying correctly after
innerHTML = responseText.

I'll try to cover all the steps you noted above.



 
Bart Van der Donck

It would be great to see both of your strings in order to compare them. I
think you're looking at a UTF-8 issue or (maybe a bit less likely) two
different encodings for the 128-255 code point range.

I guess you are not seeing Apostrophe (') or Grave Accent (`), since neither
can be problematic in a character set issue; both are ASCII. More likely it
is Acute Accent (´), which can be displayed differently under different
non-UTF-8 sets, since it lies above code point 127. If either party sends
or expects UTF-8, the byte for Acute Accent is supposed to be part of a
multibyte sequence and cannot stand on its own, which may be what causes
Firefox to show a question mark.
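
The diagnosis above can be reproduced in a few lines. This is a sketch that assumes an environment with the standard `TextDecoder` and `TextEncoder` globals (current browsers and Node.js; neither existed when this thread was written). The byte 0xB4 is an acute accent under ISO-8859-1, but on its own it is not a valid UTF-8 sequence, so a UTF-8 decoder substitutes U+FFFD, which browsers typically render as a question-mark glyph.

```javascript
// The single byte 0xB4 (decimal 180): an acute accent in ISO-8859-1.
var bytes = new Uint8Array([0xB4]);

// Decoded with a single-byte Western encoding, the byte maps to U+00B4.
var asLatin1 = new TextDecoder("iso-8859-1").decode(bytes);

// Decoded as UTF-8, a lone 0xB4 is an invalid sequence and is replaced
// by U+FFFD REPLACEMENT CHARACTER.
var asUtf8 = new TextDecoder("utf-8").decode(bytes);

// In UTF-8 the acute accent needs two bytes: 0xC2 0xB4.
var utf8Bytes = Array.from(new TextEncoder().encode("\u00B4"));

console.log(asLatin1 === "\u00B4");
console.log(asUtf8 === "\uFFFD");
console.log(utf8Bytes);
```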

Please see the following table:
http://www.hri.org/fonts/unix/iso8859-1.gif
Does the upper half of the characters display correctly and the lower
half incorrectly?
 
rabbitrun

Yes, it is the Acute Accent! Sorry, I didn't know what to call it other
than a backward single apostrophe (my grammar is elementary).

Yes, the upper half is fine and the lower half is incorrect.

Moreover, the ?'s come up where the content has possessive nouns that
use the Acute Accent.

My innerHTML content is coming from shtml files sitting on a Linux
box. Mainly, I am doing a client-side include via Ajax of generated
static shtml files.


Thank you so much for responding to my thread. I appreciate your
help.
 
Bart Van der Donck

It would be really helpful to see the output of your functions, but
the diagnosis should be clear. Just make sure you have the same
character set on both sides and it should be okay. It's probably a
UTF-8 versus ISO-8859-1 issue (a "new versus old" conflict). And I think
you want ISO-8859-1, since you want code point 180 to be displayed as
Acute Accent (´).

Consider the following test (saved under ANSI):

<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd">
<html>
<head>
<title>test</title>
<meta http-equiv="Content-Type"
content="text/html; charset=XXXXXXX">
</head>
<body>
´
</body>
</html>

Then replace XXXXXXX with each of the following in turn:

ISO-8859-1
UTF-8
KOI8-R
...

I think setting both resources to 'ISO-8859-1' should be the solution
(see the four points in my previous post).
 
Thomas 'PointedEars' Lahn

Bart said:
Martin said:
[...] What can I can do to make the characters display properly?
Make sure the server sends a HTTP Content-Type response header with a
charset parameter e.g.
Content-Type: text/plain; charset=Windows-1252

Windows-1252 is only an example, obviously you need to set the encoding
there that your text is encoded with.

Adding 'charset=' is always a good idea, but I would avoid the
proprietary Windows-1252, especially because the original poster
surmises it is a Windows/UNIX issue.

You overlook the fact that it has been registered with IANA for a while now.
So it is hardly proprietary, and (therefore) it is well supported by Web user
agents on Unices.

http://www.iana.org/assignments/character-sets
I suggest to add a character set at the following places:
1. HTTP header of the program that outputs the data:
Content-Type: text/html; charset=ISO-8859-1
Recommended.

2. Inside the output file itself:
<?xml version="1.0" encoding="ISO-8859-1"?>

Bad idea. Forces MSHTML (and probably other HTML user agents) into
Quirks/Compatibility Mode. The XML declaration is not required
for a valid X(HT)ML document.
3. HTTP header of the receiving HTML-file:
Content-Type: text/html; charset=ISO-8859-1
Recommended.

4. As <meta> in the header of the receiving HTML-file:
<meta http-equiv="Content-Type"
content="text/html; charset=ISO-8859-1">

Seldom necessary, can be harmful. Useless in properly served (X)HTML.


PointedEars
 
Bart Van der Donck

Thomas 'PointedEars' Lahn said:
You overlook the fact that it has been registered at IANA since a while.  So
it is hardly proprietary, and (therefore) it is well supported by Web user
agents on Unices.

http://www.iana.org/assignments/character-sets


Bad idea.  Forces MSHTML (and probably other HTML user agents) into
Quirks/Compatibility Mode.  The XML declaration is not required
for a valid X(HT)ML document.


Seldom necessary, can be harmful.  Useless in properly served (X)HTML.

?

You miss a simple, obvious fact. The intention is to make sure that the
character sets match; this means that they should be declared in the
places that I mentioned. Your arguments are not valid.

P.S. It is certainly not 'harmful', 'useless', 'bad idea'... to
specify the applicable charset in the HTML/XML file itself!
 
Thomas 'PointedEars' Lahn

[Full quote intended]
?

You miss a simple, obvious fact. The intention is to make sure that the
character sets match; this means that they should be declared in the
places that I mentioned. Your arguments are not valid.

Quite the contrary. To make sure that the character sets match, a simple
declaration alone will not help. It is instead important that the resources
are actually encoded as they are declared. Of all the things you have
mentioned, only the HTTP headers are important with properly served markup.
P.S. It is certainly not 'harmful', 'useless', 'bad idea'... to
specify the applicable charset in the HTML/XML file itself !

Yes, it is. One could get the silly notion that it actually means something
for properly served markup, and I have already pointed out what it means
with markup not suited to the user agents it is being served to.


PointedEars
 
Bart Van der Donck

Thomas said:
To make sure that the character sets match, a simple declaration alone
will not help.  

Yes they will. Convince yourself:

http://www.dotinternet.be/temp/t1.pl

No charset in HTTP-header, only set in <meta> [*].

Thomas said:
It is instead important that the resources are actually encoded
as they are declared.  Of all the things you have mentioned, only
the HTTP headers are important with properly served markup.

Yes, that rule must always be followed.

Yes, it is.  One could get the silly notion that it actually means
something for properly served markup, and I have already pointed out
what it means with markup not suited to the user agents it is being
served to.

My gas line has two valves: the main stop and a smaller stop in
between. You are saying that the smaller stop in between is 'useless',
'potentially harmful' and 'a bad idea' when the main stop is working
properly. It's simply wrong to think like that.

[*] With this test I don't mean to advocate leaving out the charset in
the HTTP header. I have been saying from the beginning that both <meta>
and the HTTP header should be used.
 
Thomas 'PointedEars' Lahn

Bart said:
Yes they will.

No, *it* won't.
Convince yourself:

http://www.dotinternet.be/temp/t1.pl

No charset in HTTP-header, only set in <meta> [*].

You want to meditate on the word "alone". You are trying to prove something
here that I did not even debate.
Yes, that rule must always be followed.

Therefore a declaration *alone* is not sufficient.
True = HTTP-headers are more important than <meta>.
Not true = only HTTP-headers are important.

You should read what I wrote, not what you wanted me to have written.
My gas line has two valves: the main stop and a smaller stop in
between. You are saying that the smaller stop in between is 'useless',
'potentially harmful' and 'a bad idea' when the main stop is working
properly. It's simply wrong to think like that.

Not everything that is limping is an analogy. It *is* harmful to think that
a character set declaration in properly served markup, i.e. markup that is
served with a proper Content-Type header, would have any meaning. It is far
better to know about the importance of the HTTP headers and a BOM, and
forego the notion of an inline character set declaration entirely:

In properly served HTML, i.e. HTML served with a Content-Type header whose
`charset' parameter declares the encoding, the HTTP header takes
precedence over any other declaration, including the corresponding
meta[http-equiv] element.

In properly served XHTML (as application/xhtml+xml or another MIME media
type that triggers the XML parser), a `meta' element character set
declaration has no meaning at all; the markup has to be well-formed well
*before* parsing reaches that line.

In improperly served XHTML (as text/html), an XML declaration prior to the
DOCTYPE declaration forces the HTML user agent then in use into
Quirks/Compatibility mode. But the XML declaration is unnecessary with
UTF-8, as that is the default, or UTF-16, as a BOM can take care of that.

In any of the aforementioned cases, there is no guarantee that a user agent
honors the XML declaration or `meta' element at all, much in contrast to the
HTTP Content-Type header.
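
The precedence described above for markup served as text/html can be summarised in a small function. This is only an illustration of the rule as stated in this post, not of any browser's actual detection algorithm; the function name and the `fallback` parameter are hypothetical.

```javascript
// Hypothetical illustration: which declaration wins for text/html.
// httpCharset: charset parameter of the HTTP Content-Type header, or null.
// metaCharset: charset from <meta http-equiv="Content-Type">, or null.
// fallback:    what the user agent assumes when nothing is declared.
function effectiveCharset(httpCharset, metaCharset, fallback) {
  if (httpCharset) {
    return httpCharset;   // the HTTP header takes precedence
  }
  if (metaCharset) {
    return metaCharset;   // consulted only when the header has no charset
  }
  return fallback;
}

console.log(effectiveCharset("UTF-8", "ISO-8859-1", "ISO-8859-1"));
console.log(effectiveCharset(null, "ISO-8859-1", "UTF-8"));
```

The first call shows the failure mode discussed in this thread: if the header says UTF-8 while the file is really ISO-8859-1, the meta element cannot rescue it.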


PointedEars
 
Bart Van der Donck

That's an impressive castle in the air you have built, but IMHO you don't
see the power of simplicity here. The rule of thumb is clear and simple:
(1) Set a charset-parameter in the HTTP Content-Type header.
(2) Set the same character set inside the XML/HTML file.

Nothing more to it. Will always be okay.

There is a practical concern, though. A web designer can hardly be expected
to define an HTTP Content-Type header separately for every .htm(l) page.
This would need to be done on a per-file basis in a configuration file
(.htaccess, httpd.conf and the like), because the desired character set
might differ between pages.

The same goes for the web server: it cannot set one rule for all files, so
it sets none, thereby passing the responsibility for the character set
to the <meta> element in question. I don't think this is such a bad
strategy. As my test page above shows, this works correctly across
browsers.
 
Eric B. Bednarz

Bart Van der Donck said:
Thomas 'PointedEars' Lahn wrote:

Oh but that’s wrong. Although the default character set for HTML is ISO
10646, you could specify a different one in the SGML declaration on a
per file basis.

(as the Dutch say, I’m pretending my nose bleeds :)

You couldn't do that with XHTML though.
Same goes for the web server; it cannot set 1 rule here; so it just
sets none; and thus passing the responsibility for the character set
to the <meta>-element in question. I don't think this is a such a bad
strategy. As my test page above shows, this works correctly across
browsers.

If you are lucky; a couple of my UTF-8 encoded sites at one particular
host only avoided breaking when the host upgraded to Apache 2 because the
encoding had always been sent explicitly at the HTTP level.

IOW, your strategy is only safe if you administer the server.

Another problem is maintenance; I usually generate the HTTP encoding and
the META element value from the same configuration, and that cannot
really hurt. A good reason for actually including the META element is
because the HTTP encoding might get lost (e.g. by saving the file to
disk, or a dumb proxy).
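
Generating both declarations from one configuration value might look like the sketch below. The `siteConfig` object and both function names are hypothetical; the point is only that the encoding is written down once, so the header and the META element cannot drift apart.

```javascript
// Hypothetical site configuration: the encoding is stated exactly once.
var siteConfig = { mediaType: "text/html", charset: "UTF-8" };

// Value for the HTTP Content-Type response header.
function contentTypeValue(config) {
  return config.mediaType + "; charset=" + config.charset;
}

// Matching <meta> element for the document head, built from the same value.
function metaElement(config) {
  return '<meta http-equiv="Content-Type" content="' +
    contentTypeValue(config) + '">';
}

console.log(contentTypeValue(siteConfig));
console.log(metaElement(siteConfig));
```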

How is that related to javascript? ;-)
 
Thomas 'PointedEars' Lahn

Eric said:
Oh but that’s wrong.

It is not. In HTML the Content-Type HTTP header is specified to take
precedence and it is implemented so.
Although the default character set for HTML is ISO 10646,

You are confusing the Document Character Set and the document encoding,
whereas the former, which is ISO 10646 or UCS-2 (like Unicode 4.0) only
describes the range of characters that can be displayed with an HTML document.

In practice, the default character encoding for HTML documents is
ISO-8859-1, as that is the HTTP 1.0 default. However, the HTML 4.01
Specification recommends declaring the character encoding explicitly anyway.
you could specify a different one in the SGML declaration on a
per file basis.

You could not. Although HTML is defined as an application of SGML, SGML
features such as those are not allowed in a Valid HTML document. That is also
why you cannot use SGML declarations in HTML in order to declare an internal
subset that would declare previously undefined elements and attributes in
order to make the markup Valid.

I was referring to `<meta http-equiv="Content-Type" ...>', though, which
is rendered meaningless if an HTTP `Content-Type' header with a `charset'
parameter is present.
[...]
You couldn't do that with XHTML though.

It is exactly XHTML or any other application of XML where you could do that,
with an XML declaration --

<?xml version="1.0" encoding="UTF-7"?>

-- or a Byte Order Mark.

However, at least the XML declaration would be ignored when served as
text/html to a tag-soup parser, and forces MSHTML into Compatibility mode.
I'm not sure about the BOM with text/html, but I think it is error-prone to
rely on it.

ISTM that the XML 1.0 Specification, Fourth Edition, does not clearly state
whether or not the HTTP Content-Type header should take precedence as it
does in HTML; it only states that if there is no external character encoding
information (such as this header) the document entity has to provide that
information in a text declaration (see above) or the parse result would be a
fatal error.
[...]
How is that related to javascript? ;-)

See the Subject header.


PointedEars
 
Eric B. Bednarz

Thomas 'PointedEars' Lahn said:
Eric B. Bednarz wrote:
You are confusing the Document Character Set and the document
encoding,

I am not confusing anything. If you want to say HTTP charset parameter,
you could just do so. Or say encoding, because that's less ambiguous.
After all, charset is like referer: it's too old to be fixed. In the SGML
declaration, CHARSET means document character set, by the way (and in
the public identifier of a document type declaration, DTD means public
text class ‘document type declaration subset’, not ‘document type
definition’; so much jargon, so few shoulders to stand on :).

OTOH, I was just joking. When comparing the comments you make and your
reactions to the comments you get, you seem to be pretty undecided on
pedantry for its own sake after all. Good :)
whereas the former, which is ISO 10646 or UCS-2 (like Unicode 4.0) only
describes the range of characters that can be displayed with an HTML document.

Nonsense. It describes the range of legal SGML characters an SGML parser
has to be able to deal with. Displaying is the job of the application;
the latter might even be able to deal with non-SGML characters. Why else
would it be valid to use character references to non-SGML characters in
the document instance set?
In practice, the default character encoding for HTML documents is
ISO-8859-1,

LOL. In *practice*, it's Windows 1252, at least with a western European
locale. Worked for me on Mac OS 9, several GNU/Linux distributions, OS X,
maybe even Windows (<- heads up, joke). Supposedly for many other people
too, even on Solaris, but that's just hearsay.
as that is the HTTP 1.0 default.

Engineering tends to gravitate towards either reality or the bit bucket.
You could not.

Zu Befehl!
Although HTML is defined as an application of SGML, SGML
features as those are not allowed in a Valid HTML document.

You should try reading ISO 8879 one day, to find out how it defines a
conforming application of SGML.
That is also
why you cannot use SGML declarations in HTML in order to declare an internal
subset

The primary reason that I cannot do that is not because you say so
(sorry about that), but because the document type declaration subset is
located in the document type declaration, not the SGML declaration.
that would declare previously undefined elements and attributes in
order to make the markup Valid.

You are half right; because actual UAs don't support SGML, I've always
done that in the external subset (‘only five lines’).

<http://lists.w3.org/Archives/Public/www-validator/2006Sep/0010.html>

(the details of w3c validation service output are like the seasons,
subject to arbitrary change :)
 
Thomas 'PointedEars' Lahn

Eric said:
I am not confusing anything.

Yes, you are. The Document Character Set is the set of characters that can
be displayed with a document. On the other hand, the character encoding is
how these characters or references thereto are encoded. For example,
you can use each of the encodings US-ASCII, ISO-8859-1, UTF-7, UTF-8,
UTF-16LE, UTF-16BE, and UTF-32, among others, to encode HTML source code
that is used to display characters in UCS-2; the character entity reference
&hellip; or &#8230;, one of its corresponding character references, requires
only US-ASCII as an encoding to be used, but it refers to a character in
UCS-2 and therefore requires UCS-2 as DCS to be displayed.

http://www.w3.org/TR/html401/charset.html
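
The point that a character reference needs only US-ASCII can be turned into a practical workaround: emit every non-ASCII character as a numeric character reference, so the fragment survives whatever encoding the transport assumes. The helper below is a hypothetical sketch; it would have to run wherever the text is still correctly decoded (for example when the files are generated), not on a responseText that has already been misread.

```javascript
// Hypothetical helper: replace every non-ASCII character with a numeric
// character reference so the markup itself needs only US-ASCII bytes.
function toNumericReferences(text) {
  // The "u" flag makes the pattern match whole code points, so characters
  // outside the Basic Multilingual Plane are not split into surrogates.
  return text.replace(/[^\x00-\x7F]/gu, function (ch) {
    return "&#" + ch.codePointAt(0) + ";";
  });
}

console.log(toNumericReferences("rabbit\u00B4s \u2026"));
// prints "rabbit&#180;s &#8230;"
```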
If you want to say HTTP charset parameter, you could just do.

I could, nevertheless this is but a synonym for the former, and a
misleading/confusing one when talking about the differences between the DCS
and the character encoding. You want to leave it to me which synonym I choose.
Or say encoding, because that's less ambiguous.

That would be wrong in this context, because it is not a synonym for the
former. The character encoding of a document resource may differ from what
was declared, which is the entire point of this thread.
After all charset is like referer, it's too old to be fixed.

You are not making sense.
In the SGML declaration CHARSET means document character set, by the way
(and in the public identifier of a document type declaration, DTD means
public text class ‘document type declaration subset’, not ‘document type
definition’; so much jargon, so little shoulders to stand on :).

You are only confusing more things.

http://www.w3.org/TR/html401/sgml/dtd.html
OTOH, I was just joking. When comparing comments you make and your
reactions on comments you get, you seem to be pretty undecided on
pedantery for its own sake after all. Good :)

I know what I am talking about, whereas you obviously know only half the
things you are talking about. However unfortunate, the latter is not bad in
itself. But it becomes bad when it causes you to add just more confusion to
an already hard-to-explain issue, and to give bad advice.
Nonsense. It describes the range of legal SGML characters an SGML parser
has to be able to deal with.

There is no contradiction, that is included in "can be displayed with an
HTML document". "to display" in English does not have the sole meaning of
"to show on a screen" (compare "dargestellt durch" in German).
Displaying is the job of the application; the latter might even be able
to deal with non-SGML characters. Why else would it be valid to use
character references to non-SGML characters in the document instance set?

Using character references to represent a character (of the DCS) is included
in "can be displayed with an HTML document".
LOL. In *practice*, it's Windows 1252, at least with a western European
locale. Worked for me on Mac OS 9, several GNU/Linux distributions, OS X,
maybe even Windows (<- heads up, joke). Supposedly for many other people
too, even on Solaris, but that's just hearsaying.

Nonsense. You really want to read the Specification about this:

http://www.w3.org/TR/html401/charset.html#h-5.2.2
Engineering tends to gravitate towards either reality or the bit bucket.

See above.
Zu Befehl!

Don't be ridiculous. This was a statement, not a command. The explanation
for it followed below.
You should try reading ISO 8879 one day, to find out how it defines a
conforming application of SGML.

You should read the HTML 4.01 Specification more thoroughly, and test your
extended documents in some user agents once in a while (BTDT). What you
have failed to observe to date is that the Specification prose is normative,
too, except in places where it is marked as informative. HTML is neither
solely defined by its DTD(s) nor is it implemented that way.

http://www.w3.org/TR/html401/intro/sgmltut.html#h-3.2
The primary reason that I cannot do that is not because you say so (sorry
about that), but because the document type declaration subset is located
in the document type declaration, not the SGML declaration.

Nonsense. The document of an SGML application usually may contain the
declaration of an internal subset in its DOCTYPE declaration:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN"
"http://www.w3.org/TR/html4/loose.dtd"
[
<!ATTLIST img
onload CDATA #IMPLIED]>

But that is not allowed in HTML:

http://www.w3.org/TR/html401/struct/global.html#h-7.2
You are half right; because actual UAs don't support SGML, I've always
done that in the external subset (‘only five lines’).

<http://lists.w3.org/Archives/Public/www-validator/2006Sep/0010.html>

(the details of w3c validation service output are like the seasons,
subject to arbitrary change :)

Only that this is no longer HTML as well, and therefore the least you can
expect is that user agents use Quirks mode:

,<http://validator.w3.org/check?uri=http://bednarz.nl/tmp/nobr/&ss=1>
|
| [...]
|
| Potential Issues
|
| The following missing or conflicting information caused the validator
| to perform guesswork prior to validation. If the guess or fallback is
| incorrect, it may make validation results entirely incoherent. It is
| highly recommended to check these potential issues, and, if necessary,
| fix them and re-validate the document.
|
| /!\ Warning Unable to Determine Parse Mode!
|
| The validator can process documents either as XML (for document types
| such as XHTML, SVG, etc.) or SGML (for HTML 4.01 and prior versions).
| For this document, the information available was not sufficient
| to determine the parsing mode unambiguously, because:
|
| * the MIME Media Type (text/html) can be used for XML or SGML
| document types
| * the Document Type (http://bednarz.nl/tmp/nobr/www.dtd) is not
| in the validator's catalog
| * No XML declaration (e.g <?xml version="1.0"?>) could be found
| at the beginning of the document.
|
| As a default, the validator is falling back to SGML mode.
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

PointedEars
 
Bart Van der Donck

Eric said:
LOL. In *practice*, it's Windows 1252, at least with a western European
locale. Worked for me on Mac OS 9, several GNU/Linux distributions, OS X,
maybe even Windows (<- heads up, joke). Supposedly for many other people
too, even on Solaris, but that's just hearsaying.

ISO-8859-1 has always been the (historical) HTTP default character
encoding. But you have a point that in practice there are (quite recent)
additions. The Euro symbol is a nice example: it doesn't exist in
ISO-8859-1, but it doesn't cause many problems anymore today. But stating
that Windows-1252 is the default charset would be a bridge too far for me.

http://www.google.com/search?q=windows+ANSI+misnomer
http://en.wikipedia.org/wiki/Windows-1252
http://en.wikipedia.org/wiki/ANSI
http://en.wikipedia.org/wiki/ISO/IEC_8859-15
http://en.wikipedia.org/wiki/ISO/IEC_8859-1
 
Eric B. Bednarz

Thomas 'PointedEars' Lahn said:
I know what I am talking about, whereas you obviously know only half the
things you are talking about.

“There’s just no nice way to say this…”
© 2004 Tim Bray
 
