ISO-8859-1

Jeff Thies

I've got some content managed pages where content is pasted in from
some other source.

This is causing some characters to appear as "?" marks in IE (Win and
Mac) and little square boxes in Opera.

Changing the character set in Opera to ISO-8859-1 fixes that.

The server is sending out charset=UTF-8 and the page contains this:

<meta http-equiv="content-type" content="text/html;charset=iso-8859-1">

It looks like the server content type is winning out.
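
The symptom can be reproduced outside a browser. Here is a minimal sketch, assuming the pasted text was stored as ISO-8859-1 bytes (the sample string is made up for illustration). Decoding those bytes as UTF-8, as the HTTP header instructs, yields a replacement character, while decoding them as ISO-8859-1 recovers the text.

```python
# Hypothetical sample; the real page content is not shown in this thread.
text = "café"
raw = text.encode("iso-8859-1")  # one byte per character, 0xE9 for the "é"

# What a browser does when the HTTP header says charset=UTF-8:
# the lone 0xE9 byte is not a valid UTF-8 sequence, so it gets replaced.
as_utf8 = raw.decode("utf-8", errors="replace")

# What happens after switching the character set to ISO-8859-1 by hand:
as_latin1 = raw.decode("iso-8859-1")

print(repr(as_utf8))
print(repr(as_latin1))
```

How the replacement is drawn ("?" or a little box) depends on the browser and font.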

Should I change the server charset to iso-8859-1?

Does the case matter in character set declarations?

Jeff
 
David Dorward

Jeff said:
Changing the character set in Opera to ISO-8859-1 fixes that.
The server is sending out charset=UTF-8 and the page contains this:
<meta http-equiv="content-type" content="text/html;charset=iso-8859-1">
It looks like the server content type is winning out.

Yes, the specification says that real HTTP headers trump http-equiv.

Jeff said:
Should I change the server charset to iso-8859-1?

... or convert the document to UTF-8.
 
brucie

In alt.html Jeff Thies said:
The server is sending out charset=UTF-8 and the page contains this:
<meta http-equiv="content-type" content="text/html;charset=iso-8859-1">
It looks like the server content type is winning out.

as it should

<quote>
[...] conforming user agents must observe the following priorities when
determining a document's character encoding (from highest priority to
lowest):
1. An HTTP "charset" parameter in a "Content-Type" field.
2. A META declaration with "http-equiv" set to "Content-Type" and a
value set for "charset".
3. The charset attribute set on an element that designates an external
resource.
</quote> http://www.w3.org/TR/html401/charset.html#h-5.2.2

Jeff Thies said:
Should I change the server charset to iso-8859-1?

yes
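
The priority order quoted above can be sketched roughly as follows. This is only an illustration: `charset_param` and `effective_charset` are hypothetical helper names, and the regular expression is a simplification that ignores quoted parameter values.

```python
import re
from typing import Optional

# Simplified match for the charset parameter of a Content-Type value,
# e.g. "text/html;charset=iso-8859-1". Quoted values are not handled.
_CHARSET_RE = re.compile(r"charset\s*=\s*([\w.:-]+)", re.IGNORECASE)


def charset_param(content_type: Optional[str]) -> Optional[str]:
    """Return the charset parameter of a Content-Type value, if present."""
    if not content_type:
        return None
    match = _CHARSET_RE.search(content_type)
    return match.group(1) if match else None


def effective_charset(
    http_content_type: Optional[str],
    meta_content: Optional[str],
    link_charset: Optional[str] = None,
) -> Optional[str]:
    """Pick a charset following the HTML 4.01 priority order."""
    return (
        charset_param(http_content_type)  # 1. real HTTP header
        or charset_param(meta_content)    # 2. META http-equiv declaration
        or link_charset                   # 3. charset attribute on a linking element
    )


# Jeff's situation: the header says UTF-8, the META element says iso-8859-1.
print(effective_charset("text/html; charset=UTF-8",
                        "text/html;charset=iso-8859-1"))
```

With the header present, the META value is never consulted, which matches what Jeff is seeing.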

Does the case matter in character set declarations?

<quote>
[...] Names for character encodings are case-insensitive, so that for
example "SHIFT_JIS", "Shift_JIS", and "shift_jis" are equivalent.
</quote> http://www.w3.org/TR/html401/charset.html#h-5.2.1
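
Python's codec registry happens to follow the same convention, so it makes a quick illustration (browsers do their own matching, of course; this is only an analogy):

```python
import codecs

# The three spellings from the quoted spec text.
names = ["SHIFT_JIS", "Shift_JIS", "shift_jis"]

# codecs.lookup() ignores case, so every spelling resolves to the same codec.
canonical = {codecs.lookup(name).name for name in names}

print(canonical)  # a set with a single element
```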
 
Eric B. Bednarz

Jeff Thies said:
I've got some content managed pages where content is pasted in from
some other source.

[x] define 'pasted in'

Jeff Thies said:
The server is sending out charset=UTF-8 and the page contains this:

<meta http-equiv="content-type" content="text/html;charset=iso-8859-1">

That's bogus of course, and one of the reasons to avoid that kludge
altogether.

Jeff Thies said:
It looks like the server content type is winning out.

That's the correct behaviour, as already mentioned.

Jeff Thies said:
Should I change the server charset to iso-8859-1?

If everything 'pasted in' is encoded as Latin 1, the answer may be
'yes'. In many scenarios you don't know that in advance, and you should
probably look it up in the documentation of the content management
system (if any exists) or otherwise determine the backend language and
locate a newsgroup that deals with it (depending on several factors,
this can be a not-too-trivial issue).
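
One rough way to find out what the stored content actually is: test whether the bytes decode cleanly as UTF-8. This is a minimal sketch with a hypothetical helper name, and it has limits. Pure ASCII passes as both encodings, and every byte string "decodes" as Latin 1, so only the UTF-8 check tells you anything.

```python
def looks_like_utf8(raw: bytes) -> bool:
    """Return True if raw decodes cleanly as UTF-8 (hypothetical helper)."""
    try:
        raw.decode("utf-8")  # strict by default: raises on invalid sequences
    except UnicodeDecodeError:
        return False
    return True


# Made-up samples: the same word stored in each encoding.
print(looks_like_utf8("café".encode("utf-8")))
print(looks_like_utf8("café".encode("iso-8859-1")))
```

If the check fails for content the server labels as UTF-8, the label and the data disagree, which is the situation described in this thread.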
 
Jim Higson

You could try just converting everything to UTF-8, and then serving it as
such.

The increase in file size will be little to nothing, and it will solve a
lot of headaches.
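
A minimal sketch of such a conversion, assuming the existing content really is ISO-8859-1 (if it is something else, such as a Windows code page, the source encoding has to be changed accordingly). The function names and the sample text are made up for illustration.

```python
from pathlib import Path


def latin1_to_utf8(raw: bytes, source_encoding: str = "iso-8859-1") -> bytes:
    """Re-encode bytes from source_encoding to UTF-8 (hypothetical helper)."""
    return raw.decode(source_encoding).encode("utf-8")


def convert_file(src: Path, dst: Path, source_encoding: str = "iso-8859-1") -> None:
    """Write a UTF-8 copy of src to dst; assumes src really is source_encoding."""
    dst.write_bytes(latin1_to_utf8(src.read_bytes(), source_encoding))


# Size check on a made-up sample: ASCII bytes are unchanged, and each
# non-ASCII Latin-1 character grows from one byte to two.
sample = "café au lait".encode("iso-8859-1")
converted = latin1_to_utf8(sample)
print(len(sample), len(converted))
```

After converting, the META declaration in the page should say UTF-8 as well (or be dropped), so that it no longer contradicts the HTTP header.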
 
Jeff Thies

Jim said:
You could try just converting everything to UTF-8, and then serving it as
such.

The increase in file size will be little to nothing, and it will solve a
lot of headaches.

How do you do that?

Jeff
 
