<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">?

TheKeith

Dreamweaver puts that tag in the head section of every page. Can someone
tell me what it's for and if it's needed? Thanks.
 
brucie

<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">
Dreamweaver puts that tag in the head section of every page?

1. stop DW from doing it and
2. configure your server to do it.

in a .htaccess file:

AddType text/html;charset=ISO-8859-1 .html

Can someone tell me what it's for

the file content type and character encoding

4.3 The text/html content type
http://www.w3.org/TR/html401/conform.html#h-4.3

5.2.2 Specifying the character encoding
http://www.w3.org/TR/html401/charset.html#h-5.2.2

and if it's needed?

yes
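
To make "content type and character encoding" concrete, here is a minimal Python sketch of how a client might split a Content-Type value, such as the one the AddType line above is meant to produce, into its media type and charset parameter. The helper name `parse_content_type` is made up for this example; it relies on the standard library's `email.message.Message`, and is only one way to do this.

```python
from email.message import Message


def parse_content_type(value):
    """Split a Content-Type header value into (media type, charset or None)."""
    msg = Message()
    msg["Content-Type"] = value
    # get_content_charset() returns the charset parameter lowercased,
    # or None when the header carries no charset parameter.
    return msg.get_content_type(), msg.get_content_charset()


if __name__ == "__main__":
    print(parse_content_type("text/html; charset=ISO-8859-1"))
```

The charset comes back as an optional parameter alongside the type, which matches how the header is laid out.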
 
Micah Cowan

brucie said:
1. stop DW from doing it and
2. configure your server to do it.

in a .htaccess file:

AddType text/html;charset=ISO-8859-1 .html

Why? I understand the preference that it be set in the actual
HTTP headers; but isn't it helpful to have the same information
within the document itself? That way encoding information is
preserved when (say) the document is saved to the viewer's
system: the browser doesn't get HTTP headers when that document
is opened for viewing later, so the <meta> tag would seem handy
for that case.
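
As a rough illustration of the saved-file case, the Python sketch below looks for a charset in a <meta> tag near the start of a locally stored page, where no HTTP headers are available. The function name `sniff_meta_charset`, the 1024-byte limit, the sample page, and the UTF-8 fallback are all assumptions made for this example; real browsers use a more involved sniffing procedure.

```python
import re

# Deliberately loose pattern: finds charset=... inside a <meta ...> tag.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)


def sniff_meta_charset(data, limit=1024):
    """Return the charset named in a <meta> tag near the start of data, or None."""
    match = META_CHARSET.search(data[:limit])
    return match.group(1).decode("ascii").lower() if match else None


# A hypothetical page saved to disk: the bytes are all the reader has.
saved_page = (
    b"<html><head>"
    b'<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1">'
    b"<title>caf\xe9</title></head><body></body></html>"
)

# Fall back to UTF-8 (an arbitrary choice for this sketch) if nothing is declared.
charset = sniff_meta_charset(saved_page) or "utf-8"
text = saved_page.decode(charset)
```

Without the <meta> tag, the byte 0xE9 in the title would have to be guessed at.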

BTW, I prefer

AddCharset ISO-8859-1 .html

The above feels a little like cheating (i.e., it's really not
part of the "type"...)

-Micah
 
brucie

Why? I understand the preference that it be set in the actual
HTTP headers; but isn't it helpful to have the same information
within the document itself?

the UA is under no obligation to take any notice of meta data.

BTW, I prefer
AddCharset ISO-8859-1 .html
The above feels a little like cheating (i.e., it's really not
part of the "type"...)

the charset is a valid optional parameter of the text/html content
type.
 
Micah Cowan

brucie said:
the UA is under no obligation to take any notice of meta data.

Yes, but is this a valid reason to

?

My preference is to specify it in as many locations as
possible. Since I generally use XHTML, I have it in the XML
declaration, in the meta tags and in the proper HTTP headers.

the charset is a valid optional parameter of the text/html content
type.

Yes, but that's irrelevant. "AddType" isn't an HTTP Content-Type
header; it's an Apache directive which tells it *part* of how it
should formulate one. The apache documentation says it should be
used to specify the MIME-Type; it doesn't say it should be used
to specify parameters to the Content-Type header (doesn't even
say it's possible). OTOH, it specifically provides a directive to
specify that parameter in the "AddCharset" directive; so why not
use that, as that's what it's clearly intended for?

Just my 2¢, naturally
-Micah
 
Isofarro

Micah said:

There are no practical benefits for them to suddenly start parsing HTML just
to extract a content type (extended or otherwise). All the better to keep
it in the rightful place in the HTTP header.

Please read the rest of what I said in the message you snipped, and
then answer if you like.

Doesn't mention or clarify why you consider it a valid reason for User
Agents to take notice of meta-data, so no further quotage is needed. There
are no questions relevant to that issue in the snipped material.
 
Micah Cowan

Isofarro said:
There are no practical benefits for them to suddenly start parsing HTML just
to extract a content type (extended or otherwise). All the better to keep
it in the rightful place in the HTTP header.

I'm not expecting them to: neither have I in this thread, even
once, suggested that the HTTP header is not the most appropriate
place for the content type to be specified. What I *did* say is
that I don't understand why brucie would recommend that we *not*
also specify it in a <meta> tag, as this provides useful
information to user agents which *do* parse HTML, and which (if, e.g.,
loading from local store) would not be likely to have the helpful
context that HTTP headers provide.
 
Isofarro

Micah said:
What I *did* say is
that I don't understand why brucie would recommend that we *not*
also specify it in a <meta> tag, as this provides useful
information to user agents which *do* parse HTML,

To the _detriment_ of user-agents that don't.
 
Toby A Inkster

Isofarro said:
To the _detriment_ of user-agents that don't.

How can it have a detrimental effect on user agents if they don't even
parse the HTML? Please explain.

Wget, for example, doesn't parse HTML (unless it's in recursive mode).
Apart from a *slight* increase in file size, what *detrimental* effect
does the <meta http-equiv="Content-Type"> tag have?
 
Daniel R. Tobias

Toby A Inkster said:
Wget, for example, doesn't parse HTML (unless it's in recursive mode).
Apart from a *slight* increase in file size, what *detrimental* effect
does the <meta http-equiv="Content-Type"> tag have?

There's the infamous "Netscape Burp", where Netscape 4.x browsers will
display a page containing such a meta tag fully, then immediately
clear the window and re-render it, sometimes a fairly slow process if
the page is big and complex. Back when I was using NS 4.x as my
primary browser (before switching to Mozilla, which I use now), I
developed an extremely strong hatred for this meta tag and the site
developers who use it.
 
Isofarro

Toby said:
How can it have a detrimental effect on user agents if they don't even
parse the HTML? Please explain.

Take a UA that indexes text/html files. If they've got to parse the Entity
Body every single time just to figure out what the correct content type is,
that means on an update cycle, the entire resource (be it an html document,
image, mp3, divx) has to be requested on each update cycle, since the
information won't be in the Response header on a HEAD request. That alone
breaks lots of tools.

Now, in the scenario within this subthread, an indexer interested only in
Western Latin character set material (as opposed to DBCS character set
material) is going to have to expend a significant chunk of bandwidth and
time downloading and parsing HTML resources that are not going to be
indexed (as opposed to a HEAD request: wrong character set, next URL).
That's a massive waste of resources, resulting in a slower update cycle and
higher resource usage.

Wget, for example, doesn't parse HTML (unless it's in recursive mode).
Apart from a *slight* increase in file size, what *detrimental* effect
does the <meta http-equiv="Content-Type"> tag have?

The detrimental effects occur when the content type is _only_ available in
the meta element. Where an otherwise HEAD only request is sufficient to
check whether a resource has changed before requesting, this is now no
longer possible and the entire resource has to be downloaded before the
correct determination can be made.
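
The HEAD-only check described above might look roughly like the Python sketch below. It takes the headers of a HEAD response as a plain dict and decides whether the body is worth fetching. The function name `worth_downloading` and the `WANTED_CHARSETS` policy are hypothetical, chosen only to mirror the "Western Latin only" indexer in this post.

```python
from email.message import Message

# Hypothetical policy for an indexer that only wants Western Latin pages.
WANTED_CHARSETS = {"iso-8859-1", "us-ascii", "windows-1252"}


def worth_downloading(head_headers):
    """Decide from HEAD response headers alone whether to GET the body."""
    msg = Message()
    msg["Content-Type"] = head_headers.get(
        "Content-Type", "application/octet-stream"
    )
    if msg.get_content_type() != "text/html":
        return False  # not HTML: skip without downloading anything
    charset = msg.get_content_charset()
    if charset is None:
        # The header is silent, so only the body (e.g. a <meta> tag) can
        # tell us; the indexer is forced to download the resource.
        return True
    return charset in WANTED_CHARSETS
```

When the charset is present in the header, a wrong-charset page is rejected without a GET; when it lives only in the document, the sketch has no choice but to return True and pay for the download.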
 
Micah Cowan

Isofarro said:
Take a UA that indexes text/html files. If they've got to parse the Entity
Body every single time just to figure out what the correct content type is,
that means on an update cycle, the entire resource (be it an html document,
image, mp3, divx) has to be requested on each update cycle, since the
information won't be in the Response header on a HEAD request. That alone
breaks lots of tools.

*Finally* someone begins to come close to actually *answering*
the question I posed in the first place, rather than offering
a lot of knee-jerk responses from people who clearly didn't read
thoroughly.

However... this is a ridiculous assertion. You already know it's
text/html before you even begin parsing it (or at the very least,
by the time you've read the DOCTYPE declaration). The meta tag
will not and cannot change this: everyone knows the only reason
to specify the Content-Type header in a <meta/> tag in the first
place is to indicate the character encoding; a practice
sanctioned by the spec. And the presence of <meta/>, or lack
thereof, will not significantly impact the performance of a tool
which has decided it has to put itself through this torture
anyway.

Now, in the scenario within this subthread, an indexer interested only in
Western Latin character set material (as opposed to DBCS character set
material) is going to have to expend a significant chunk of bandwidth and
time downloading and parsing HTML resources that are not going to be
indexed (as opposed to a HEAD request: wrong character set, next URL).
That's a massive waste of resources, resulting in a slower update cycle and
higher resource usage.

Again, if you're doing *both* (setting the appropriate HTTP
headers *and* setting the meta-tag equivalent), this is
completely irrelevant.

The detrimental effects occur when the content type is _only_ available in
the meta element.

And therefore still does not address the question at hand.

Once again, I'm *not* asking why we should set the relevant HTTP
headers: I can't imagine anyone thinking this is a bad thing. My
question is, why discourage replication of the very same
information (in particular, the character encoding via the

Where an otherwise HEAD only request is sufficient to
check whether a resource has changed before requesting, this is now no
longer possible and the entire resource has to be downloaded before the
correct determination can be made.

Again, you are addressing scenarios which you, and not I, have
drummed up. This has nothing to do with the topic at hand.
 
Toby A Inkster

Isofarro said:
Take a UA that indexes text/html files. If they've got to parse the
Entity Body every single time just to figure out what the correct
content type is

Why can't they just look at the HTTP header?

The detrimental effects occur when the content type is _only_ available in
the meta element.

Which nobody has suggested as a wise course of action -- this entire
thread has been about specifying the character set in both a <meta> tag
*and* the HTTP header.

Micah Cowan wrote:
| My preference is to specify it in as many locations as
| possible. Since I generally use XHTML, I have it in the XML
| declaration, in the meta tags and in the proper HTTP headers.

and then:
| I'm not expecting them to: neither have I in this thread, even
| once, suggested that the HTTP header is not the most appropriate
| place for the content type to be specified.

Micah and I have been arguing that it is sensible to specify the character
set in <meta> *and* HTTP because then:

* if a user agent has access to the HTTP header, but not the HTML file, or
doesn't want to parse the HTML file, it can get the character set from the
HTTP header.

* if a user agent has access to the HTTP header and parses the HTML file,
then it can get the character set from either. If the two character sets
disagree, it might have to make an educated guess.

* if a user agent does not have access to the HTTP header (e.g. it
operates on locally saved files), then it can parse the HTML and find the
character set there.

So all user agents have access to the charset.
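
The three cases above can be sketched in Python as a single lookup that tries the HTTP header first and falls back to the <meta> tag. This is only an illustration: the name `resolve_charset`, the 1024-byte scan limit, and the ISO-8859-1 default are assumptions made here. Preferring the header when both are present follows the priority order given in HTML 4.01 section 5.2.2 (linked earlier in the thread) rather than an educated guess.

```python
import re
from email.message import Message

# Deliberately loose pattern: finds charset=... inside a <meta ...> tag.
META_CHARSET = re.compile(
    rb'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9._:-]+)', re.IGNORECASE
)


def resolve_charset(http_content_type=None, body=b"", default="iso-8859-1"):
    """Return (charset, source): HTTP header first, then <meta>, then default."""
    if http_content_type:
        msg = Message()
        msg["Content-Type"] = http_content_type
        charset = msg.get_content_charset()
        if charset:
            return charset, "http"
    # No usable header (e.g. a locally saved file): look inside the document.
    match = META_CHARSET.search(body[:1024])
    if match:
        return match.group(1).decode("ascii").lower(), "meta"
    # Neither source declared anything; the default is this sketch's assumption.
    return default, "default"
```

For example, `resolve_charset(None, saved_bytes)` covers the saved-file case, while `resolve_charset(header_value)` covers a user agent that never parses the HTML.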
 
