XHTML and how to do it right

Thomas Mlynarczyk

Hi,

I want to write a site in XHTML 1.0 Strict, but it should, of course, also
work with older browsers that don't know about XML yet. I have found
information on the subject, but some of the sources contradict each other,
or it is not quite clear what one should and should not do: for example,
whether the XML declaration (the <?xml ... ?> prolog) is necessary, how and
where to specify the character encoding, whether to serve the page as
"text/html" or not, and which file extension to use.

So, how do I do this right?

Greetings,
Thomas
 
William Tasso

Thomas said:
I want to write a site in XHTML1.0 Strict,

may I be so bold as to enquire why?
but it should - of course
- also work with older browsers who don't know about XML yet.

yet? how's that gonna work? if they don't support it now they are not going
to. maybe a later version might, but the old software will remain broken
(for want of a better word)
...
So, how do I do this right?

HTML 4.01 strict
 
Thomas Mlynarczyk

Thus spoke William Tasso:
yet? how's that gonna work? if they don't support it now they are
not going to. maybe a later version might, but the old software will
remain broken (for want of a better word)

Maybe I should have said that it should be *compatible* with older browsers.
HTML 4.01 strict

The more I think about it, this may be indeed the best solution. Still,
isn't XHTML supposed to be "the future" of the WWW? So wouldn't it be a good
idea to make use of XML, while somehow providing a "graceful degradation"
for older browsers (similar to using CSS which can be ignored by old
browsers)?
 
DU

Thomas said:
Hi,

I want to write a site in XHTML1.0 Strict, but it should - of course - also
work with older browsers who don't know about XML yet. Now I have found
information on that subject, but some of the sources contradict each other
or it is not quite clear what one should do and what not. For example, the
necessity of the <xml>-Prolog

The XML declaration (<?xml ... ?>) is optional but nevertheless strongly
recommended. I do not include it myself, because it interferes with the
ability to trigger MSIE 6 for Windows to render a page in
standards-compliant mode.

, the question of the charset encoding (and
where to specify it) and whether to serve the page as "text/html" or not and
which file extension should be used.

So, how do I do this right?

Greetings,
Thomas

If you're going to write your site in XHTML 1.0 Strict, then I recommend
the following:

- Serve the page as "text/html" so that MSIE 5+ can access and render
the pages without problems.

- Avoid the XML declaration so that MSIE 6 for Windows can be triggered to
render the document in standards-compliant mode. This is important for
several reasons:
a) MSIE 6 for Windows will implement the CSS1 box model correctly, along
with many other bug fixes and corrections;
b) you will greatly reduce browser-specific code, and your page layout will
look much closer to what other highly standards-compliant browsers
(Opera 7.x, Mozilla 1.3+, Safari 1.1, Konqueror 3.x, etc.) render;
c) the pages will be parsed and rendered more quickly.

- You can specify the character encoding in a meta element such as
<meta http-equiv="Content-Type" content="text/html; charset=iso-8859-1" />
and in the other ways described by the W3C validator. When the server is
running Apache, the best way is via an .htaccess file; ask others about
this if you don't know how.

http://www.w3.org/International/O-HTTP-charset.html

http://httpd.apache.org/docs/content-negotiation.html

I recommend you invite - somehow - your visitors using an old browser to
upgrade: there is nothing wrong with this and I deeply believe it is in
their own best interests to upgrade their browsers. E.g. NS 4 users
are using a browser which was designed more than 6 years ago.

You can also adopt a flexible, mixed approach: code all your pages in
HTML 4.01 Strict, but keep the code as close as possible to XHTML 1.0
Strict or XHTML 1.1 so that it is ready to convert, by
- avoiding name attributes on all elements that must not use them in
XHTML 1.x (form, a, map, etc.),
- quoting all attribute values,
- etc.
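
As a rough illustration of the mixed approach described above, here is a minimal sketch of an HTML 4.01 Strict page written so that it is easy to convert later. The file names (styles.css, search.cgi) and the id values are made up for the example and are not from the original post.

```html
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN"
  "http://www.w3.org/TR/html4/strict.dtd">
<html lang="de">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1">
<title>This is the title</title>
<link rel="stylesheet" type="text/css" href="styles.css">
</head>
<body>
<!-- Lowercase element and attribute names, every attribute value quoted,
     every non-empty element explicitly closed. -->
<!-- id rather than name on the anchor and the form (hypothetical ids). -->
<h1><a id="top">Heading</a></h1>
<form id="search" action="search.cgi" method="get">
<p><input type="text" name="q" size="20"> <input type="submit" value="Go"></p>
</form>
</body>
</html>
```

Converting a page like this to XHTML 1.0 Strict should then mostly be a matter of swapping the doctype, adding the xmlns attribute, and closing the empty elements (meta, link, input) with " />". Form controls such as input keep their name attribute, since it is what gets submitted with the form.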

The most important thing, in my mind, is to choose a strict document type
definition, because this will trigger standards-compliant rendering mode in
MSIE 6, where the benefits are important: cross-browser compatibility with
highly compliant browsers (Opera 7.x, Mozilla 1.3+, NS 7.1, Safari 1.1,
etc.), speed of parsing and rendering, strict implementation of the
CSS1 box model, etc.

DU
 
Kris

I want to write a site in XHTML1.0 Strict, but it should - of course - also
work with older browsers who don't know about XML yet. Now I have found
information on that subject, but some of the sources contradict each other
or it is not quite clear what one should do and what not. For example, the
necessity of the <xml>-Prolog

The <xml> prolog is optional but nevertheless highly recommended.
Because it would interfere with the ability to trigger MSIE 6 for
windows to render a page in standards compliant rendering mode, then I
do not include it myself.

The XML declaration causes the IE4/Mac browser to render a blank page. Better
to leave the optional thing out; if you do want to trigger IE6's quirks
mode, add an HTML comment before the doctype.
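
One way to check which mode a page ends up in is a small test page like the sketch below. It assumes the browser exposes document.compatMode (IE6 does: "CSS1Compat" for standards mode, "BackCompat" for quirks mode); the page title and the id "mode" are invented for the example.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<title>Rendering mode check</title>
<script type="text/javascript">
// Write the rendering mode reported by the browser into the page.
// Browsers without document.compatMode get a fallback message instead.
window.onload = function () {
  var mode = document.compatMode || "unknown (compatMode not supported)";
  document.getElementById("mode").appendChild(document.createTextNode(mode));
};
</script>
</head>
<body>
<p>Rendering mode: <span id="mode"></span></p>
</body>
</html>
```

With the doctype on the very first line, IE6 should report CSS1Compat. Putting an HTML comment or an XML declaration above the doctype should switch the reported value to BackCompat, which is the effect discussed above.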
 
Thomas Mlynarczyk

Thus spoke DU:
- serve the page as "text/html"
- avoid the xml prolog
- you can specify the charset encoding in a meta element

Thus, the following would be perfectly ok?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de" lang="de">
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />
<title>This is the title</title>
<link rel="stylesheet" type="text/css" href="styles.css" />
</head>
<body>
... contents ...
</body>
</html>

Bookmarked :)
 
DU

Thomas said:
Thus, the following would be perfectly ok?

<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="de" lang="de">

I do not include lang="de" here; if one day I want to convert the file
to XHTML 1.1, then I'm closer/readier that way.
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1" />

I include a few other <meta>'s myself:

<meta http-equiv="Content-Language" content="de" />
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<meta http-equiv="date" content="2004-01-28T09:54:03+08:00" />
<meta http-equiv="imagetoolbar" content="no" />
This one is to avoid the annoyance of the MSIE image toolbar.

<title>This is the title</title>
<link rel="stylesheet" type="text/css" href="styles.css" />

I also define the media type for the stylesheet, but what you do is OK.
And I add the navigation <link rel="...">'s for the site navigation bar of
browsers like Opera, Mozilla, Lynx, iCab and others.

</head>
<body>
... contents ...
</body>



Bookmarked :)

webstandards.org asks the W3C:
"Which should we use, HTML or XHTML, and why?"
http://webstandards.org/learn/askw3c/oct2003.html

DU
 
DU

Andreas said:
No. *Everything* called <meta http-equiv> cannot be perfectly OK
because it is only an ersatz.

True but it's better than nothing.

webstandards.org Asks the W3C
"There are several ways of specifying the character encoding for a
particular document. Which of the following methods (or combination
thereof) does the W3C recommend, and why?"
http://webstandards.org/learn/askw3c/dec2002.html

and the W3C listed the HTML/XHTML meta element as the third option. Of
course, if you can set the Content-Type header on the server, then you're
using the best way. But when you cannot, and when you don't want to trigger
backward-compatible rendering mode in MSIE 6 for Windows...

DU
 
Thomas Mlynarczyk

Thus spoke Andreas Prilop:
No. *Everything* called <meta http-equiv> cannot be perfectly OK
because it is only an ersatz.

I had thought <meta http-equiv> was "equivalent" to the HTTP header and
actually a way to write things into that header when one has no access to
the server's configuration or cannot use things like .htaccess.

If only that was NN's only problem...
 
Thomas Mlynarczyk

Thus spoke DU:
I do not include lang="de" here; if one day I want to convert the file
to XHTML 1.1, then I'm closer/readier that way.

But the W3C says both should be included.
I include a few other <meta>'s myself:

<meta http-equiv="Content-Language" content="de" />

Is this really useful - having already specified the language in the html
element?
<meta http-equiv="Content-Style-Type" content="text/css" />
<meta http-equiv="Content-Script-Type" content="text/javascript" />

<meta http-equiv="date" content="2004-01-28T09:54:03+08:00" />

Yes, that I will certainly add as well as some author info.
<meta http-equiv="imagetoolbar" content="no" />
This one is to avoid the annoyance of the MSIE image toolbar.

Fortunately, this toolbar appears only for images larger than ? x ? (don't
remember), and for those it may not really be that disturbing. Besides, some
people might actually find it useful.
And I add the navigation <link rel="...">'s for the site navigation bar of
browsers like Opera, Mozilla, Lynx, iCab and others.

That I will add too.
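
For reference, the stylesheet media type and the navigation links mentioned above might look roughly like the fragment below, which goes inside the head element. The rel values are link types listed in HTML 4.01; the file names and titles are hypothetical, and which of these links a given browser's navigation bar actually shows varies.

```html
<!-- Stylesheet restricted to screen media -->
<link rel="stylesheet" type="text/css" media="screen" href="styles.css" />

<!-- Navigation links for browsers with a site navigation bar -->
<link rel="start" href="index.html" title="Home" />
<link rel="contents" href="sitemap.html" title="Site map" />
<link rel="prev" href="page1.html" title="Previous page" />
<link rel="next" href="page3.html" title="Next page" />
<link rel="help" href="help.html" title="Help" />
```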
 
DU

Thomas said:
But I've heard that for XHTML, the charset in the meta element will be
ignored...?

I'm not an expert on this question; D. Dorward and Jukka know much more
about it than I do. Apparently (and I could be quite wrong here), the
charset in the meta element can be ignored or honored depending
on how the web server is configured.
At least, the web server could be (or should be?) configured so
that it serves the proper character encoding when it meets a link
coded like this in, say, an English document:

<a href="path/RussianFilename.html" charset="koi8-r"
hreflang="ru">Russian exposition in Moscow</a>

or

<link rel="alternate" type="text/html" href="path/PolishFilename.html"
hreflang="pl" charset="iso-8859-2" lang="pl" title="Some Polish text
description">
so that if you click that link in the Site Navigation toolbar, the
browser should notify in advance the webserver to serve that document in
iso-8859-2 character encoding.

Over the years, I've coded 2 multi-lingual websites (10 languages total
with sub-sites based on languages) involving 5 different character
encodings and this is how I managed to do things. I never had access to
the webserver (not an Apache on top of that) and never included an <xml>
declaration.

DU
 
Steve Pugh

Thomas Mlynarczyk said:
Shouldn't <style type="text/css"> and <script type="text/javascript"> do
just as well?

What stylesheet language is used here: <div style="color: red;">? Now
it looks like CSS, but is it really? Is <div style="text-color: 'red'">
an error or a piece from another stylesheet language?

Likewise, is <div onclick="doStuff()"> JavaScript or some other
scripting language? (There's also the case on when should the
<noscript> be displayed by browsers that understand scripting language
A but not scripting language B - e.g. some versions of Opera got
confused by pages with both JavaScript and VBScript in them.)

If you avoid placing any style or script code inline in the page then
I can't see any need for those meta tags, but they seem to be gaining
popularity so I'm willing to hear the case in their favour.
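
To make the point concrete, here is a small sketch of a page that uses both an inline style and an inline event handler and declares the default languages for them in the head. The doStuff function body and the page title are invented for the example; only the meta elements and the div attributes come from the thread.

```html
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN"
  "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=ISO-8859-1" />
<!-- Default language for style="..." attributes -->
<meta http-equiv="Content-Style-Type" content="text/css" />
<!-- Default language for onclick="..." and other event attributes -->
<meta http-equiv="Content-Script-Type" content="text/javascript" />
<title>Default style and script languages</title>
<script type="text/javascript">
// Hypothetical handler, defined only so the onclick below has a target.
function doStuff() {
  window.alert("clicked");
}
</script>
</head>
<body>
<div style="color: red;" onclick="doStuff()">Click me</div>
</body>
</html>
```

The style and script elements carry their own type attribute, so the two meta elements only matter for the inline attributes on the div. A page with no inline style or script code has nothing for them to describe.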
Fortunately, this toolbar appears only for images larger than ? x ? (don't
remember), and for those it may not really be that disturbing. Besides, some
people might actually find it useful.

Exactly, the image toolbar is part of the browser interface and it
should be up to the user, not the author, to decide whether it's
annoying or not and hence to turn it off or not.

cheers,
Steve
 
DU

Thomas said:
But the W3C says both should be included.

Check again. In XHTML 1.0 strict, you do not have to include a lang
attribute in the html element. In XHTML 1.1, a lang attribute is
forbidden in the html element.
Is this really useful - having already specified the language in the html
element?

Good question. You may have a good point here. It depends on which
declaration each user agent checks for the language, and how. Since I do not
know that in advance for all kinds of software, user agents, devices,
etc., I declare a meta for Content-Language. On the issue of
internationalization of documents and interoperability, I think the
current state of the web has a long way to go.
How do translation web sites, translation web-aware application work
with documents? Hard to say with certainty. What do they look for
exactly in a) HTML 4.01 documents b) XHTML documents when
considering/looking for the language of a document? What about search
engines? etc. What's first or decisive for a translation software
dealing with an HTML 4.01 document? The lang attribute in the html tag
or a meta content-language? or both?
Some websites have full webpage translation features. E.g.:
http://www.nba.com/warriors/free_stuff/language_translation.html
How does WorldLingo....
Shouldn't <style type="text/css"> and <script type="text/javascript"> do
just as well?

Defining the default script language and style type up front, for the
whole document, cannot be wrong. It helps clarify the status of the
document, hopefully serving an interoperability purpose (pre-emption,
parsing, efficiency or the like) across devices and applications.
Maybe it achieves nothing now; maybe it will achieve something worthwhile in
the next generation of browsers, who knows. The doctype declaration used
to serve only a validation purpose a few years ago; now, many browsers
(IE6) use it to define a parsing mode and a rendering mode for the document.
Yes, that I will certainly add as well as some author info.




Fortunately, this toolbar appears only for images larger than ? x ? (don't
remember), and for those it may not really be that disturbing. Besides, some
people might actually find it useful.

I have an opinion on this but your opinion is certainly respectable and
defendable.

DU
 
Thomas Mlynarczyk

Thus spoke DU:

[use both xml:lang and lang]
Check again. In XHTML 1.0 strict, you do not have to include a lang
attribute in the html element. In XHTML 1.1, a lang attribute is
forbidden in the html element.
http://www.w3.org/TR/xhtml1/#C_7
Good question. You may have a good point here. It depends on which
declaration each user agent checks for the language, and how. Since I do
not know that in advance for all kinds of software, user agents,
devices, etc., I declare a meta for Content-Language. On the
issue of internationalization of documents and interoperability, I
think the current state of the web has a long way to go.
How do translation web sites, translation web-aware application work
with documents? Hard to say with certainty. What do they look for
exactly in a) HTML 4.01 documents b) XHTML documents when
considering/looking for the language of a document? What about search
engines? etc. What's first or decisive for a translation software
dealing with an HTML 4.01 document? The lang attribute in the html tag
or a meta content-language? or both?

So I'd better put the language code in as many places as I can, to be on the
safe side...
I have an opinion on this but your opinion is certainly respectable
and defendable.

Well, first I'll write my site and then I can decide if I want to leave the
imagetoolbar or not... If the page contains no images that would trigger the
imagetoolbar then there's no point in preventing it.
 
Thomas Mlynarczyk

Thus spoke Steve Pugh:
What stylesheet language is used here: <div style="color: red;">? Now
it looks like CSS, but is it really? Is <div style="text-color: 'red'">
an error or a piece from another stylesheet language?
Likewise, is <div onclick="doStuff()"> JavaScript or some other
scripting language? (There's also the case on when should the
<noscript> be displayed by browsers that understand scripting language
A but not scripting language B - e.g. some versions of Opera got
confused by pages with both JavaScript and VBScript in them.)

Thanks for this illustration. I see the point now.
 
Thomas Mlynarczyk

Thus spoke Andreas Prilop:
It is an ersatz for the "real thing".

So the browser does not "rewrite" the received header "on the fly" upon
parsing the meta tags but rather *might* choose to use the meta info if it
can't find anything useful in the header?
 
