Ben said:
That's an interesting way of looking at it.
I tend to think of a DTD as an input to a parser. Markup + DTD goes in,
a DOM tree comes out if you're lucky. By the time you've got your DOM
tree you're finished with the HTML and DTD and no longer care about
them.
Thinking that it's a matter of "caring" is an interesting way to look at
it. If it were a matter of "caring", what would be the point of "caring"
in the first place if the user agent immediately lost interest?
Does the user agent no longer care if you create a dozen new elements
all with the same ID and then expect your client-side code to know which
of those items you're referring to?
Does it no longer care if you stick TRs directly inside TDs so that it
no longer knows what the table's structure is and can no longer render
it, and an attached screen reader can no longer read it?
Does it no longer care if you use scripting to nest one hyperlink inside
another or one form inside another and then expect the user agent to
guess what behavior you expect?
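To make the duplicate-ID case concrete, here is a rough sketch in Python. It assumes the standard library's `xml.dom.minidom` as a stand-in for a scripting DOM (a browser's DOM is a separate implementation), and the element names and the `item` ID are made up for illustration. The API accepts three elements with the same ID without complaint, and a lookup by that ID then has three candidates:

```python
from xml.dom import minidom

# Hypothetical document: a <body> root with three <div> children,
# all given the same id on purpose.
doc = minidom.getDOMImplementation().createDocument(None, "body", None)
body = doc.documentElement
for text in ("first", "second", "third"):
    div = doc.createElement("div")
    div.setAttribute("id", "item")
    div.appendChild(doc.createTextNode(text))
    body.appendChild(div)

# Nothing stopped us from building this tree, but "the element whose
# id is item" no longer names a single node.
matches = [el for el in body.getElementsByTagName("div")
           if el.getAttribute("id") == "item"]
print(len(matches))  # 3
```

Which of the three a script "meant" is exactly the guess the user agent is being asked to make.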
If you see it as just a matter of meaningless tags to which you apply
styles, why are you bothering with HTML at all? Just create your own DTD
with tags that have nothing to do with HTML but that represent the
contents of your document, and base your documents on that.
CSS user agents don't care whether the document's structure corresponds
to valid HTML or not-- all they need is a proper tree structure (i.e.
not a "soup" or a graph) and a set of properties for each node. Then
they can render it according to CSS specifications.
You get undefined _parsing_ in browsers if you use invalid HTML. The
consequence is the DOM tree you end up with isn't always the one you
expected. If you construct the DOM tree with scripting, it doesn't
matter whether it corresponds to valid HTML or not-- you will get
exactly the tree you created, you can style it how you like, and it
should be rendered correctly according to the CSS spec.
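The "you get exactly the tree you created" part can be sketched as follows. This is only an illustration using Python's `xml.dom.minidom`, which is an XML DOM and not a browser's HTML DOM, so it says nothing about how any particular browser renders the result. It puts a TR directly inside a TD, which the HTML DTDs do not allow, and serializes the tree back out:

```python
from xml.dom import minidom

# Build the tree by hand instead of parsing markup.
doc = minidom.getDOMImplementation().createDocument(None, "table", None)
table = doc.documentElement
td = doc.createElement("td")
tr = doc.createElement("tr")
td.appendChild(tr)        # TR inside TD: invalid as HTML
table.appendChild(td)     # TD directly inside TABLE: also invalid

# The DOM API performs no DTD validation; the tree is kept as built.
print(table.toxml())  # <table><td><tr/></td></table>
```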
What is the point of valid HTML? So as to get predictable results from
the parsers in browsers,
Validity isn't a prerequisite for document parsing. If a document is
well-formed, it doesn't have to be valid under any DTD for the result to
be predictable. A parser that knows the rules for well-formed (X/SG)ML
style documents can parse
<foo>
<bar att="3">Hello</bar>
</foo>
and create a document object from it with no trouble at all.
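As a quick check of that claim, here is a minimal sketch that assumes Python's standard `xml.dom.minidom`, a non-validating parser, as the "parser that knows the rules". No DOCTYPE or DTD is supplied, and the document object still comes out as expected:

```python
from xml.dom import minidom

# The well-formed fragment from above, with no DOCTYPE and no DTD.
source = '<foo>\n<bar att="3">Hello</bar>\n</foo>'

doc = minidom.parseString(source)
root = doc.documentElement
bar = root.getElementsByTagName("bar")[0]

print(root.tagName)             # foo
print(bar.getAttribute("att"))  # 3
print(bar.firstChild.data)      # Hello
```

If the closing `</foo>` were missing, `parseString` would raise an error instead, which is the well-formedness rule at work rather than any notion of validity.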
> and so as to provide a document that can be
> parsed by other programs for other purposes.
There is nothing in the specifications for browser rendering or
scripting that I have found that requires a document whose structure
corresponds to valid HTML.
Of course not. See foo and bar, above. But if you *tell* the browser
that you want an HTML document, then it should be what you say it is. I
don't understand why you think the user agent needs to be assured that
it's HTML one moment and not the next moment.