RobG said:
Why does Firefox insert #text nodes as children of TR elements?
Because they are in the source. For example, if you have
<tr><td>Kibo</td></tr>
then there are no white space text nodes as child nodes of the <tr>
element but as people usually author
<tr>
<td>Kibo</td>
</tr>
if you build a (normalized) DOM then the first child of the <tr> element
is a text node containing only white space.
However, in Firefox 1.0.7
text nodes are inserted between the TDs. I'm certain that this didn't
use to happen with older versions.
I rather doubt that; I don't think such behaviour has changed between
1.0.x releases. I haven't tested now, but I don't think the behaviour is
much different in earlier Mozilla releases.
The HTML specification states that the only element that can be the
child of a TR is a TD, so why does Firefox put text nodes in there?
Well, the browser's tag soup parser does not look at a DTD. So even if you
author your HTML to an HTML 4.01 DTD and declare that at the beginning of
the document, the browser neither checks whether your markup complies
with that DTD nor throws out stuff that is not supposed to be in there
when building a DOM.
If this is how the DOM is supposed to be built, can someone give me a
reference to where it states that?
Whether and when white space should occur in the DOM tree is an
underspecified issue, I think; anyone picturing the tree usually ignores
it.
The current W3C DOM FAQ has this entry
<http://www.w3.org/DOM/faq.html#emptytext> on the issue:
"In XML, all whitespace has to be passed through to the application.
This means that if you have whitespace, such as carriage returns,
between tags in your source file, these have to be passed through, even
if they're just there for pretty-printing. The DOM implementation has to
put this whitespace somewhere, and the only possibility is a text node.
Thus you will get text nodes which look empty, but in fact have a
carriage return or other whitespace in them."
But then it adds: "Note that some DOM implementations, which do not
consider whitespace in element content to be meaningful for the XML
languages they support, discard these whitespace nodes before exposing
the DOM to their users."
So that FAQ acknowledges that implementations differ, which means you
have to deal with the differences if you write code against
different implementations.
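
One way to deal with those differences is to skip the "empty looking"
text nodes yourself. What follows is only a minimal sketch: the function
names isWhitespaceText and significantChildren are made up for
illustration, and the code relies on nothing beyond the nodeType, data,
firstChild and nextSibling properties of the DOM Node interface.

```javascript
// Node type values as defined by the W3C DOM
// (Node.ELEMENT_NODE and Node.TEXT_NODE).
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// True if node is a text node that contains nothing but white space
// (an empty text node counts as white space here as well).
function isWhitespaceText(node) {
  return node.nodeType === TEXT_NODE && !/[^ \t\r\n]/.test(node.data);
}

// Hypothetical helper: collect the children of parent in document
// order, leaving out white-space-only text nodes.
function significantChildren(parent) {
  var result = [];
  for (var child = parent.firstChild; child; child = child.nextSibling) {
    if (!isWhitespaceText(child)) {
      result.push(child);
    }
  }
  return result;
}
```

Called on a <tr> element, significantChildren(row) should give the same
list of cells whether or not the implementation keeps the white space
between the tags.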
The latest core DOM Level 3 in its introduction pictures an example tree
for a table in
<http://www.w3.org/TR/2004/REC-DOM-Level-3-Core-20040407/introduction.html>
and does not show white space text nodes but mentions in the text before
the graphics:
"A graphical representation of the DOM of the example table, with
whitespaces in element content (often abusively called "ignorable
whitespace") removed"
In the end, my experience is that if you script against different
implementations and walk childNodes or use firstChild or lastChild, you
should never expect a certain type of node (e.g. an element node) at a
particular position, as what is an element node in one implementation
might be a text node in another. Checking nodeType and then dealing with
the type found is a safer way.
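
As a rough sketch of that nodeType check, the two functions below walk
from firstChild or nextSibling until they reach an element node. The
names firstElementOf and nextElementOf are made up for this example;
only nodeType, firstChild and nextSibling come from the DOM itself.

```javascript
// Node type value for element nodes (Node.ELEMENT_NODE in the W3C DOM).
var ELEMENT_NODE = 1;

// Hypothetical helper: return the first child of parent that is an
// element node, or null if parent has no element children.
function firstElementOf(parent) {
  var node = parent.firstChild;
  while (node && node.nodeType !== ELEMENT_NODE) {
    node = node.nextSibling;
  }
  return node || null;
}

// Hypothetical helper: return the next sibling of node that is an
// element node, or null if no element follows.
function nextElementOf(node) {
  node = node.nextSibling;
  while (node && node.nodeType !== ELEMENT_NODE) {
    node = node.nextSibling;
  }
  return node || null;
}
```

With these, something like

  for (var cell = firstElementOf(row); cell; cell = nextElementOf(cell))

visits the cells of a row without assuming that row.firstChild is a
<td> element.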