Blank textNodes in the DOM

Daz

Hi everyone.

This might be a straightforward question, but when I open the DOM
Inspector in Firefox, I see that almost every DOM element seems
to have a textNode above it. When I try to see the contents, it shows
as being 'undefined'. Can anyone please explain to me the purpose that
they serve? Perhaps then I will be able to understand them better, as
quite often, when I use firstChild, instead of getting the element I
want, I end up with what appears to be a blank text node.

Many thanks.

Daz.
 
VK

1)
<http://en.wikipedia.org/wiki/Phantom_nodes>
<http://en.wikipedia.org/wiki/Talk:Phantom_nodes>

2)
<https://bugzilla.mozilla.org/show_bug.cgi?id=26179>

3)
<http://developer.mozilla.org/en/docs/DOM:element.firstChild#Notes>

That's a RWAR (<http://mindprod.com/jgloss/rwar.html>) topic, so just
make your own decision and choose from existing workarounds.
 
VK

Daz said:
Aha. Phantom nodes. Thanks VK! :) Looks like I've walked into an rwar
zone...

Yeah, with the first step and right in the middle :)

For the Firefox DOM Inspector itself, you can change the view option in
the DOM Inspector window under View > Show Whitespace Nodes. This
affects the DOM Inspector display *only*; the DOM tree is not changed.

To deal with the ... observed phenomenon ... programmatically, see for
instance:

<http://www.codingforums.com/showthread.php?t=7028>
<http://developer.mozilla.org/en/docs/Whitespace_in_the_DOM>
<http://developer.mozilla.org/en/docs/Talk:Whitespace_in_the_DOM>
 
Frederik Vanderstraeten

The purpose that they serve:
<span>Hi,</span>
<span>how are you?</span>

Should return:
Hi, how are you?

If the white-space were discarded, this would return:
Hi,how are you?
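
To make the point above concrete, here is a minimal sketch that uses plain objects in place of real DOM nodes (so it runs without a browser). The `collectText` helper and the `text` property on the mock spans are made up for this illustration; only the nodeType values (1 for elements, 3 for text) come from the DOM.

```javascript
// Plain objects stand in for DOM nodes (nodeType 1 = element, 3 = text).
var children = [
  { nodeType: 1, nodeName: 'SPAN', text: 'Hi,' },
  { nodeType: 3, nodeName: '#text', data: '\n' },  // the line break between the two tags
  { nodeType: 1, nodeName: 'SPAN', text: 'how are you?' }
];

// Hypothetical helper: join the text roughly the way it would be displayed.
function collectText(nodes, keepWhitespaceNodes) {
  var out = '';
  for (var i = 0; i < nodes.length; i++) {
    var n = nodes[i];
    if (n.nodeType === 1) {
      out += n.text;
    } else if (keepWhitespaceNodes) {
      // A run of whitespace is displayed as a single space.
      out += n.data.replace(/\s+/g, ' ');
    }
  }
  return out;
}

var withWhitespaceNode = collectText(children, true);
var withoutWhitespaceNode = collectText(children, false);
```

With the whitespace node kept, the words stay separated; with it dropped, "Hi," and "how" run together.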
 
Daz

That's some very interesting stuff! However, I think I will just leave
the 'phantom nodes' be for now, and learn to work with them.

After reading the articles you suggested, the thought of attempting to
remove the phantom nodes terrified me somewhat. Perhaps it WILL work
effectively, but to me it looks like there is always something that
just 'might' go wrong, so you may not get the results you'd expect
(for example, you could accidentally concatenate two words by removing
the whitespace between them). Also, removing them takes time and
resources, and apparently IE handles the phantom nodes differently from
other browsers (no surprises there). So it would make sense to write a
small abstraction layer over properties such as firstChild, nextSibling,
childNodes etc. It would just check whether the child is whitespace
and, if it is, perhaps return the next child (even if that happens to
be whitespace, too).
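
A rough sketch of such a layer might look like the following. The function names are invented for this example, and plain objects stand in for DOM nodes so it runs outside a browser. It is a slight variation on the idea above: it keeps skipping until it reaches a node that is not whitespace-only, rather than stepping over just one node.

```javascript
// True for a text node (nodeType 3) that contains nothing but whitespace.
function isWhitespaceText(node) {
  return node.nodeType === 3 && !/\S/.test(node.data);
}

// Like parent.firstChild, but skips whitespace-only text nodes.
function firstSignificantChild(parent) {
  var node = parent.firstChild;
  while (node && isWhitespaceText(node)) {
    node = node.nextSibling;
  }
  return node || null;
}

// Like node.nextSibling, but skips whitespace-only text nodes.
function nextSignificantSibling(node) {
  node = node.nextSibling;
  while (node && isWhitespaceText(node)) {
    node = node.nextSibling;
  }
  return node || null;
}

// Mock tree: "\n  " <p> "\n" "tail"
var indent = { nodeType: 3, data: '\n  ' };
var para = { nodeType: 1, nodeName: 'P' };
var lineBreak = { nodeType: 3, data: '\n' };
var tail = { nodeType: 3, data: 'tail' };
var kids = [indent, para, lineBreak, tail];
for (var i = 0; i < kids.length; i++) {
  kids[i].nextSibling = kids[i + 1] || null;
}
var container = { nodeType: 1, nodeName: 'DIV', firstChild: kids[0] };

var firstFound = firstSignificantChild(container);  // the <p> mock
var nextFound = nextSignificantSibling(para);       // the "tail" text node
```

Nothing is removed from the tree, so the page formatting is untouched; the helpers only change what the script sees.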

For childNodes, we could have a method that returns an array of
elements and textNodes, without the phantom spaces. How will we know if
it's a phantom node? This I am not entirely sure of. From observation,
it looks like phantom nodes tend to follow element nodes (nodeType 1).
Although there is a chance that the whitespace we assume is a phantom
node actually is not, it doesn't matter. It's very unlikely that we
will need to do anything with the whitespace, even if it's not a
phantom node, so we can work with the text/element that follows. As we
aren't removing anything from the page, the formatting will remain the
same.
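
One way to sketch the childNodes variant is to judge a node by its content rather than by its position: a text node counts as a phantom if it holds only whitespace. The name `significantChildNodes` is hypothetical, and the mock parent below is a plain object whose `childNodes` is an ordinary array.

```javascript
// Return the children of `parent` as an array, leaving out
// text nodes (nodeType 3) that contain only whitespace.
function significantChildNodes(parent) {
  var result = [];
  var nodes = parent.childNodes;
  for (var i = 0; i < nodes.length; i++) {
    var node = nodes[i];
    var isBlankText = node.nodeType === 3 && !/\S/.test(node.data);
    if (!isBlankText) {
      result.push(node);
    }
  }
  return result;
}

// Mock parent: "\n" <li> " " "note" <li> "\n"
var mockList = {
  nodeType: 1,
  nodeName: 'UL',
  childNodes: [
    { nodeType: 3, data: '\n' },
    { nodeType: 1, nodeName: 'LI' },
    { nodeType: 3, data: ' ' },
    { nodeType: 3, data: 'note' },
    { nodeType: 1, nodeName: 'LI' },
    { nodeType: 3, data: '\n' }
  ]
};

var kept = significantChildNodes(mockList);  // two LI mocks plus the "note" text node
```

Because the loop only uses `length` and index access, the same function should also work on a real NodeList.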

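We can also use the normalize() method, which merges adjacent text
nodes into one (it only joins neighbouring text nodes; it does not
strip whitespace). If a script has left a span holding several
separate text nodes, shown here one per line:

<span>
This is
some
text
</span>
calling normalize() on the span will join them, so we end up with a
single text node:

<span>
This is some text
</span>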

I think this is a useful method to utilise, but I seem to remember
reading that IE doesn't support it, so I guess it's just a case of
going back to basics, and doing everything with XP scripting, and not
relying on methods that aren't supported by all browsers.
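
For reference, the DOM specification describes normalize() as merging adjacent text nodes and removing empty ones; it leaves whitespace inside a node alone. The sketch below imitates that on a plain array of mock nodes. `normalizeChildren` is a made-up name, and unlike the real method it handles only one level and returns a new array instead of changing the tree.

```javascript
// Imitate normalize() for one level of children:
// drop empty text nodes and merge neighbouring text nodes.
function normalizeChildren(children) {
  var result = [];
  for (var i = 0; i < children.length; i++) {
    var node = children[i];
    if (node.nodeType !== 3) {
      result.push(node);          // elements are passed through untouched
      continue;
    }
    if (node.data === '') {
      continue;                   // empty text nodes are removed
    }
    var last = result[result.length - 1];
    if (last && last.nodeType === 3) {
      last.data += node.data;     // merge into the preceding text node
    } else {
      result.push({ nodeType: 3, data: node.data });
    }
  }
  return result;
}

var before = [
  { nodeType: 3, data: 'This is ' },
  { nodeType: 3, data: 'some ' },
  { nodeType: 3, data: '' },
  { nodeType: 3, data: 'text' },
  { nodeType: 1, nodeName: 'B' },
  { nodeType: 3, data: '\n' }
];

var after = normalizeChildren(before);
```

Note that the trailing whitespace-only node survives, so normalizing does not make the phantom nodes go away.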

I am not saying my method is correct; however, I noticed that it wasn't
posted on any of the forums you suggested, and as you rightly said,
there may never be an agreement as to whose method is the best. Granted,
sometimes a certain method can be better than others, and other
times a different approach is more effective, but in my humble opinion,
I believe that removing the phantom nodes is a recipe for disaster! I
think we should all just work with them. :D

I think it's the programmer's responsibility to ensure that their HTML
comes with phantom nodes in all of the right places, or none at all. If
they are sending both the HTML and the JavaScript to the user, there is
no reason for the script not to be compatible with the markup, and vice
versa. Having no phantom nodes at all would be the preference, as it
would reduce page loading times (marginally, yes, but it adds up in
the long run on a busy server). I think my method (which probably isn't
unique) should cover both scenarios, and of course be helpful to people
who are coding JavaScript that works independently of the page (e.g. a
Firefox extension).

Thanks again VK.
 
Daz

Here are some ideas for method names. I would appreciate any input you
might have to offer, as I am contemplating making my first JavaScript
library, but it's pointless if I am going to be the only one to use it.
It might be overkill, and it's only a few rough ideas that could do
with some tweaking:

// All functions return null if there is nothing to return.

firstChildElement:
Returns the first child node of the specified parent node, that is an
element, if one exists.

firstChildTextNode:
Returns the first child node of the specified parent node that is a
textNode, but not a whitespace, if one exists.


lastChildElement:
Returns the last child node of the specified parent node that is an
element, if one exists.

lastChildTextNode:
Returns the last child node of the specified parent node that is a
textNode, but not a whitespace, if one exists.



nextSiblingElement:
Returns the next sibling node of the specified node that is an element,
if one exists.

nextSiblingTextNode:
Returns the next sibling node of the specified node that is a textNode,
but not a whitespace, if one exists.

previousSiblingElement:
Returns the previous sibling node of the specified node that is an
element, if one exists.

previousSiblingTextNode:
Returns the previous sibling node of the specified node that is a
textNode, but not a whitespace, if one exists.



childElements[]:
Returns an array of all of the child nodes of the specified parent
node, that are elements, if any exist.

childTextNodes[]:
Returns an array of all of the child nodes of the specified parent
node, that are textNodes, but not whitespaces, if any exist.
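
As a rough idea of how small the element half of that list could be, here is a sketch built on one shared walking helper. `walkToElement` and `makeParent` are invented for this example, and plain objects stand in for DOM nodes. Modern browsers also expose firstElementChild, lastElementChild, nextElementSibling, previousElementSibling and children, which cover the same ground natively.

```javascript
// Build a mock parent and link its children like DOM siblings.
function makeParent(children) {
  var parent = { nodeType: 1, nodeName: 'UL', childNodes: children };
  parent.firstChild = children[0] || null;
  parent.lastChild = children[children.length - 1] || null;
  for (var i = 0; i < children.length; i++) {
    children[i].previousSibling = children[i - 1] || null;
    children[i].nextSibling = children[i + 1] || null;
  }
  return parent;
}

// Starting at `node`, follow the `step` property until an element is found.
function walkToElement(node, step) {
  while (node && node.nodeType !== 1) {
    node = node[step];
  }
  return node || null;
}

function firstChildElement(parent) {
  return walkToElement(parent.firstChild, 'nextSibling');
}
function lastChildElement(parent) {
  return walkToElement(parent.lastChild, 'previousSibling');
}
function nextSiblingElement(node) {
  return walkToElement(node.nextSibling, 'nextSibling');
}
function previousSiblingElement(node) {
  return walkToElement(node.previousSibling, 'previousSibling');
}
function childElements(parent) {
  var result = [];
  for (var n = firstChildElement(parent); n; n = nextSiblingElement(n)) {
    result.push(n);
  }
  return result;
}

var firstItem = { nodeType: 1, nodeName: 'LI' };
var secondItem = { nodeType: 1, nodeName: 'LI' };
var list = makeParent([
  { nodeType: 3, data: '\n  ' }, firstItem,
  { nodeType: 3, data: '\n  ' }, secondItem,
  { nodeType: 3, data: '\n' }
]);
```

In this sketch childElements returns an empty array rather than null when there are no element children, which departs from the "return null" rule above but keeps loops over the result simple.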


Please let me know if such a library already exists; it might save me
reinventing the wheel. I think the functions I have named above would
be seriously easy to implement, and they should cover all of the
properties that might be affected by phantom nodes (i.e. those that
return a node of an unspecified type, unlike getElementsByTagName() or
getElementById(), which shouldn't be affected by phantom nodes).

All the best.

Daz.
 
Michael Winter

Daz wrote:

[snip]
Here are some ideas for method names. I would appreciate any input you
might have to offer, as I am contemplating making my first JavaScript
library, but it's pointless if I am going to be the only one to use it.

Not really. There's no harm in writing functions that can be reused in
the future, though writing a library might not be a good idea. It
depends on what you attribute to the idea of a library.

[snip]
firstChildElement:
Returns the first child node of the specified parent node, that is an
element, if one exists.

firstChildTextNode:
Returns the first child node of the specified parent node that is a
textNode, but not a whitespace, if one exists.

[snipped similar definitions]
Please let me know if such a library already exists, it might save me
reinventing the wheel.

I took a different approach to this sort of thing that's closer, in
principle, to the W3C DOM Traversal methods. That is, a few methods are
used to traverse the document tree given a filter with which to test the
nodes that are encountered. So, rather than a function that is dedicated
to returning the first child element, there's a function that is passed
a callback to test for elements, returning the first one for which the
callback returns true.

// Returns a test function that matches elements with the given tag name.
function createElementCallback(tagName) {
    return function (node) {
        return isElement(node, tagName);
    };
}
// First child of parent that passes test, or null.
function getFirstChild(parent, test) {
    var node = parent.firstChild;

    if (node) {
        return test(node) ? node : getNextSibling(node, test);
    }
    return null;
}
// Last child of parent that passes test, or null.
function getLastChild(parent, test) {
    var node = parent.lastChild;

    if (node) {
        return test(node)
            ? node : getPreviousSibling(node, test);
    }
    return null;
}
// Next sibling after node that passes test, or null.
function getNextSibling(node, test) {
    while ((node = node.nextSibling)) {
        if (test(node)) {
            return node;
        }
    }
    return null;
}
// Previous sibling before node that passes test, or null.
function getPreviousSibling(node, test) {
    while ((node = node.previousSibling)) {
        if (test(node)) {
            return node;
        }
    }
    return null;
}
// True for element nodes, optionally restricted to one tag name.
function isElement(node, tagName) {
    return (node.nodeType == 1)
        && (tagName ? (node.nodeName == tagName)
                    : (node.nodeName != '!'));
}
function isTextNode(node) {
    return node.nodeType == 3;
}

So,

firstChildElement(parent)

would be implemented as:

getFirstChild(parent, isElement)

To find a specific element type, use:

var isAnchor = createElementCallback('A');

getFirstChild(parent, isAnchor)


Your specific definition of

firstChildTextNode(parent)

could be implemented as:

getFirstChild(parent, function (node) {
    return isTextNode(node) && /\S/.test(node.data);
})

I prefer the flexibility that this approach offers.
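
To see the filter approach run end to end, here is a condensed, self-contained sketch on a mock tree of plain objects. It repeats shortened versions of the functions above (without the IE 5.x comment guard) and adds a hypothetical `isNonBlankText` test that uses /\S/ to require at least one non-whitespace character.

```javascript
function getNextSibling(node, test) {
  while ((node = node.nextSibling)) {
    if (test(node)) {
      return node;
    }
  }
  return null;
}
function getFirstChild(parent, test) {
  var node = parent.firstChild;
  if (!node) {
    return null;
  }
  return test(node) ? node : getNextSibling(node, test);
}
function isElement(node, tagName) {
  return node.nodeType == 1 && (!tagName || node.nodeName == tagName);
}
function isTextNode(node) {
  return node.nodeType == 3;
}
function createElementCallback(tagName) {
  return function (node) {
    return isElement(node, tagName);
  };
}
// Hypothetical test: a text node with at least one non-whitespace character.
function isNonBlankText(node) {
  return isTextNode(node) && /\S/.test(node.data);
}

// Mock tree: "\n  " <em> " see " <a>
var blank = { nodeType: 3, nodeName: '#text', data: '\n  ' };
var em = { nodeType: 1, nodeName: 'EM' };
var words = { nodeType: 3, nodeName: '#text', data: ' see ' };
var anchor = { nodeType: 1, nodeName: 'A' };
var kids = [blank, em, words, anchor];
for (var i = 0; i < kids.length; i++) {
  kids[i].nextSibling = kids[i + 1] || null;
}
var parent = { nodeType: 1, nodeName: 'P', firstChild: kids[0] };

var firstElement = getFirstChild(parent, isElement);                  // em
var firstAnchor = getFirstChild(parent, createElementCallback('A'));  // anchor
var firstText = getFirstChild(parent, isNonBlankText);                // words
```

The choice of regular expression matters here: /\W/ matches any non-word character, whitespace included, so it would accept the blank node as well, whereas /\S/ rejects it.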

[snip]

Mike
 
Daz

Hi Mike.

Thanks for your input, I really appreciate it. When my methods are
compared to yours, mine soon become 2D, as opposed to 3D as I thought
they were. Your methods are certainly more versatile, and beyond any
doubt give a LOT more flexibility.

Just one thing I don't understand about your code:
function isElement(node, tagName) {
    return (node.nodeType == 1)
        && (tagName ? (node.nodeName == tagName)
                    : (node.nodeName != '!'));
}

Please could you explain what:
(node.nodeName != '!')

does? I don't understand why a nodeName might be '!'.

All the best.

Daz.
 
Michael Winter

Daz said:
Michael Winter wrote:
[snip]
function isElement(node, tagName) {
    return (node.nodeType == 1)
        && (tagName ? (node.nodeName == tagName)
                    : (node.nodeName != '!'));
}

Please could you explain what:
(node.nodeName != '!')

does? I don't understand why a nodeName might be '!'.

An irritating error in IE 5.x is that the nodeType property of comment
nodes evaluates to a value of 1 (ELEMENT_NODE) not 8 (COMMENT_NODE).
However, one can distinguish between the two by testing whether the
nodeName or tagName properties evaluate to '!'.

Comments aren't a frequent occurrence in hand-written markup, but
content generation tools do include them, and IE 5.x users do still
exist. If you know that neither will ever be an issue, the isElement
function can be rewritten:

function isElement(node, tagName) {
    return (node.nodeType == 1)
        && (!tagName || (node.nodeName == tagName));
}
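
The difference between the two versions can be seen with a mock of the misreported comment node. The objects below are plain stand-ins built from the description above (nodeType 1, nodeName '!'), not output captured from IE 5.x, and the two function names are made up to tell the variants apart.

```javascript
// Version with the IE 5.x guard: rejects "elements" whose nodeName is '!'.
function isElementGuarded(node, tagName) {
  return (node.nodeType == 1)
    && (tagName ? (node.nodeName == tagName)
                : (node.nodeName != '!'));
}

// Simplified version without the guard.
function isElementSimple(node, tagName) {
  return (node.nodeType == 1)
    && (!tagName || (node.nodeName == tagName));
}

// Mock nodes (assumed shapes, for illustration only).
var div = { nodeType: 1, nodeName: 'DIV' };
var standardComment = { nodeType: 8, nodeName: '#comment' };
var ie5Comment = { nodeType: 1, nodeName: '!' };  // comment misreported as an element

var guardedResult = isElementGuarded(ie5Comment);  // false: the guard filters it out
var simpleResult = isElementSimple(ie5Comment);    // true: mistaken for an element
```

Both versions agree on ordinary elements and on correctly reported comments; they differ only for the misreported node.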

Mike
 
Daz

Good stuff!

Thanks again. :)

Daz.
 