XML Parsing Newbie Madness


Benoit

I've been instructing myself in XML DOM parsing using the w3schools
tutorial and decided to try an example of my own. I'd written a short
XML file that looked like this:

<?xml version="1.0" encoding="UTF-8"?>
<apartment>
  <tenant>
    <name>
      <first>John</first>
      <last>Smith</last>
    </name>
    <age>23</age>
    <occupation>Student</occupation>
  </tenant>
  <tenant>
    <name>
      <first>Alan</first>
      <last>Smithee</last>
    </name>
    <age>22</age>
    <occupation>Server</occupation>
  </tenant>
  <tenant>
    <name>
      <first>Jane</first>
      <last>Smith</last>
    </name>
    <age>34</age>
    <occupation>Manager</occupation>
  </tenant>
</apartment>

I wanted to parse the XML DOM with JavaScript and insert node values
into HTML elements at load by calling the following function:

function parseXML()
{
    xmlDoc = document.implementation.createDocument("","", null);
    xmlDoc.async = "false";
    xmlDoc.load("Apartment.xml");

    document.getElementById("fname").innerHTML =
        xmlDoc.getElementsByTagName("name")[0].childNodes[0].nodeValue;
}

When I tested the resulting output in Safari and Firefox, I
received two different error messages:

1) Safari 3.1's Inspector error console told me: value undefined Value
undefined (result of expression xmlDoc.load) is not object.
2) Firebug in Firefox told me: xmlDoc.getElementsByTagName("name")[0]
has no properties

I decided to ignore Safari and focus on the Firefox bug first. I
deleted the last statement of the function just to make sure the XML
file was being loaded. However, no matter where I parsed the tree for
a node value, I was told the node had no value. So I decided to
validate my XML using the XML Developer extension for Firefox and ran
the XML through its validator (it used some default schema when none
was provided) and I was told that:

cvc-elt.1: Cannot find the declaration of element 'apartment'.

Huh? But it's right there! Am I doing something wrong?
 
VK

Starting from the bottom: your XML data source seems fine. I copy-
pasted it as-is without problems, so it is well-formed. The validator
is only complaining that it has no schema declaring 'apartment', which
you can ignore here.

Now about the actual problem: non-IE XML data reading implementations
each have their own quirks.
First of all, the XML parser by default creates extra text nodes
(so-called "phantom nodes") for your pretty-print whitespace. So for
the fragment
<name>
  <first>John</first>
the first child of [name] is not [first], as you might expect, but a
text node holding the newline and spaces of your pretty-printing. You
either have to format the XML so that no whitespace sits between tags,
as I do, or use one of the numerous "tree walkers" to skip such nodes.
In the working sample I'm posting I did neither, to leave you free to
choose. I just used childNodes[1] instead of childNodes[0] to jump
over the phantom node.
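
As an illustration of the "tree walker" option mentioned above, here is a minimal sketch of a helper that returns only the element children of a node, so that indexes match the visible markup. The names isWhitespaceText and elementChildren are hypothetical (they are not part of any browser API); only nodeType, nodeValue and childNodes are standard DOM properties.

```javascript
// Node type constants as defined by the DOM specification.
var ELEMENT_NODE = 1;
var TEXT_NODE = 3;

// Hypothetical helper: true for a text node that holds only whitespace,
// i.e. one of the "phantom nodes" created by pretty-printed XML.
function isWhitespaceText(node) {
  return node.nodeType === TEXT_NODE && !/\S/.test(node.nodeValue);
}

// Hypothetical helper: collect the element children of `parent`,
// skipping text nodes, so elementChildren(name)[0] is <first>.
function elementChildren(parent) {
  var result = [];
  for (var i = 0; i < parent.childNodes.length; i++) {
    var child = parent.childNodes[i];
    if (child.nodeType === ELEMENT_NODE) {
      result.push(child);
    }
  }
  return result;
}

// Browser usage (assumes xmlDoc has already been loaded):
// var nameEl = xmlDoc.getElementsByTagName("name")[0];
// var firstEl = elementChildren(nameEl)[0];
```

With this, code no longer depends on how the XML file happens to be indented.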

nodeValue is null for element nodes; only text nodes (and a few other
types, such as attributes and comments) carry a value. So by asking
element [first] for its nodeValue you get null instead of "John". The
only reason for such a design I can think of is some hidden hate of
browser developers towards their users: they wanted every one of them
to feel like a fool at least once in their life. IMO it is the
developers who end up looking foolish here, while the end users just
pay for it with their nerves. Anyway, you have to move one level
deeper to finally get what you want.
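
To make the "one level deeper" step concrete, here is a small sketch that reads the value of the first text node inside an element instead of asking the element itself. The function name firstTextValue is an assumption made for this example; it relies only on the standard childNodes, nodeType and nodeValue properties.

```javascript
// DOM node type constant for text nodes.
var TEXT_NODE = 3;

// Hypothetical helper: return the nodeValue of the first text-node
// child of `element`, or null if the element has no text child.
function firstTextValue(element) {
  for (var i = 0; i < element.childNodes.length; i++) {
    var child = element.childNodes[i];
    if (child.nodeType === TEXT_NODE) {
      return child.nodeValue;
    }
  }
  return null;
}

// Browser usage (assumes xmlDoc has already been loaded):
// var firstEl = xmlDoc.getElementsByTagName("first")[0];
// firstEl.nodeValue is null, while firstTextValue(firstEl) reads the text node.
```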

In the sample below it is assumed that your XML data is saved as data.xml:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html lang="en-US">
<head>
<meta http-equiv="Content-type"
 content="text/html; charset=iso-8859-1">
<title>Demo</title>
<script type="text/javascript">
function init() {
  // Document.load() is a Gecko extension; Safari 3.1 does not provide it.
  var xmlDoc = document.implementation.createDocument('', '', null);
  xmlDoc.async = false; // load synchronously
  xmlDoc.load('data.xml');
  // childNodes[0] is the whitespace text node, childNodes[1] is <first>;
  // its firstChild is the text node holding "John".
  document.getElementById('output').innerHTML =
    xmlDoc.getElementsByTagName('name')[0].childNodes[1].firstChild.nodeValue;
}
function releaseContextAndInit() {
  // Defer init() briefly so that it runs outside the load handler.
  window.setTimeout(init, 10);
}
window.onload = releaseContextAndInit;
</script>
</head>
<body>
<p id="output">Default content</p>
</body>
</html>
 
pr

Benoit said:
I've been instructing myself in XML DOM parsing using the w3schools
tutorial and decided to try an example of my own. I'd written a short
XML file that looked like this:
[...]

It checks out.
I wanted to parse the xml dom with javascript and insert node values
into html elements at load calling the following function:

function parseXML()
{
xmlDoc = document.implementation.createDocument("","", null);

Don't forget *var*.
xmlDoc.async = "false";

I don't believe Firefox lets you do this, so you need an event listener
to intercept a successful load(), or to use an XMLHttpRequest.
xmlDoc.load("Apartment.xml");

document.getElementById("fname").innerHTML =
xmlDoc.getElementsByTagName("name")[0].childNodes[0].nodeValue;

That isn't quite right. nodeValue only applies to text nodes. Firefox
2.x (at least) supports textContent on element nodes, however.

Putting it all together, you can replace those two lines with:

    xmlDoc.addEventListener("load", loaded, false);

    function loaded(e) {
        document.getElementById("fname").innerHTML =
            xmlDoc.getElementsByTagName("name")[0].textContent;
    }

    xmlDoc.load("Apartment.xml");
}

[...]

cvc-elt.1: Cannot find the declaration of element 'apartment'.

Huh? But it's right there! Am I doing something wrong?

You don't want to get into serious XML validation at this point, believe
me :)
 
pr

pr said:
I don't believe Firefox lets you do this, so you need an event listener
to intercept a successful load(), or to use an XMLHttpRequest.

Argh, you confused me for a minute there. Of course async doesn't work
if it's a string value. What you wanted was:

xmlDoc.async = false;
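
The reason the string version fails can be shown with plain JavaScript boolean coercion: any non-empty string, including "false", is truthy. The sketch below uses a hypothetical describeAsync function to stand in for whatever code reads the flag; how a particular browser stores the async property internally is not something this example claims to reproduce.

```javascript
// Hypothetical stand-in for code that inspects an "async" flag.
// It relies only on standard JavaScript truthiness rules.
function describeAsync(value) {
  return value ? "asynchronous" : "synchronous";
}

// The string "false" is a non-empty string, so it counts as true.
var fromString = describeAsync("false");  // "asynchronous"

// The boolean false is what was actually intended.
var fromBoolean = describeAsync(false);   // "synchronous"
```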
 
VK

xmlDoc.getElementsByTagName("name")[0].textContent;

Right, I forgot that DOM Level 3 finally introduced .textContent as an
equivalent to IE's .text, and that it is finally mostly supported. So
the cross-browser code might look like:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN">
<html lang="en-US">
<head>
<meta http-equiv="Content-type"
 content="text/html; charset=iso-8859-1">
<title>Demo</title>
<script type="text/javascript">
function init() {
  var xmlDoc;
  var output = document.getElementById('output');
  if ('ActiveXObject' in window) {
    // IE: use MSXML and its proprietary .text property.
    xmlDoc = new ActiveXObject('Msxml2.DOMDocument.6.0');
    xmlDoc.async = false;
    xmlDoc.load('data.xml');
    output.innerText = xmlDoc.getElementsByTagName('name')[0].text;
  }
  else {
    // Gecko: Document.load() plus DOM Level 3 textContent.
    xmlDoc = document.implementation.createDocument('', '', null);
    xmlDoc.async = false;
    xmlDoc.load('data.xml');
    output.innerHTML = xmlDoc.getElementsByTagName('name')[0].textContent;
  }
}

function releaseContextAndInit() {
  // Defer init() briefly so that it runs outside the load handler.
  window.setTimeout(init, 10);
}

window.onload = releaseContextAndInit;
</script>
</head>
<body>
<p id="output">Default content</p>
</body>
</html>

Safari still doesn't work with the new tools, so there the old
roundabout way (reaching over your head to scratch your ear) has to
be used.

But who really cares about Safari?
 
pr

VK said:
xmlDoc.getElementsByTagName("name")[0].textContent;

Right, I forgot that DOM 3 finally introduced .textContent as
equivalent to IE's .text and it is finally mostly supported. So the
cross-browser code might be like:
[...]

For that simple example maybe. For serious purposes you would IMO use
XMLHttpRequest asynchronously, supply an alternative to textContent/text
and avoid innerHTML. Safari 3.1 wouldn't have any difficulty with that.
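
A rough sketch of what that advice could look like follows. The helper names getText, setText and loadXml are hypothetical, as is the optional XhrCtor parameter (added only so the loader can be exercised without a browser). The status check assumes the file is served over HTTP, and the "fname" id is taken from the original post.

```javascript
// Hypothetical helper: read an element's text with fallbacks.
function getText(node) {
  if (typeof node.textContent === "string") return node.textContent; // DOM Level 3
  if (typeof node.text === "string") return node.text;               // MSXML
  // Fallback: concatenate the values of descendant text/CDATA nodes.
  var out = "";
  for (var i = 0; i < node.childNodes.length; i++) {
    var child = node.childNodes[i];
    if (child.nodeType === 3 || child.nodeType === 4) {
      out += child.nodeValue;
    } else if (child.nodeType === 1) {
      out += getText(child);
    }
  }
  return out;
}

// Hypothetical helper: replace the content of `target` with plain text
// by building a text node, instead of assigning to innerHTML.
function setText(target, text, doc) {
  while (target.firstChild) {
    target.removeChild(target.firstChild);
  }
  target.appendChild(doc.createTextNode(text));
}

// Hypothetical helper: fetch an XML document asynchronously and pass
// the parsed document to `onLoad`. XhrCtor defaults to XMLHttpRequest.
function loadXml(url, onLoad, XhrCtor) {
  var xhr = new (XhrCtor || XMLHttpRequest)();
  xhr.onreadystatechange = function () {
    // readyState 4 means the request is complete; 200 assumes HTTP.
    if (xhr.readyState === 4 && xhr.status === 200 && xhr.responseXML) {
      onLoad(xhr.responseXML);
    }
  };
  xhr.open("GET", url, true); // true = asynchronous
  xhr.send(null);
}

// Browser usage (assumes an element with id "fname" exists):
// loadXml("Apartment.xml", function (xmlDoc) {
//   var firstEl = xmlDoc.getElementsByTagName("first")[0];
//   setText(document.getElementById("fname"), getText(firstEl), document);
// });
```

Because nothing here depends on Document.load(), the same code path can serve Firefox and Safari; old IE versions without a native XMLHttpRequest would still need their ActiveX equivalent.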
 
Benoit

Seeing as the W3Schools tutorial is misleading, could you suggest a
text on XML technologies that's modern and browser-agnostic, if not
browser-sensitive?

pr

Benoit said:
Seeing as the W3Schools tutorial is misleading, could you suggest a
text on XML technologies that's modern and browser-agnostic, if not
browser-sensitive?

Here are a few links:

http://developer.mozilla.org/en/docs/AJAX:Getting_Started
http://developer.mozilla.org/en/docs/XMLHttpRequest
http://developer.mozilla.org/en/docs/Gecko_DOM_Reference

http://jibbering.com/2002/4/httprequest.html

On the subject of DOM Level 2 (for navigating and updating HTML
and XML content), there's nothing better than:

http://www.w3.org/TR/DOM-Level-2-Core

(once you get over the 'officialese').

You could also look at the Apple site, for example:

http://developer.apple.com/internet/webcontent/dom2i.html

Microsoft has some good documentation on its MSXML SDK:

http://msdn2.microsoft.com/en-us/library/ms760399.aspx

but you have to bear in mind that IE doesn't support XHTML and therefore
interaction between HTML and XML documents in IE is carried out by
rather primitive means (innerHTML instead of importNode(), for example).

Hope that helps.
 
