Dumb question about getElementById and element.className

sp

I create an xml file in a text editor:
<?xml version="1.0" encoding="utf-8"?>
<elts>
<elt id="1" class="c1">content1</elt>
<elt id="2" class="c1">content2</elt>
</elts>

Then I load the file via xmlHttpRequest into xmlData variable.
Why does xmlData.getElementById('1') return null and
xmlData.firstChild.childNodes[0].className returns null also?

(I use Firefox 1.5).
 
TheBagbournes

sp said:
I create an xml file in a text editor:
<?xml version="1.0" encoding="utf-8"?>
<elts>
<elt id="1" class="c1">content1</elt>
<elt id="2" class="c1">content2</elt>
</elts>

Then I load the file via xmlHttpRequest into xmlData variable.
Why does xmlData.getElementById('1') return null and
xmlData.firstChild.childNodes[0].className returns null also?

Post the code.
 
RobG

sp said:
I create an xml file in a text editor:
<?xml version="1.0" encoding="utf-8"?>
<elts>
<elt id="1" class="c1">content1</elt>

In HTML, an ID can't start with a number. That is unlikely to cause you
any actual grief here unless you are trying to turn this into HTML and
browsers become stricter (most tolerate IDs that start with a digit,
even though that is invalid HTML).

<elt id="2" class="c1">content2</elt>
</elts>

Then I load the file via xmlHttpRequest into xmlData variable.
Why does xmlData.getElementById('1') return null and

It could be that the browser is strictly enforcing the above validation,
but I doubt it - post the code.

xmlData.firstChild.childNodes[0].className returns null also?

Gecko browsers add text nodes to keep whitespace in the source.
xmlData.firstChild.childNodes[0] possibly references a text node that
doesn't have a className attribute.

I think that xmlData.firstChild will probably reference the text node
placed after <elts> and that childNodes[0] doesn't exist since a text
node can't have any children. But then I'd expect an error such as
'object expected', or a result of undefined, not null.

(I use Firefox 1.5).

Use the DOM Inspector to see what it makes of your XML. Add some alerts
to see what you've got:

- alert( typeof xmlData );
- alert( typeof xmlData.firstChild );
- alert( typeof xmlData.firstChild.childNodes );
- alert( typeof xmlData.firstChild.childNodes.length );


and so on.
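
A minimal sketch of the whitespace point above: if line breaks between elements show up as text nodes, then index 0 of childNodes is not necessarily an element, and it is safer to look for the first child whose nodeType marks it as an element. The helper name firstElementOf is made up for illustration.

```javascript
// Returns the first child of `parent` that is an element, skipping
// whitespace text nodes, comments and the like; null if there is none.
function firstElementOf(parent) {
  var children = parent.childNodes;
  for (var i = 0; i < children.length; i++) {
    // 1 is Node.ELEMENT_NODE in DOM Core (3 would be Node.TEXT_NODE).
    if (children[i].nodeType === 1) {
      return children[i];
    }
  }
  return null;
}
```

With the OP's variable this would be called as firstElementOf(xmlData.documentElement), assuming xmlData is the parsed XML document.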
 
Michael Winter

sp wrote:
[snip]
<?xml version="1.0" encoding="utf-8"?>
<elts>
<elt id="1" class="c1">content1</elt>

In HTML, an ID can't start with a number

In general, that is true of any application of SGML, and it is also true
of XML and its applications.

In XML, this is a violation of the ID validity constraint. A conforming,
non-validating parser may continue to process the rest of the document,
but it should be corrected, nevertheless.
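
To make the ID constraint concrete, here is a rough check, written as a sketch only. It is a deliberately simplified, ASCII-only approximation of the XML Name production (the real production also admits ':' and many non-ASCII characters), and the function name looksLikeXmlId is hypothetical.

```javascript
// Simplified, ASCII-only approximation of an XML Name: it must start
// with a letter or underscore; digits, '.', '-' may only follow.
function looksLikeXmlId(value) {
  return /^[A-Za-z_][A-Za-z0-9._-]*$/.test(value);
}
```

Under this approximation the OP's values "1" and "2" are rejected, while values such as "a" or "elt-1" pass.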


The following is based on how I would /expect/ the Gecko engine to
behave. I say this because I've not used its XML DOM, so corrections are
not only welcome, but may be necessary.

[snip]

In order for the getElementById method to find an element, it must know
what attributes are identifiers. It doesn't do this by name, but by
type: ID. As you haven't provided a document type, it has no way of
knowing what attributes are of what type. That said, the Gecko engine
doesn't implement a validating parser, therefore it wouldn't actually
fetch the DTD, even if you included a declaration.

There are two options.

1) Define an internal subset for your document.
2) Avoid the getElementById method and walk the document tree.

The former would produce a document that may look something like:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE elts [
<!ELEMENT elts (elt)* >

<!ELEMENT elt (#PCDATA) >
<!ATTLIST elt
id ID #IMPLIED
class NMTOKENS #IMPLIED]>
<elts>
<elt id="a" class="c1">content1</elt>
<elt id="b" class="c1">content2</elt>
</elts>
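
Option 2 above (walking the document tree) could be sketched roughly as follows. This is only an illustration: findByAttribute is a made-up helper, and it compares the plain attribute value, so it does not depend on the attribute being declared as type ID.

```javascript
// Depth-first search for the first element whose attribute `name`
// has exactly the value `value`; returns null if nothing matches.
function findByAttribute(node, name, value) {
  // 1 is Node.ELEMENT_NODE; only elements have getAttribute.
  if (node.nodeType === 1 && node.getAttribute(name) === value) {
    return node;
  }
  var children = node.childNodes || [];
  for (var i = 0; i < children.length; i++) {
    var found = findByAttribute(children[i], name, value);
    if (found) {
      return found;
    }
  }
  return null;
}
```

For the OP's original file this would be called as findByAttribute(xmlData, "id", "1"), assuming xmlData is the parsed document.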

[snip]
xmlData.firstChild.childNodes[0].className returns null also?

Gecko browsers add text nodes to keep whitespace in the source.

Even if Rob wasn't right about white space nodes, the className property
will be undefined for elements, anyway. The className property, like all
other shortcut attribute properties, is defined for the HTML DOM. As an
XML document should only expose the Core module, you would need to use
methods such as getAttribute to retrieve the attribute value.
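
As a small sketch of that last point, the class value can be read through the Core method getAttribute rather than the HTML-only className shortcut. The helper name classOf is hypothetical. It treats both null and the empty string as "no class", since implementations differ in what getAttribute returns for a missing attribute (DOM Level 2 Core specifies the empty string, while some engines return null).

```javascript
// Reads the `class` attribute through DOM Core instead of the
// HTML DOM `className` shortcut; returns null when it is absent.
function classOf(element) {
  var value = element.getAttribute("class");
  return value === null || value === "" ? null : value;
}
```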

[snip]

Mike
 
Thomas 'PointedEars' Lahn

Michael said:
In general, that is true of any application of SGML, and it is also true
of XML and its applications.

Full ACK.

In XML, this is a violation of the ID validity constraint. A conforming,
non-validating parser may continue to process the rest of the document,

Even a validating XML processor may continue, and then it MUST make a
list of errors available that were encountered while parsing.

The difference between a validating and a non-validating XML processor
is laid out in <URL:http://www.w3.org/TR/REC-xml/>, chapter 5. It is
not specified there that a validating XML processor MUST stop parsing
when it encounters a violation of XML well-formedness.

but it should be corrected, nevertheless.

Indeed.

The following is based on how I would /expect/ the Gecko engine to
behave. I say this because I've not used its XML DOM, so corrections
are not only welcome, but may be necessary.

Gladly.

[snip]
Why does xmlData.getElementById('1') return null and

In order for the getElementById method to find an element, it must know
what attributes are identifiers. It doesn't do this by name, but by
type: ID. As you haven't provided a document type, it has no way of
knowing what attributes are of what type.

ACK

That said, the Gecko engine doesn't implement a validating parser,
therefore it wouldn't actually fetch the DTD, even if you included
a declaration.

Quite the contrary. Gecko's _XML_ parser is a validating one, and it stops
parsing when it encounters an error (although it is not required to, see
above). The DOCTYPE declaration does matter there for XML document types
in general. But apparently for certain DOCTYPE declarations, such as for
XHTML 1.x, a local catalog file is used, and the DTD resource is not
fetched from the location given by the system identifier of the
declaration, while it is for other XML document types. AFAIS, the internal
subset declaration is evaluated always for XML document types (when served
using the appropriate media type).

There are two options.

1) Define an internal subset for your document.
2) Avoid the getElementById method and walk the document tree.

3) Use an XPath expression, such as //elt[@id="1"]
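
A sketch of option 3, assuming the document object implements DOM Level 3 XPath (the evaluate method), as Gecko's XML documents do. The wrapper name findByXPath is made up for illustration.

```javascript
// Evaluates `expression` against `doc` and returns the first matching
// node in document order, or null if there is no match.
function findByXPath(doc, expression) {
  // 9 is XPathResult.FIRST_ORDERED_NODE_TYPE; no namespace resolver
  // and no result object to reuse are passed (both null).
  var result = doc.evaluate(expression, doc, null, 9, null);
  return result.singleNodeValue;
}
```

With the OP's variable this would be findByXPath(xmlData, '//elt[@id="1"]'). Because the expression tests the attribute value directly, it works whether or not the attribute is declared as type ID.
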
The former would produce a document that may look something like:

<?xml version="1.0" encoding="utf-8" ?>
<!DOCTYPE elts [
<!ELEMENT elts (elt)* >

<!ELEMENT elt (#PCDATA) >
<!ATTLIST elt
id ID #IMPLIED
class NMTOKENS #IMPLIED
[...]

There is no necessity for the restrictions the NMTOKENS type imposes here;
CDATA suffices. This is not (X)HTML, so the value of the `class' attribute
does not need to be a space-separated list of CSS classes.

xmlData.firstChild.childNodes[0].className returns null also?

Gecko browsers add text nodes to keep whitespace in the source.

Even if Rob wasn't right about white space nodes, the className property
will be undefined for elements, anyway. The className property, like all
other shortcut attribute properties, is defined for the HTML DOM.

It is defined for document types that W3C DOM Level 2 HTML applies to.
Those are HTML 4.01 document types, and XHTML 1.0 document types.

As an XML document should only expose the Core module, you would need to
use methods such as getAttribute to retrieve the attribute value.

Not only the Core module; consider DOM Level 3 Load and Save, for example.
What is true is that a user agent should/must not expose interfaces of W3C DOM
Level 2 HTML for XML document types, unless it is an XHTML 1.0 document
type or these interfaces are specified elsewhere.

BTW: While doing research for my response, I have found
<URL:http://www.w3.org/2003/02/06-dom-support.html>.


PointedEars
 
Michael Winter

Michael Winter wrote:
[snip]
In XML, this is a violation of the ID validity constraint. A
conforming, non-validating parser may continue to process the rest
of the document,

Even a validating XML processor may continue, and then it MUST make a
list of errors available that were encountered while parsing.

But it may only continue to parse in that error reporting capacity. I
was trying to draw the distinction that a non-validating parser may
continue to parse /normally/ (except when encountering a violated
well-formedness constraint).

[snip]
That said, the Gecko engine doesn't implement a validating parser,
therefore it wouldn't actually fetch the DTD, even if you included
a declaration.

Quite the contrary. Gecko's _XML_ parser is a validating one [...]

Then it seems to act in a way that I wouldn't expect: the MDC states
that "Mozilla does not load external entities from the web"[1]. So, it
would seem that it's conditionally validating.

In any case, it is unlikely that the OP will be able to use an external
subset to define his attribute types.

AFAIS, the internal subset declaration is evaluated always for XML
document types (when served using the appropriate media type).

A conforming parser /must/ process the internal subset[2], and the MDC
suggests that it does[1].

There are two options.
[snip]

3) Use an XPath expression, such as //elt[@id="1"]

I should have added, "as I see it." :)

[snip]
There is no necessity for the restrictions the NMTOKENS type imposes
here; CDATA suffices.

True, but NMTOKEN is a fairly close match for the IDENT lexer token in
the CSS grammar. The CDATA type would be necessary to allow the complete
range of possible values, but I think it would be excessive.

In any case, it was only a suggestion.

[snip]
As an XML document should only expose the Core module, you would
need to use methods such as getAttribute to retrieve the attribute
value.

Not only the Core module; [...]

Indeed. I really didn't mean to make that statement as absolute as it is.

[snip]
BTW: While doing research for my response, I have found
<URL:http://www.w3.org/2003/02/06-dom-support.html>.

I am aware of that document, though it's been a while since I've seen it.

Mike


[1] MDC: DTDs and Other External Entities

<http://developer.mozilla.org/en/docs/XML_in_Mozilla#DTDs_and_Other_External_Entities>
[2] XML 1.0: Non-validating parsers and internal subsets
<http://www.w3.org/TR/REC-xml/#dt-use-mdecl>
 
Thomas 'PointedEars' Lahn

Michael said:
But it may only continue to parse in that error reporting capacity.

Parse error :)

I was trying to draw the distinction that a non-validating parser may
continue to parse /normally/

A validating XML processor may do the same.

(except when encountering a violated well-formedness constraint).

That is a possibility, not a necessity, which is the very idea of a
non-validating XML processor.

That said, the Gecko engine doesn't implement a validating parser,
therefore it wouldn't actually fetch the DTD, even if you included
a declaration.
Quite the contrary. Gecko's _XML_ parser is a validating one [...]

Then it seems to act in a way that I wouldn't expect: the MDC states
that "Mozilla does not load external entities from the web"[1]. So, it
would seem that it's conditionally validating.

Occam's Razor: MDC, which is a Wiki, is simply wrong about this.

In any case, it is unlikely that the OP will be able to use an external
subset to define his attribute types.
AFAIS, the internal subset declaration is evaluated always for XML
document types (when served using the appropriate media type).

A conforming parser /must/ process the internal subset[2],

| Non-validating processors are REQUIRED to check only the document entity,
| including the entire internal DTD subset, for well-formedness.
| [Definition: While they are not required to check the document for
| validity, they are REQUIRED to process all the declarations they read in
| the internal DTD subset and in any parameter entity that they read, up to
| the first reference to a parameter entity that they do not read; that is
| to say, they MUST use the information in those declarations to normalize
| attribute values, include the replacement text of internal entities, and
| supply default attribute values.]

and the MDC suggests that it does[1].

I do not need MDC to tell me that. It can be easily tested. In fact, I
have said that Gecko's XML parser parses the internal subset, have I not?

True, but NMTOKEN is a fairly close match for the IDENT lexer token in
the CSS grammar. The CDATA type would be necessary to allow the complete
range of possible values, but I think it would be excessive.

The next sentence in my posting, which you snipped, should clarify this.

Your presumption is that the value of a `class' attribute in a user-defined
XML document type must be a reference to one or more CSS classes. It need
not be, hence the NMTOKENS restriction is unnecessary here, and misleading.

In any case, it was only a suggestion.

OK


PointedEars
 
Michael Winter

Michael Winter wrote:

[On validating processors encountering invalid markup]
But it may only continue to parse in that error reporting capacity.
[snip]
I was trying to draw the distinction that a non-validating parser
may continue to parse /normally/

A validating XML processor may do the same.

Sorry. I thought that validating processors were meant to consider
validity errors as fatal. They are not.

That is a possibility, not a necessity, which is the very idea of a
non-validating XML processor.

Violations of well-formedness constraints are fatal errors, irrespective
of the validating nature of the processor.

well-formedness constraint
[Definition: A rule which applies to all well-formed XML
documents. Violations of well-formedness constraints are
fatal errors.]

fatal error
[Definition: An error which a conforming XML processor must
detect and report to the application. After encountering a
fatal error, the processor may continue processing the data
to search for further errors and may report such errors to
the application. In order to support correction of errors,
the processor may make unprocessed data from the document
(with intermingled character data and markup) available to
the application. Once a fatal error is detected, however, the
processor must not continue normal processing (i.e., it must
not continue to pass character data and information about the
document's logical structure to the application in the normal
way).]

[snip]
Then it seems to act in a way that I wouldn't expect: the MDC
states that "Mozilla does not load external entities from the
web"[1]. So, it would seem that it's conditionally validating.

Occam's Razor: MDC, which is a Wiki, is simply wrong about this.

Perhaps, but I don't think it is. In a quick test, I don't see a request
for a DTD made to my test server. With the same document, the W3C
validator (in its XML-like mode) does make such a request.

[snip]
[snip]

In fact, I have said that Gecko's XML parser parses the internal
subset, have I not?

As shown above, you started your statement with "as far as I see", which
usually denotes some reservations about what's to follow. I merely
stated that I agree, and why.

[snip]
Your presumption is that the value of a `class' attribute in a
user-defined XML document type must be a reference to one or more CSS
classes.

Yes, it is, but I think that it's a fairly safe presumption to make. The
OP can always correct me.

[snip]

Mike
 
Thomas 'PointedEars' Lahn

Michael said:
Michael Winter wrote:
[On validating processors encountering invalid markup]
That is a possibility, not a necessity, which is the very idea of a
non-validating XML processor.

Violations of well-formedness constraints are fatal errors, [...]

ACK

[...]
the application. Once a fatal error is detected, however, the
processor must not continue normal processing (i.e., it must
not continue to pass character data and information about the
document's logical structure to the application in the normal
way).]

But what is "normal processing", and what is "the normal way"?

Then it seems to act in a way that I wouldn't expect: the MDC
states that "Mozilla does not load external entities from the
web"[1]. So, it would seem that it's conditionally validating.
Occam's Razor: MDC, which is a Wiki, is simply wrong about this.

Perhaps, but I don't think it is. In a quick test, I don't see
a request for a DTD made to my test server.

My bad. Indeed it does not. That explains why I have to "redeclare"
entities of the used DTD in the internal subset of the DOCTYPE declaration
in order to have Gecko-based UAs display the tree structure of my XML
document if I use references to those entities in it. It would appear
that applying Occam's Razor should have resulted in: I am wrong about
this.

With the same document, the W3C validator (in its XML-like mode)
does make such a request.

ACK

AFAIS, the internal subset declaration is evaluated always for
XML document types (when served using the appropriate media
type).
[snip]

In fact, I have said that Gecko's XML parser parses the internal
subset, have I not?

As shown above, you started your statement with "as far as I see", which
usually denotes some reservations about what's to follow. I merely
stated that I agree, and why.

ACK, sorry.

Yes, it is, but I think that it's a fairly safe presumption to make.

Because? XML is originally but a data format.

The OP can always correct me.

I do not think it is helpful to be more restrictive than first indicated.
So far we are only talking about an XML data source, not something that
is used for rendering. It is better that the OP may wonder about "CDATA",
then gets informed about DTDs, and then finds CDATA not restrictive enough,
than him/her only realizing that the example you posted does not work for
his _arbitrary_ data because it does not validate with your internal
subset.


Regards,
PointedEars
 
Michael Winter

Michael Winter wrote:
[snip]
the application. Once a fatal error is detected, however,
the processor must not continue normal processing (i.e.,
it must not continue to pass character data and
information about the document's logical structure to the
application in the normal way).]

But what is "normal processing",

Parsing the document in order to construct a document tree, presumably.

and what is "the normal way"?

I should think that is implementation-defined. However, it seems that
they're trying to say that any further data about the document should be
passed in such a way that marks it as erroneous.

[snip]

The OP is clearly borrowing concepts from HTML, therefore it's not
unreasonable to believe that they will be used in the same way.

The OP can always correct me.

[...] It is better that the OP may wonder about "CDATA", then gets
informed about DTDs, and then finds CDATA not restrictive enough,
than him/her only realizing that the example you posted does not work
for his _arbitrary_ data because it does not validate with your
internal subset.

True. However, I would hope that someone would seek to understand XML[1]
thoroughly before committing to using it to transfer or store
information, in which case the OP would be in a position to evaluate my
earlier suggestion properly.

Mike


[1] The topic in the OP doesn't necessarily mean that the poster
is unfamiliar with the specifics of XML, just the DOM.
 
Thomas 'PointedEars' Lahn

Michael said:

The OP is clearly borrowing concepts from HTML, [...]

I don't think so. All that is certain is that he knows W3C DOM Level 2 HTML
well enough to know that an element object has a `className' shortcut
property holding the value of the `class' attribute of the corresponding
element.

The OP can always correct me.
[...] It is better that the OP may wonder about "CDATA", then gets
informed about DTDs, and then finds CDATA not restrictive enough,
than him/her only realizing that the example you posted does not work
for his _arbitrary_ data because it does not validate with your
internal subset.

True. However, I would hope that someone would seek to understand XML[1]
thoroughly before committing to using it to transfer or store
information, in which case the OP would be in a position to evaluate my
earlier suggestion properly.
[...]
[1] The topic in the OP doesn't necessarily mean that the poster
is unfamiliar with the specifics of XML, just the DOM.

Fair enough.


PointedEars
 
