MSXML 4.0 loadXML and namespaces - what's wrong with this picture

Steve Jorgensen

I was having a bear of a time today trying to figure out some inconsistent
behavior selecting nodes from an MSXML DOM document, so I distilled the issue
down to a trivial test demonstrating the confusing behavior.

The deal is that I want to create most of my XML nodes from code using
createNode, but the code is much clearer if I can start by building the
basic skeleton of the document from a simple XML string. The catch is, I'll
need to use namespaces.

== First - my test code output ==

name=root namespaceURI=uri:foo
name=child1 namespaceURI=uri:foo
name=value namespaceURI=


== Now - the code that makes that output ==

Public Sub TestDomFromString()
    Dim objDoc As New MSXML2.DOMDocument40

    ' Bind the prefix "foo" to uri:foo for use in XPath expressions.
    objDoc.setProperty "SelectionNamespaces", _
        "xmlns:foo='uri:foo'"

    objDoc.setProperty "SelectionLanguage", "XPath"

    ' Build the skeleton; <root> declares uri:foo as the default namespace.
    objDoc.loadXML _
        "<root xmlns='uri:foo'>" & _
        "<child1 value='123'/>" & _
        "</root>"

    PrintNodeProfile objDoc.selectSingleNode("/*")         ' root element
    PrintNodeProfile objDoc.selectSingleNode("/*/*[1]")     ' first child element
    PrintNodeProfile objDoc.selectSingleNode("/*/*[1]/@*")  ' its attribute

End Sub

' Print a node's name and the namespace URI the DOM reports for it.
Public Sub PrintNodeProfile(Node As MSXML2.IXMLDOMNode)
    Debug.Print "name=" & Node.nodeName & " " & _
        "namespaceURI=" & Node.namespaceURI
End Sub

== End of code ==

It looks like MSXML automatically treats child elements as being in the parent
namespace, as expected, but the unqualified attribute of an element in a
namespace has no namespace! Needless to say, that was leading to some
confusion earlier today while debugging my selectNodes results.

Now - the question is, is this an MSXML bug or quirk, or is there something I
just don't get yet about how a DOM loadXML call is supposed to behave? If
this is a problem with MSXML, how do I work around it? If I'm doing it wrong,
how should I do it right?

I have another issue that builds off that one. I notice that when I do a
selectNodes or selectSingleNode, if I'm trying to select an attribute that
does have a namespace, I have to prefix the attribute name, but I don't have
to do that in XSLT, nor when building key and keyref expressions in W3C XML
Schemas. Why is the MSXML selectNodes behavior different from these cases?

Thanks,

- Steve J.
 
Philippe Poulard

Steve said:
It looks like MSXML automatically treats child elements as being in the parent
namespace, as expected, but the unqualified attribute of an element in a
namespace has no namespace! Needless to say, that was leading to some
confusion earlier today while debugging my selectNodes results.

The default namespace doesn't apply to attributes. Unprefixed attributes
are not in a namespace; giving them one would be redundant, because they
are considered to "belong" to their host element. Attributes that do have
a namespace do not belong to the element in that sense: they are "foreign
attributes" and must be prefixed.
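
The same data model can be observed outside MSXML. The following is only a sketch using Python's standard xml.etree.ElementTree, which writes namespaced names as {uri}local. The bar prefix, the uri:bar namespace and the flag attribute are invented for illustration; they are not part of Steve's document.

```python
import xml.etree.ElementTree as ET

# Steve's document plus one hypothetical "foreign" attribute, bar:flag.
XML = (
    "<root xmlns='uri:foo' xmlns:bar='uri:bar'>"
    "<child1 value='123' bar:flag='yes'/>"
    "</root>"
)

root = ET.fromstring(XML)
child = root[0]

# Elements pick up the default namespace...
print(root.tag)    # {uri:foo}root
print(child.tag)   # {uri:foo}child1

# ...but the unprefixed attribute does not; only the prefixed one is qualified.
attribute_names = sorted(child.attrib)
print(attribute_names)  # ['value', '{uri:bar}flag']

# Looking the attribute up as if it were in the element's namespace finds nothing.
print(child.get("{uri:foo}value"))  # None
```

The unprefixed value attribute is stored under its bare name, which matches the empty namespaceURI that MSXML reported.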

Now - the question is, is this an MSXML bug or quirk, or is there something I
just don't get yet about how a DOM loadXML call is supposed to behave? If
this is a problem with MSXML, how do I work around it? If I'm doing it wrong,
how should I do it right?

I have another issue that builds off that one. I notice that when I do a
selectNodes or selectSingleNode, if I'm trying to select an attribute that
does have a namespace, I have to prefix the attribute name, but I don't have
to do that in XSLT

The behaviour is the same: the XML data model is the same whatever tool
you use, whether an XSLT stylesheet or an XPath engine.

nor when building key and keyref expressions in W3C XML
Schemas. Why is the MSXML selectNodes behavior different from these cases?
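
As a rough illustration of the point about path expressions, here is a sketch with Python's xml.etree.ElementTree, which supports only a small XPath subset. The prefix map passed to find() plays a role comparable to SelectionNamespaces. The uri:bar namespace, the flag attribute and the prefix b are assumptions made up for the example.

```python
import xml.etree.ElementTree as ET

XML = (
    "<root xmlns='uri:foo' xmlns:bar='uri:bar'>"
    "<child1 value='123' bar:flag='yes'/>"
    "</root>"
)
root = ET.fromstring(XML)

# Prefix bindings for the path expressions. They are independent of the
# prefixes used in the document itself; only the URIs have to match.
ns = {"foo": "uri:foo", "b": "uri:bar"}

no_prefix = root.find("child1")                            # name in no namespace
with_prefix = root.find("foo:child1", ns)                  # name in uri:foo
plain_attr = root.find("foo:child1[@value='123']", ns)     # attribute in no namespace
foreign_attr = root.find("foo:child1[@b:flag='yes']", ns)  # attribute in uri:bar
wrong_attr = root.find("foo:child1[@foo:value='123']", ns) # no such attribute

for label, hit in [
    ("child1", no_prefix),
    ("foo:child1", with_prefix),
    ("@value", plain_attr),
    ("@b:flag", foreign_attr),
    ("@foo:value", wrong_attr),
]:
    print(label, "matched" if hit is not None else "no match")
```

An unprefixed name in a path expression means "no namespace", so the element needs a prefix even though the document uses a default namespace, while the unprefixed value attribute is matched without one.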

Thanks,

- Steve J.


--
Cordialement,

///
(. .)
-----ooO--(_)--Ooo-----
| Philippe Poulard |
-----------------------
 
Martin Honnen

Steve Jorgensen wrote:

It looks like MSXML automatically treats child elements as being in the parent
namespace, as expected, but the unqualified attribute of an element in a
namespace has no namespace! Needless to say, that was leading to some
confusion earlier today while debugging my selectNodes results.

But that is nothing specific to MSXML: an attribute is in no namespace
unless it has a qualified (prefixed) name, which the value attribute does
not have, so it is in no namespace.
The exception is an xmlns="http://example.com/ns1" attribute, but you would
not see that in XPath as an attribute, only as a namespace node. In the
DOM, however, such an attribute is by definition in the namespace
http://www.w3.org/2000/xmlns/. It looks, however, as if MSXML 4 does not
implement that.
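
For comparison, the test can be repeated against another DOM implementation. This is a sketch using Python's standard xml.dom.minidom on the same document string as Steve's loadXML call; it shows how that one parser behaves and is not a statement about MSXML internals.

```python
from xml.dom import minidom

# Same document as in the MSXML test.
doc = minidom.parseString(
    "<root xmlns='uri:foo'><child1 value='123'/></root>"
)

root = doc.documentElement
child = root.firstChild                      # <child1>, no whitespace in between
value_attr = child.getAttributeNode("value")
xmlns_attr = root.getAttributeNode("xmlns")  # the namespace declaration itself

# Print the same profile as PrintNodeProfile does in the VBA test.
for node in (root, child, value_attr, xmlns_attr):
    print(f"name={node.nodeName} namespaceURI={node.namespaceURI}")
```

Here the two elements report uri:foo, the value attribute reports no namespace (None in Python), and the xmlns declaration is exposed as an attribute in http://www.w3.org/2000/xmlns/.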
 
Steve Jorgensen

OK, what you're saying clears up a lot of things. I think I may be having a
different problem with MSXML and XML Schema validation, then, or I never would
have thought I had a problem with attributes being unqualified in the first
place. In any case, I'll proceed with diagnosis today with a much more useful
understanding.

Thanks very much to you and to Philippe for your explanations.
 
Steve Jorgensen

OK, this turned out to be all just a series of my misconceptions. I had
declared one of the attributes in my schema globally, and I didn't realize
that forced it to be qualified. That was compounded by my lack of
understanding that when attributes are "unqualified", that really means they
have no namespace, and doesn't just mean that they don't need a prefix to be
in the parent element's namespace.
 
Soren Kuula

Hi, forgive me for breaking in here; I think the OP has had his questions
answered anyway ;)

Philippe Poulard wrote:

The default namespace doesn't apply to attributes. Unprefixed attributes
are not in a namespace; giving them one would be redundant, because they
are considered to "belong" to their host element. Attributes that do have
a namespace do not belong to the element in that sense: they are "foreign
attributes" and must be prefixed.

That's the most normal way to use namespace-qualified attributes, yes...

From Namespaces in XML (http://www.w3.org/TR/REC-xml-names/)

5.2 Namespace Defaulting

A default namespace is considered to apply to the element where it is
declared (if that element has no namespace prefix), and to all elements
with no prefix within the content of that element. If the URI reference
in a default namespace declaration is empty, then unprefixed elements in
the scope of the declaration are not considered to be in any namespace.
Note that default namespaces do not apply directly to attributes.

On the other hand:

[Definition:] The attribute's value, a URI reference, is the namespace
name identifying the namespace. The namespace name, to serve its
intended purpose, should have the characteristics of uniqueness and
persistence. It is not a goal that it be directly usable for retrieval
of a schema (if any exists). An example of a syntax that is designed
with these goals in mind is that for Uniform Resource Names [RFC2141].
However, it should be noted that ordinary URLs can be managed in such a
way as to achieve these same goals.

I have a situation where I need to insert an element in no-namespace
land into a document where the default namespace is declared. Is it
permitted to do:

<foo xmlns="gedefims">
  <blah/>
  <blah/>
  <blah/>
  <namespaceless-intruder xmlns=""/>
</foo>

- It should be permitted because 5.2 in NS in XML says it is
but
- The empty string ain't no URI

Does anyone know if I can always expect it to work? At least one XML
validator complains about the empty URI. I know I can always get around
it the hard way -- using a prefix-declared namespace for foo and blahs
instead. But there's too much trouble with it .. for one thing, I also
have attribute VALUES being interpreted depending on the bindings of the
NSs.
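
A quick way to see what a namespace-aware parser makes of this is sketched below with Python's standard xml.etree.ElementTree. It only shows how that one parser treats xmlns=""; it says nothing about the validator that complained.

```python
import xml.etree.ElementTree as ET

# The document from the question, shortened to one <blah/>.
XML = (
    '<foo xmlns="gedefims">'
    "<blah/>"
    '<namespaceless-intruder xmlns=""/>'
    "</foo>"
)

root = ET.fromstring(XML)

# ElementTree writes namespaced names as {uri}local; a bare name means
# the element is in no namespace.
tags = [root.tag] + [child.tag for child in root]
print(tags)
# ['{gedefims}foo', '{gedefims}blah', 'namespaceless-intruder']
```

The empty declaration undeclares the default namespace for that element, as section 5.2 describes, so the intruder comes out with a bare name.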

Soren
 
