namespaces and xpath queries

Discussion in 'XML' started by yawnmoth, Mar 31, 2010.

  1. yawnmoth

    yawnmoth Guest

    Say I have the following XML file:

    <parent xmlns="uri://domain.tld/"><child /></parent>

    In order to query the child node I need to register the uri://domain.tld/
    namespace with a prefix, like 'parent'. I then need to do
    '//parent:parent//parent:child'. My question is this: if one node is in
    a namespace, are all children of that node assumed to be in the same
    namespace? In that case it seems redundant to have to specify the
    namespace for each and every successive child. I mean, is there even a
    way to construct a child of parent:parent such that
    '//parent:parent//child' would return a node?

    Also, it doesn't seem to me like there's really a lot of point in
    assigning a namespace to the root node? I mean, since there can only
    be one root node it's not as if disambiguation is necessary. For
    <root><parent xmlns="uri://a.a" /><parent /></root>, you'd need a
    namespace to differentiate the two parent tags, but for my initial XML
    it seems totally unnecessary? And yet I frequently see XML documents
    that do just this - place the root node in its own namespace.
     
    yawnmoth, Mar 31, 2010
    #1

  2. yawnmoth

    Mayeul Guest

    Yes, but you should really keep the prefix short. Like p or par.

    More or less. The rules are more precise than that.
    XML provides two ways to declare an element's namespace:
    - default namespace
    - prefixed namespace

    A default namespace is declared like this:
    xmlns="uri://domain.tld/"
    Any element without a prefix is in the default namespace that is in scope.
    Note that an XPath 1.0 expression has no way to declare a default
    namespace: an unprefixed name in a path always means "no namespace".
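
    To make that concrete, here is a minimal sketch using Python's standard
    xml.etree.ElementTree, which only implements a subset of XPath but
    treats unprefixed names the same way. The prefix "p" is an arbitrary
    choice that exists only in the query mapping, not in the document.

```python
import xml.etree.ElementTree as ET

doc = '<parent xmlns="uri://domain.tld/"><child /></parent>'
root = ET.fromstring(doc)

# The prefix "p" is bound here, for the query only; the document never uses it.
ns = {"p": "uri://domain.tld/"}

unprefixed = root.findall(".//child")        # asks for child in no namespace
prefixed = root.findall(".//p:child", ns)    # asks for child in uri://domain.tld/

print(len(unprefixed), len(prefixed))  # prints: 0 1
```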

    A prefixed namespace is declared like this:
    xmlns:p="uri://domain.tld/"
    And an element is in this namespace if it is prefixed with p:
    <p:parent xmlns:p="uri://domain.tld/">
    <p:child />
    </p:parent>
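
    As a quick check that the two spellings mean the same thing, the sketch
    below parses the default-namespace form and the prefixed form and
    compares the expanded names. It assumes Python's xml.etree.ElementTree,
    which reports every element as {uri}local; expanded_names is just a
    helper name made up for this example.

```python
import xml.etree.ElementTree as ET

default_form = '<parent xmlns="uri://domain.tld/"><child /></parent>'
prefixed_form = (
    '<p:parent xmlns:p="uri://domain.tld/">'
    '<p:child />'
    '</p:parent>'
)

def expanded_names(xml_text):
    """Return the {uri}local name of every element, in document order."""
    return [el.tag for el in ET.fromstring(xml_text).iter()]

names_default = expanded_names(default_form)
names_prefixed = expanded_names(prefixed_form)

print(names_default)
print(names_default == names_prefixed)  # prints: True
```

    The prefix itself never shows up in the parsed names; only the URI does.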

    Both default and prefixed namespace declarations are inherited by descendants.

    Therefore, it is possible to declare this:
    <parent xmlns="uri://domain.tld/" xmlns:c="http://anotheruri.tld">
    <c:child>
    <grandchild/>
    </c:child>
    </parent>

    Here, <grandchild> is in the same namespace as <parent>, because they're
    both in the declared default namespace.

    <child> is in another namespace, though, even though it is a descendant
    of <parent>.
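
    The example above can be checked with a short sketch like this one,
    again assuming Python's xml.etree.ElementTree. The two helper functions
    are made up for the illustration; they just split the {uri}local tags
    that ElementTree produces.

```python
import xml.etree.ElementTree as ET

doc = """\
<parent xmlns="uri://domain.tld/" xmlns:c="http://anotheruri.tld">
  <c:child>
    <grandchild/>
  </c:child>
</parent>
"""

def namespace_of(element):
    """Return the namespace URI of an element, or None if it has none."""
    tag = element.tag
    return tag[1:].split("}", 1)[0] if tag.startswith("{") else None

def local_name(element):
    """Return the tag without its {uri} part."""
    return element.tag.rsplit("}", 1)[-1]

root = ET.fromstring(doc)
namespaces = {local_name(el): namespace_of(el) for el in root.iter()}

for name, uri in namespaces.items():
    print(name, "->", uri)
```

    parent and grandchild come out with uri://domain.tld/, while child comes
    out with http://anotheruri.tld.
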
    In practice it is, yes. Better keep your prefixes short.

    Yes:
    <parent:parent xmlns:parent="uri://domain.tld/">
    <child/>
    </parent:parent>

    Here <child> is in no namespace at all, so '//parent:parent//child'
    would match it.

    Equivalent example:
    <parent xmlns="uri://domain.tld/">
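
    The equivalent example appears cut off here. As a sketch only, assuming
    it was meant to reset the default namespace on the child with xmlns=""
    (that completion is my guess, not the original text), both forms can be
    compared with Python's xml.etree.ElementTree:

```python
import xml.etree.ElementTree as ET

prefixed_parent = (
    '<parent:parent xmlns:parent="uri://domain.tld/">'
    '<child/>'
    '</parent:parent>'
)
# Assumption: one possible completion of the truncated example, using
# xmlns="" to undeclare the default namespace for <child>.
reset_default = (
    '<parent xmlns="uri://domain.tld/">'
    '<child xmlns=""/>'
    '</parent>'
)

ns = {"parent": "uri://domain.tld/"}
results = []
for text in (prefixed_parent, reset_default):
    root = ET.fromstring(text)
    results.append({
        "root": root.tag,
        # Unprefixed name: matches a child in no namespace.
        "unprefixed_hits": len(root.findall(".//child")),
        # Prefixed name: matches a child in uri://domain.tld/.
        "prefixed_hits": len(root.findall(".//parent:child", ns)),
    })

for row in results:
    print(row)
```

    In both documents the unprefixed query finds the child and the prefixed
    one does not.
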
    I guess I can't see a point to do that either.

    However, I'd prefer the consistency: what sense would it make if every
    element were in a namespace but the root element were in no namespace?

    Besides, the root element is a great place to declare namespaces,
    since they're inherited. So it would usually be in whatever default
    namespace it declared, or, on the contrary, would need a prefix in order
    to be in a namespace *other* than the default it declared.
     
    Mayeul, Mar 31, 2010
    #2

  3. yawnmoth

    yawnmoth Guest

    Thanks - that was a very helpful response! :)
     
    yawnmoth, Mar 31, 2010
    #3
  4. Yes, unfortunately XPath 1.0 does not have the concept of default
    namespace, so your path must use prefixes to specify namespaced nodes.
    In general, namespace declarations are inherited by child elements until
    another declaration overrides them. Specifically, that includes the
    default namespace declaration, which is what you've used here. So, yes,
    your <child/> element is in the same namespace as <parent/>, because it
    has the same prefix (none) and does not redeclare that prefix, and hence
    is bound to the same namespace (the default).
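
    A small sketch of the "until another declaration overrides them" part,
    assuming Python's xml.etree.ElementTree. The element names and the
    urn:example:* URIs are invented for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: <b> redeclares the default namespace, which then
# applies to <b> and its descendants only.
doc = (
    '<a xmlns="urn:example:outer">'
    '<b xmlns="urn:example:inner"><c/></b>'
    '<d/>'
    '</a>'
)

tags = [el.tag for el in ET.fromstring(doc).iter()]
for tag in tags:
    print(tag)
```

    c inherits the inner declaration from b, while d, a sibling of b, still
    sees the outer one.
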
    See my first point. Sorry, but XPath 1.0 doesn't support that shortcut.
    The namespace is part of the name, and the semantics, of the node. It
    will affect how namespace-aware applications process the node, including
    how it is validated against schemas, how it is processed by stylesheets,
    and how other tools handle it. When there is *any* possibility of
    confusion about what kind of document you may be looking at, or simply
    for code cleanliness reasons, putting the root node into a namespace
    will be wise.
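
    To illustrate "the namespace is part of the name", here is a sketch
    that takes the two-parent document from the original question and
    dispatches on the full name. It assumes Python's xml.etree.ElementTree;
    the handlers table is hypothetical.

```python
import xml.etree.ElementTree as ET

doc = '<root><parent xmlns="uri://a.a" /><parent /></root>'
root = ET.fromstring(doc)

# Hypothetical handlers keyed by the full {uri}local name, which is how a
# namespace-aware tool would tell the two elements apart.
handlers = {
    "{uri://a.a}parent": lambda el: "parent from uri://a.a",
    "parent": lambda el: "parent in no namespace",
}

descriptions = [handlers[child.tag](child) for child in root]
print(descriptions)
```

    The two elements share a local name but are different names as far as
    the parser is concerned, so they reach different handlers.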

    Generally, the namespace is related to a kind of document or data
    structure -- a purchase order, for example, or a mailing address -- and
    all elements (and occasionally some attributes) related to that kind of
    data will use that namespace. That will be true whether the element
    happens to be the root of this particular document or not. If you leave
    off the namespace, you've changed the meaning of the element.

    Also, it's often just as easy to do a few global namespace declarations
    on the root element and have those prefixes (and/or that default)
    available through the rest of the document via inheritance, rather than
    having to repeatedly redeclare those bindings. As long as you're doing
    them at the root anyway, putting the root element itself into the proper
    namespace really doesn't add any significant amount of work.
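
    If you want to see which bindings a document declares, one option (a
    sketch, assuming Python's xml.etree.ElementTree) is to listen for
    "start-ns" events while parsing. The sample document is the one from
    earlier in the thread, with both declarations on the root element.

```python
import io
import xml.etree.ElementTree as ET

doc = (
    b'<parent xmlns="uri://domain.tld/" xmlns:c="http://anotheruri.tld">'
    b'<c:child><grandchild/></c:child>'
    b'</parent>'
)

# Each "start-ns" event carries a (prefix, uri) pair; the default
# namespace is reported with an empty-string prefix.
declarations = [
    binding
    for _event, binding in ET.iterparse(io.BytesIO(doc), events=("start-ns",))
]

bindings = dict(declarations)
print(bindings)
```

    Both bindings are declared once, on the root, and nothing further down
    needs to repeat them.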

    --
    Joe Kesselman,
    http://www.love-song-productions.com/people/keshlam/index.html

    {} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
    /\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
     
    Joe Kesselman, Mar 31, 2010
    #4
