Re: more XPath struggles (tDOM)

Discussion in 'XML' started by Joseph J. Kesselman, May 2, 2008.

  1. Mikhail Teterin wrote:
    > $xml selectNodes {//mp:date}
    >
    > does not find any, but
    >
    > $xml selectNodes {//*[name()="mp:date"]}
    >
    > works. What's the reason for the difference?


    XPath is namespace-aware. To select a namespaced node, you must use a
    prefix in your path (which you did) and tell your XPath evaluator what
    the prefix is bound to (which you didn't). Look at the user's manual for
    your tool.
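
    To make that concrete outside tDOM, here is a minimal sketch using Python's standard xml.etree.ElementTree (not tDOM, and only a subset of XPath). The sample document is made up; only the namespace name "MarketParameters" comes from the original question. ElementTree happens to raise an error for an unbound prefix rather than return an empty result, but the fix has the same shape: hand the evaluator a prefix-to-namespace mapping.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the namespace name is from the thread.
DOC = (
    '<mp:params xmlns:mp="MarketParameters">'
    '<mp:date>2008-05-02</mp:date>'
    '</mp:params>'
)

root = ET.fromstring(DOC)

# No binding supplied: the evaluator cannot know what "mp" stands for.
try:
    root.findall(".//mp:date")
    unbound_error = None
except SyntaxError as exc:
    unbound_error = str(exc)

# Binding supplied by the caller: the same path now selects the element.
bindings = {"mp": "MarketParameters"}
dates = root.findall(".//mp:date", bindings)

print("without binding:", unbound_error)
print("with binding:", [d.text for d in dates])
```

    The prefix in the mapping is the caller's choice; it only has to map to the same namespace name the document uses.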

    (comp.lang.xml doesn't exist on my server, so I can't crosspost there.)
     
    Joseph J. Kesselman, May 2, 2008
    #1

  2. Mikhail Teterin wrote:
    > But the namespaces are defined in the document itself
    > -- for example:
    > <xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
    > why do I still need to specify them? It certainly works with xml_grep... Is
    > there a bug in the package (tDOM), or is the above element not sufficient
    > to define a namespace?


    Basically, namespaces and XPath don't sit too well together (when the
    XPath expression is located in an XML document) because the document
    namespace context at the point in the document where the XPath is
    located is shrouded from the expression (because it is formally an
    xsd:string, which is not namespace-aware according to the namespaces
    spec). This means that if you're embedding an XPath expression in an
    XML document, you also need to embed a way to tell it what the
    namespace context to evaluate in is. (This is stupid, but it's the way
    it is, and it isn't a Tcl problem at all.)

    If the XPath expression is not contained in some XML context, then it
    is even more obvious that the namespace context needs to be given.

    Donal.
     
    Donal K. Fellows, May 2, 2008
    #2

  3. Mikhail Teterin wrote:
    > I got some progress... But the namespaces are defined in the document itself
    > -- for example:
    > <xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
    > why do I still need to specify them?


    Because they could be set differently at different places in the
    document, and/or whatever generated your XPath might have different
    prefixes bound to those namespaces or vice versa. You need to provide a
    context so the system knows what you meant.

    Some processors will let you specify a context node and will pick up the
    namespaces defined there. Again, check your docs.
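
    As a rough analogue of "pick up the namespaces defined in the document", here is a sketch in Python's standard xml.etree.ElementTree (not tDOM). It harvests the prefix declarations while parsing and reuses them as the binding for a later query. The helper name parse_with_declared_prefixes and the mp:date child are assumptions for illustration; the xc:XmlCache element is the one quoted above. Note that this collects declarations from the whole document rather than from one context node, so a prefix redeclared further down would overwrite the earlier binding.

```python
import io
import xml.etree.ElementTree as ET

# The root element is the one quoted in the thread; the child is made up.
DOC = (
    b'<xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">'
    b'<mp:date xmlns:mp="MarketParameters">2008-05-02</mp:date>'
    b'</xc:XmlCache>'
)


def parse_with_declared_prefixes(data):
    """Parse data; return (root, {prefix: namespace}) as declared in the document."""
    prefixes = {}
    root = None
    events = ET.iterparse(io.BytesIO(data), events=("start", "start-ns"))
    for event, payload in events:
        if event == "start-ns":
            prefix, uri = payload
            prefixes[prefix] = uri  # a later redeclaration overwrites this entry
        elif root is None:
            root = payload  # the first "start" event is the document element
    return root, prefixes


root, prefixes = parse_with_declared_prefixes(DOC)
found = [e.text for e in root.findall(".//mp:date", prefixes)]
print(prefixes)
print(found)
```

    This is convenient for a document you control, but it ties the query to whatever prefixes the author of the document happened to pick.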

    > It certainly works with xml_grep


    I don't know xml_grep, so I can't advise. It may be assuming the root
    node as the context if not told otherwise. Or it may be flat-out broken
    and not processing namespaces correctly.

    Say what you mean. The system can't read your mind, and shouldn't try.
     
    Joseph J. Kesselman, May 3, 2008
    #3
  4. > Basically, namespaces and XPath don't sit too well together

    It works fine when you understand how to use it properly.

    The only real problem is that XPath relies on prefixes retrieved from
    some unspecified environment (depending on the context/tool in which the
    XPath is being executed). That's a bit less verbose than using an
    "expanded qualified name" like {http://my_namespace}foo, or requiring
    that the namespace bindings be specified via some syntax in the XPath
    string. But it does mean that an XPath is partly defined by that
    context. (Then again, XPaths which use variables also need a context, as
    do those which use some of the functions, so this is just the most
    obvious -- and most unnecessary -- instance thereof.)
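
    For comparison, some toolkits do accept an expanded name directly. The sketch below uses Python's standard xml.etree.ElementTree, which stores and matches element names in the {namespace}local form; this is that library's own notation, not XPath 1.0 syntax and not something tDOM is claimed to accept here. The sample document is made up around the {http://my_namespace}foo name mentioned above.

```python
import xml.etree.ElementTree as ET

# Hypothetical document; the prefix "q" is arbitrary and never used by the query.
DOC = (
    '<q:doc xmlns:q="http://my_namespace">'
    '<q:foo>data</q:foo>'
    '</q:doc>'
)

root = ET.fromstring(DOC)

# The query names the namespace itself, so no prefix context is needed.
hits = root.findall(".//{http://my_namespace}foo")

expanded_tag = hits[0].tag   # ElementTree keeps the expanded name on the element
texts = [h.text for h in hits]
print(expanded_tag, texts)
```

    The cost is the verbosity already mentioned: every step of a long path has to carry the full namespace name.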

    It is possible to write a portable namespace-aware XPath that doesn't
    rely on prefixes (via some ugly predicate hacks)... but it really should
    be easier to do so. Oh well. 20:20 hindsight; maybe XPath 3.0 will
    finally reconsider that point.

    By the way: The namespaces shown in the original example are not
    considered acceptable by today's standards. Namespace names should be
    fully-qualified ("absolute") URI References. Yes, the original namespace
    spec was fuzzy about that, and many tools won't enforce this... but
    after much painful debate, the W3C agreed that the concept of a
    "relative namespace" really didn't make any sense no matter how you
    sliced it. Tim Berners-Lee reserves the right to reintroduce that idea
    if and when the Semantic Web effort comes up with a way to make those
    meaningful... but until then, you really should make sure all your
    namespace names follow the official absolute-URI-reference syntax.
     
    Joseph J. Kesselman, May 3, 2008
    #4
  5. Mikhail Teterin wrote:

    >I don't want it to read my mind, I want it to read the document. The
    >namespaces are set there with an xmlns-attribute of containing elements. In
    >fact, when I ask for the node's name [$node nodeName], I get the
    >fully-qualified foo.bar.woof.meow.
    >
    >It KNOWS the namespace-mapping, but it wants me to repeat it (f means foo, b
    >means bar, w means woof, etc.). That's gratuitous...


    Suppose you try to use the same XPath expressions with a document
    that uses different prefixes. How's that going to work? Your XPath
    expressions will all be wrong.

    The choice of prefixes is supposed to be arbitrary. You can't rely on
    f meaning foo. Even within a single document, you can use the same
    prefix for different namespaces and different prefixes for the
    same namespace.
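
    A small sketch of that last point, using Python's standard xml.etree.ElementTree rather than tDOM. The document and the urn:example:* namespace names are invented for illustration: the prefix f is bound to two different namespaces, and g is a second prefix for the first of them.

```python
import xml.etree.ElementTree as ET

# Made-up document: "f" means two different things, and "g" means the same as the first "f".
DOC = """<root>
  <f:item xmlns:f="urn:example:foo">one</f:item>
  <f:item xmlns:f="urn:example:bar">two</f:item>
  <g:item xmlns:g="urn:example:foo">three</g:item>
</root>"""

root = ET.fromstring(DOC)

# The parser resolves each prefix where it is declared; the prefix itself is gone afterwards.
expanded = [(child.tag, child.text) for child in root]

# Query by namespace, with a prefix of the caller's own choosing.
foo_items = [e.text for e in root.findall("x:item", {"x": "urn:example:foo"})]

print(expanded)
print(foo_items)
```

    Selecting on the literal prefix f would lump "one" and "two" together and miss "three", which is the wrong answer on both counts.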

    -- Richard

    --
    :wq
     
    Richard Tobin, May 3, 2008
    #5
  6. Joseph J. Kesselman wrote:
    > It is possible to write a portable namespace-aware XPath that doesn't
    > rely on prefixes (via some ugly predicate hacks)... but it really should
    > be easier to do so. Oh well. 20:20 hindsight; maybe XPath 3.0 will
    > finally reconsider that point.


    It'd be OK if there was a type "like xs:string, but understands the
    current namespace context" but there isn't. (Of course, once you
    extract the XPath from its context document you then need to remember
    to explicitly get the NS context from somewhere, which is almost
    certainly the root of the problem in the message that started this
    thread.)

    Donal.
     
    Donal K. Fellows, May 3, 2008
    #6
  7. Mikhail Teterin wrote:
    >Joseph J. Kesselman wrote:
    >> XPath is namespace-aware. To select a namespaced node, you must use a
    >> prefix in your path (which you did) and tell your XPath evaluator what
    >> the prefix is bound to (which you didn't). Look at the user's manual for
    >> your tool.

    >
    >Thanks. After I explicitly set:
    >
    > $xml selectNodesNamespaces {mp MarketParameters xc XmlCache}


    That's one right way to bind prefixes to namespaces. You can
    always use the -namespaces option to the selectNodes method instead, but
    setting things up with a single selectNodesNamespaces call for the rest of
    the lifetime of the document seems more convenient to me.

    >I got some progress... But the namespaces are defined in the document itself
    >-- for example:
    >
    > <xc:XmlCache xmlns:xc="XmlCache" xc:action="Update">
    >
    >why do I still need to specify them? It certainly works with xml_grep... Is
    >there a bug in the package (tDOM), or is the above element not sufficient
    >to define a namespace?


    No, it's not a bug. As long as neither a selectNodesNamespaces setting
    nor the -namespaces option is given, tDOM does respect the XML namespace
    declarations of the document. The context node of your XPath
    expression is the node on which you call selectNodes. If all the
    prefixes you use in your XPath expression are in scope at that node,
    you don't have to do anything; namespace resolution will work as
    you expect. Since you had trouble with this, I'd bet that not all of
    the XML namespace declarations you rely on are in scope at your
    context node.

    But, as others have already pointed out, it is _dangerous_ to bank on
    the prefixes in the document. Prefixes don't matter; it's the
    namespaces that matter.

    From the XML viewpoint,

    <a:doc xmlns:a="http://foo.bar.com">
    <a:elem>data</a:elem>
    </a:doc>

    and

    <b:doc xmlns:b="http://foo.bar.com">
    <b:elem>data</b:elem>
    </b:doc>

    are in some sense the 'same' document.

    You can't just say [$someNode selectNodes a:elem] in your code and
    expect that to work reliably. If the document provider uses another
    prefix (bound to the same namespace), your code will fail.

    The clear way out is to tell the XPath engine which namespace you
    mean by which prefix, e.g. with selectNodesNamespaces.
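
    The same idea sketched with Python's standard xml.etree.ElementTree (not tDOM; the helper name elem_texts and the prefix "my" are made up for illustration). The code declares its own prefix for http://foo.bar.com and runs one query against both spellings of the document.

```python
import xml.etree.ElementTree as ET

# The two documents from the example: different prefixes, same namespace.
DOC_A = '<a:doc xmlns:a="http://foo.bar.com"><a:elem>data</a:elem></a:doc>'
DOC_B = '<b:doc xmlns:b="http://foo.bar.com"><b:elem>data</b:elem></b:doc>'

# The code's own binding; "my" appears in neither document.
NS = {"my": "http://foo.bar.com"}


def elem_texts(xml_text):
    """Return the text of every elem child in the http://foo.bar.com namespace."""
    root = ET.fromstring(xml_text)
    return [e.text for e in root.findall("my:elem", NS)]


result_a = elem_texts(DOC_A)
result_b = elem_texts(DOC_B)
print(result_a, result_b)
```

    Both calls return the same list, because the query is tied to the namespace name and not to whatever prefix the document provider chose.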

    rolf
     
    Rolf Ade, May 6, 2008
    #7
