Ways to tell if two XPath expressions have intersecting results

Discussion in 'XML' started by Weston, Oct 18, 2007.

  1. Weston

    Weston Guest

    Are there any quick, reliable shortcuts to determine if two arbitrary
    XPath expressions yield result sets that intersect?

    The obviously way would be to iterate over the set of nodes from one
    expression, checking each one to see if it's also contained in the
    other set, but I'm wondering if there might be ways of doing this by
    looking at or rewriting the actual XPath expressions themselves,
    without evaluating them.
     
    Weston, Oct 18, 2007
    #1
    1. Advertising

  2. Weston wrote:
    > Are there any quick, reliable shortcuts to determine if two arbitrary
    > XPath expressions yield result sets that intersect?


    In general, no. Appropriate logic on the paths can recognize cases that
    must intersect, and cases that definitely can't intersect, but there's
    going to be a huge set in the middle which depends on the actual data
    the paths are being evaluated against.

    Folks who have worked on XML processing have explored, published, and/or
    patented some related techniques. So a literature search would seem to
    be a good place to start... Sorry, I don't have any pointers already on
    hand to pass you.

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
     
    Joseph Kesselman, Oct 18, 2007
    #2
    1. Advertising

  3. "Weston" <> wrote in
    message news:...
    > Are there any quick, reliable shortcuts to determine if two arbitrary
    > XPath expressions yield result sets that intersect?
    >
    > The obviously way would be to iterate over the set of nodes from one
    > expression, checking each one to see if it's also contained in the
    > other set, but I'm wondering if there might be ways of doing this by
    > looking at or rewriting the actual XPath expressions themselves,
    > without evaluating them.
    >


    (count(XPath1) + count(XPath2)) != Count(XPath1 | XPath2)

    When creating union of node sets with the | operator any node common to both
    sides will only be included in the resulting set once. Hence the above
    boolean expression is true when XPath1 and XPath2 contain at least one Node
    in common.

    --
    Anthony Jones - MVP ASP/ASP.NET
     
    Anthony Jones, Oct 18, 2007
    #3
  4. Anthony Jones wrote:
    > (count(XPath1) + count(XPath2)) != Count(XPath1 | XPath2)


    I was presuming the question was how to determine this for all possible
    input documents (ie, only via analysis of the XPaths), rather than for a
    specific set of inputs. But I may have misunderstood.

    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
     
    Joseph Kesselman, Oct 18, 2007
    #4
  5. Weston

    Joe Fawcett Guest

    "Anthony Jones" <> wrote in message
    news:%...
    > "Weston" <> wrote in
    > message news:...
    >> Are there any quick, reliable shortcuts to determine if two arbitrary
    >> XPath expressions yield result sets that intersect?
    >>
    >> The obviously way would be to iterate over the set of nodes from one
    >> expression, checking each one to see if it's also contained in the
    >> other set, but I'm wondering if there might be ways of doing this by
    >> looking at or rewriting the actual XPath expressions themselves,
    >> without evaluating them.
    >>

    >
    > (count(XPath1) + count(XPath2)) != Count(XPath1 | XPath2)
    >
    > When creating union of node sets with the | operator any node common to
    > both
    > sides will only be included in the resulting set once. Hence the above
    > boolean expression is true when XPath1 and XPath2 contain at least one
    > Node
    > in common.
    >
    > --
    > Anthony Jones - MVP ASP/ASP.NET
    >

    As there are an infinite number of XPaths for any sequence of nodes in a
    document it seems unlikely that you could do this.
    <a>
    <b>
    <c/>
    </b>
    </a>
    How could you tell that c/.. = a/* , especially without a schema of some
    sort, just by examination?

    As well as Anthony's technique there is an official 'intersect' operator if
    you actually need the nodes.


    --

    Joe Fawcett (MVP - XML)
    http://joe.fawcett.name
     
    Joe Fawcett, Oct 19, 2007
    #5
  6. "Weston" <> wrote in
    message news:...
    > Are there any quick, reliable shortcuts to determine if two arbitrary
    > XPath expressions yield result sets that intersect?
    >
    > The obviously way would be to iterate over the set of nodes from one
    > expression, checking each one to see if it's also contained in the
    > other set, but I'm wondering if there might be ways of doing this by
    > looking at or rewriting the actual XPath expressions themselves,
    > without evaluating them.


    In XPath 1.0 the following XPath expression selects all nodes that are the
    intersection (belong to both) of two node-sets:

    $ns1[count(. | $ns2) = count($ns2)]

    Also knowns as the Kaysian (after Michael Kay) method of finding the
    intersection of two node-sets.

    In the above XPath expression just replace $ns1 and $ns2 with your two XPath
    expressions.


    Cheers,
    Dimitre Novatchev
     
    Dimitre Novatchev, Oct 19, 2007
    #6
  7. Weston

    Weston Guest

    On Oct 18, 3:06 pm, Joseph Kesselman <>
    wrote:
    > I was presuming the question was how to determine this for all possible
    > input documents (ie, only via analysis of the XPaths), rather than for a
    > specific set of inputs. But I may have misunderstood.


    You're right, that's what I was originally hoping for, but I've
    realized it's impossible for the reasons Joe F pointed out. So I'm
    definitely open to helpful tips that involve evaluating the
    expressions against a document, and Anthony's test looks like one
    thing that would work for me.

    I've also realized that the problem I'm trying to solve might be
    reducible to a comparison between two expressions which are known to
    resolve to single nodes. If this turns out to be the case, then
    perhaps one could solve the problem by comparing canonical xpath
    expressions for the given nodes, right? Are there convenient ways of
    getting a canonical expression?

    On Oct 19, 1:33 am, "Joe Fawcett" <> wrote:
    > As well as Anthony's technique there is an official 'intersect' operator if
    > you actually need the nodes.


    I'd had no idea -- that would *definitely* solve the problem. :) Looks
    like it's part of XPath 2, though... is it commonly implemented in
    processors? Guess I'll have to check mine to see if it works.

    On Oct 19, 6:06 am, "Dimitre Novatchev" <> wrote:
    > In XPath 1.0 the following XPath expression selects all nodes that are the
    > intersection (belong to both) of two node-sets:
    >
    > $ns1[count(. | $ns2) = count($ns2)]
    >


    Thank you! This will especially be helpful if the intersection
    operator isn't supported. I'd like to understand how it works, though,
    and I don't think I do.
     
    Weston, Oct 19, 2007
    #7
  8. >>$ns1[count(. | $ns2) = count($ns2)]
    >
    > I'd like to understand how it works, though, and I don't think I do.


    The predicate is true if the union of . ("this node") and the node set
    $ns2 is the same length as $ns2 by itself... in other words, if . is
    already present in $ns2.

    So applying the predicate to nodeset $ns1 yields "those nodes in $ns1
    which are present in $ns2"... in other words, the intersection of the
    two sets.


    --
    Joe Kesselman / Beware the fury of a patient man. -- John Dryden
     
    Joseph Kesselman, Oct 19, 2007
    #8
  9. Weston a écrit :
    > On Oct 18, 3:06 pm, Joseph Kesselman <>
    > wrote:
    >> I was presuming the question was how to determine this for all possible
    >> input documents (ie, only via analysis of the XPaths), rather than for a
    >> specific set of inputs. But I may have misunderstood.

    >
    > You're right, that's what I was originally hoping for, but I've
    > realized it's impossible for the reasons Joe F pointed out. So I'm
    > definitely open to helpful tips that involve evaluating the
    > expressions against a document, and Anthony's test looks like one
    > thing that would work for me.
    >
    > I've also realized that the problem I'm trying to solve might be
    > reducible to a comparison between two expressions which are known to
    > resolve to single nodes. If this turns out to be the case, then
    > perhaps one could solve the problem by comparing canonical xpath
    > expressions for the given nodes, right?


    Right.

    However, it depends on what you call "canonical path" ; my own
    definition is here :
    http://ns.inria.org/active-tags/glossary/glossary.html#canonical-path

    > Are there convenient ways of getting a canonical expression?


    I've got a class that can be built from a DOM node :
    http://reflex.gforge.inria.fr/javadoc/org/inria/reflex/xml/CanonicalPath.html
    and 2 canonical paths can be compared, or tested to check if a node
    addressed by one is reachable by the other (browse the doc of the class
    for details)

    Additionally, if you have to store that canonical path inside an element
    (say a DOM attribute), the host element will take care of the prefix
    mappings of the canonical path, and eventually will redefine prefixes to
    avoid clashes.

    --
    Cordialement,

    ///
    (. .)
    --------ooO--(_)--Ooo--------
    | Philippe Poulard |
    -----------------------------
    http://reflex.gforge.inria.fr/
    Have the RefleX !
     
    Philippe Poulard, Oct 22, 2007
    #9
    1. Advertising

Want to reply to this thread or ask your own question?

It takes just 2 minutes to sign up (and it's free!). Just click the sign up button to choose a username and then you can ask your own questions on the forum.
Similar Threads
  1. Xeon

    Intersecting Nodeset

    Xeon, Jul 21, 2003, in forum: XML
    Replies:
    1
    Views:
    406
    Dimitre Novatchev
    Jul 21, 2003
  2. Tjerk Wolterink

    XPath: efficiency in xpath expressions

    Tjerk Wolterink, Nov 13, 2004, in forum: XML
    Replies:
    1
    Views:
    1,645
    Richard Tobin
    Nov 13, 2004
  3. Replies:
    4
    Views:
    987
  4. RajkiranPro

    Area of self intersecting polygon

    RajkiranPro, Dec 15, 2007, in forum: C++
    Replies:
    2
    Views:
    886
    Jim Langston
    Dec 15, 2007
  5. Benman
    Replies:
    0
    Views:
    133
    Benman
    Dec 9, 2005
Loading...

Share This Page