Returning "nearest in document" matches using XPath

Discussion in 'XML' started by Nick Leverton, Dec 5, 2008.

  1. I have an application which attempts to describe a tree of TCP subnets,
    which in essence are not fully accessible from each other. I have a
    description of the network in XML as shown in the excerpt below.

    The application is actually trying to optimise delivery of large files
    to multiple destinations over expensive links, so it's not just a matter
    of opening up firewalls and adding a bit of NATting. To avoid duplicated
    transfers I need to know what is the nearest machine which leads onto
    the ultimate destination for the file I am currently handling.

    So for instance files destined for units 26 and 27 are first delivered
    to node V9990 which then delivers them to V9991, to which 26 and 27
    are directly attached. The distinction between nodes and units isn't
    important for this part of the task. The ID attribute defines the
    ultimate destination which I am trying to reach and each ID is unique,
    so there is only one "nearest" IP address corresponding to each ID.

    <?xml version="1.0"?>
    <nodes>
      <node id="V9990" ip="1.1.1.1">
        <unit id="23" ip="10.10.10.10"/>
        <unit id="24" ip="10.10.10.11"/>
        <node id="V9991" ip="10.10.10.12">
          <unit id="26" ip="192.168.0.1"/>
          <unit id="27" ip="192.168.0.2"/>
        </node>
      </node>
      <node id="V9992" ip="2.2.2.2">
        <node id="V9993" ip="10.10.10.10">
          <unit id="21"/>
          <unit id="22"/>
        </node>
      </node>
    </nodes>

    To simplify network maintenance I would like to use the same config file
    on all the "nodes", and to modify the XPath query with extra terms on
    the sub-nodes. In other words, on the "root" machine a query for id=26
    will return ip=1.1.1.1, but on node V9990 a query for id=26 will return
    ip=10.10.10.12.
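
The lookup described above can be sketched as a plain tree walk. This is only an illustration in Python, with the standard library's xml.etree.ElementTree standing in for Perl's XML::XPath; the helper names `contains_id` and `nearest_ip` are hypothetical and do not come from the post.

```python
import xml.etree.ElementTree as ET

# Network description copied from the post (XML declaration omitted).
NETWORK_XML = """
<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>
"""


def contains_id(elem, target_id):
    """True if elem itself or any of its descendants has id == target_id."""
    return any(e.get("id") == target_id for e in elem.iter())


def nearest_ip(start, target_id):
    """Return the ip of the first element below `start` that has an ip
    attribute and leads to target_id, or None if there is none."""
    for child in start:
        if contains_id(child, target_id):
            ip = child.get("ip")
            # No ip on this hop: keep descending along the same branch.
            return ip if ip is not None else nearest_ip(child, target_id)
    return None


root = ET.fromstring(NETWORK_XML)
v9990 = root.find(".//*[@id='V9990']")

print(nearest_ip(root, "26"))   # 1.1.1.1
print(nearest_ip(v9990, "26"))  # 10.10.10.12
```

The same function serves both the "root" machine and a sub-node; only the starting element changes, which mirrors the idea of sharing one config file across all nodes.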

    In summary, what I want to do is to retrieve the nearest ip attribute
    in the document which has a given id attribute as a descendant. I am
    currently using the following XPath:

    Querying from the root:
    descendant-or-self::*[@ip and descendant-or-self::*[@id="26"]][last()]/@ip

    I used descendant-or-self as the first term here rather than //*
    because I don't want XPath to descend the doc and return all matches,
    only the node which matches nearest the root of the XML document.

    Querying from a sub-node:
    //*[@id="V9990"]/*[@ip and descendant-or-self::*[@id="26"]][last()]/@ip

    Here I establish a context node first and then work on that with
    predicates.

    First question: these two work, but are probably not ideal since I'm
    not yet very familiar with XPath. In particular I don't understand
    why I need to use the [last()] predicate rather than [1], as I thought the
    descendant axis should work downwards in document order, not upwards.
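
For reference, XPath 1.0 treats descendant-or-self as a forward axis, so positions in a predicate on that step count in document order and [1] would be the outermost match. The Python sketch below only lists the candidate elements in document order for id 26; it uses ElementTree's `iter()`, which walks in document order, and does not reproduce XML::XPath's behaviour, so it cannot explain why [last()] appears to be needed in that module. The helper name `ips_in_document_order` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Network description copied from the post (XML declaration omitted).
NETWORK_XML = """
<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>
"""


def ips_in_document_order(start, target_id):
    """List the ip of every element (start included) that has an ip and
    has target_id on itself or a descendant, in document order."""
    return [
        elem.get("ip")
        for elem in start.iter()  # iter() yields elements in document order
        if elem.get("ip") is not None
        and any(d.get("id") == target_id for d in elem.iter())
    ]


root = ET.fromstring(NETWORK_XML)
candidates = ips_in_document_order(root, "26")
print(candidates)      # ['1.1.1.1', '10.10.10.12', '192.168.0.1']
print(candidates[0])   # first in document order: nearest the root
print(candidates[-1])  # last in document order: the unit itself
```

In document order the element nearest the root comes first, so a conforming engine would be expected to return 1.1.1.1 for [1] and 192.168.0.1 for [last()] on the root query.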

    Secondly, I now have a requirement to retrieve all the "nearest" ip
    attributes for polling/reporting purposes. In other words, querying
    from the root I would want to return 1.1.1.1 and 2.2.2.2. Or querying
    from node V9990 I would want to return 10.10.10.10, 10.10.10.11 and
    10.10.10.12. I don't mind about getting multiple instances of the same
    attribute back as de-duping is simple. But I cannot figure out how to
    arrange the predicates so as to return the "topmost" ip attribute only,
    either for the root case or for the sub-context case.
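
One way to picture the "topmost only" requirement is a walk that stops descending as soon as it meets an ip attribute. The Python sketch below (standard-library ElementTree, hypothetical helper name `topmost_ips`) illustrates that rule. A candidate XPath 1.0 expression for the root case would be //*[@ip][not(ancestor::*[@ip])]/@ip, but that is an untested suggestion as far as XML::XPath is concerned.

```python
import xml.etree.ElementTree as ET

# Network description copied from the post (XML declaration omitted).
NETWORK_XML = """
<nodes>
  <node id="V9990" ip="1.1.1.1">
    <unit id="23" ip="10.10.10.10"/>
    <unit id="24" ip="10.10.10.11"/>
    <node id="V9991" ip="10.10.10.12">
      <unit id="26" ip="192.168.0.1"/>
      <unit id="27" ip="192.168.0.2"/>
    </node>
  </node>
  <node id="V9992" ip="2.2.2.2">
    <node id="V9993" ip="10.10.10.10">
      <unit id="21"/>
      <unit id="22"/>
    </node>
  </node>
</nodes>
"""


def topmost_ips(start):
    """Collect the ip of every element below `start` that has an ip and
    no ip-bearing element between it and `start`."""
    found = []
    for child in start:
        ip = child.get("ip")
        if ip is not None:
            found.append(ip)  # topmost on this branch: do not descend further
        else:
            found.extend(topmost_ips(child))  # no ip here: look deeper
    return found


root = ET.fromstring(NETWORK_XML)
v9990 = root.find(".//*[@id='V9990']")

print(topmost_ips(root))   # ['1.1.1.1', '2.2.2.2']
print(topmost_ips(v9990))  # ['10.10.10.10', '10.10.10.11', '10.10.10.12']
```

Duplicates are left in the result on purpose, since the post notes that de-duping afterwards is simple.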

    Am I bending XPath a step too far here? I was hoping not to have to
    introduce an extra processing step, but I am thinking maybe the sub-nodes
    need to extract their "local" view of the network and work only
    on that. Any advice would be very helpful.

    I'm working with Perl's XML::XPath, in case it makes a difference.

    Thank you

    Nick
    --
    Serendipity: http://www.leverton.org/blosxom (last update 19th September 2008)
    "The Internet, a sort of ersatz counterfeit of real life"
    -- Janet Street-Porter, BBC2, 19th March 1996
    Nick Leverton, Dec 5, 2008
    #1

  2. What do you mean by "nearest"? Is this the geographical distance between two
    nodes? I don't see this reflected in the XML document.

    Cheers,
    Dimitre Novatchev
    Dimitre Novatchev, Dec 5, 2008
    #2

  3. In article <4939431f$0$17068$>,
    Dimitre Novatchev <> wrote:
    >What do you mean by "nearest"? Is this the geographical distance b/n two
    >nodes? I dont see this reflected in the XML document.


    No, sorry for being unclear. I mean that from the set of ip attributes
    on the axis which contains both the root and the required id attribute:

    / ... @ip ... @ip ... @ip ... @id

    I want to find the left-most one in the above diagram, nearest to the root
    (or to another selected starting node in between the root and the required @id).

    I can do this for single ids with the XPath query I posted, although I
    don't fully understand the ordering I am getting. I can't figure out
    how to make a satisfactory query which will return the set of leftmost
    @ip for all the ids in the XML document.

    Thanks for your interest; if I'm still not explaining clearly, please let
    me know. I'm quite new to XML/XPath and don't always know the correct
    way to describe what I want to do.

    Nick
    Nick Leverton, Dec 5, 2008
    #3
