XPathAPI(node, xpathStr) & XPathContext.getDTMHandleFromNode(node) slow

Discussion in 'Java' started by David Portabella, Jul 27, 2007.

  1. Hello,

    I am using xalan 2.7.0: http://xml.apache.org/xalan-j/
    As I run XPathAPI.eval(node, xpathStr) over and over again on several
    nodes, it gets slower and slower.
    This is documented in the XPathAPI documentation, which suggests
    using the low-level XPath API instead:
    http://xml.apache.org/xalan-j/apidocs/org/apache/xpath/XPathAPI.html

    I am now using the low-level XPath API as follows:
    XPathContext xpathSupport = new XPathContext();
    PrefixResolverDefault prefixResolver = new PrefixResolverDefault(document);
    XPath xpath = new XPath(xpathStr, null, prefixResolver, XPath.SELECT, null);

    and then, for each node:
    int ctxtNode = xpathSupport.getDTMHandleFromNode(contextNode);
    XObject object = xpath.execute(xpathSupport, ctxtNode, prefixResolver);

    It gets a bit better, but still, after using it over and over again on
    several nodes, it gets slower and slower.
    I think that the problem is that
    XPathContext.getDTMHandleFromNode(child) does not free memory.

    Test this simplistic example yourself:
    ++++++++++++++++++++++++++++++++++++++++++++
    import javax.xml.parsers.*;
    import org.apache.xml.utils.*;
    import org.apache.xpath.*;
    import org.w3c.dom.*;

    public class Test {
        public static void main(String[] argv) throws Exception {
            int numChilds = 100000 + 1;

            // Build <root> with numChilds <child>/<sub-child>/<sub-sub-child title="..."/> branches.
            System.out.println("Building a document with " + numChilds + " childs");
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            for (int i = 0; i < numChilds; i++) {
                Element child = doc.createElement("child");
                root.appendChild(child);
                Element subChild = doc.createElement("sub-child");
                child.appendChild(subChild);
                Element subSubChild = doc.createElement("sub-sub-child");
                subChild.appendChild(subSubChild);
                subSubChild.setAttribute("title", "title" + i);
            }

            // Compile the XPath once and reuse a single XPathContext.
            XPathContext xpathSupport = new XPathContext();
            PrefixResolverDefault prefixResolver = new PrefixResolverDefault(doc);
            XPath titleXpath = new XPath("sub-child/sub-sub-child/@title",
                    null, prefixResolver, XPath.SELECT, null);
            Runtime r = Runtime.getRuntime();

            System.out.println("Evaluating XPath for each " + numChilds + " childs");
            NodeList nodeList = root.getChildNodes();
            int size = nodeList.getLength();
            for (int i = 0; i < size; i++) {
                long start = System.currentTimeMillis();
                Element child = (Element) nodeList.item(i);
                // Only the DTM handle lookup is timed; the XPath execution is commented out.
                int ctxtNode = xpathSupport.getDTMHandleFromNode(child);
                //String title = titleXpath.execute(xpathSupport, ctxtNode, prefixResolver).toString();
                long duration = System.currentTimeMillis() - start;
                if (i < 10 || (i % (numChilds / 10)) == 0) {
                    System.out.println("child #" + i + "\t took " + duration + " ms."
                            + "\tfreeMemory: " + r.freeMemory()
                            + "\ttotalMemory: " + r.totalMemory());
                } else if (i == 10) {
                    System.out.println("printing some selected childs only from now on...");
                }
            }
        }
    }

    ++++++++++++++++++++++++++++++++++++++++++++
    Here you can see an example of the result:

    $ java Test
    Building a document with 100001 childs
    Evaluating XPath for each 100001 childs
    child #0 took 77 ms. freeMemory: 10642840 totalMemory: 45129728
    child #1 took 1 ms. freeMemory: 10583848 totalMemory: 45129728
    child #2 took 0 ms. freeMemory: 10583848 totalMemory: 45129728
    child #3 took 0 ms. freeMemory: 10583848 totalMemory: 45129728
    child #4 took 0 ms. freeMemory: 10583848 totalMemory: 45129728
    child #5 took 0 ms. freeMemory: 10583848 totalMemory: 45129728
    child #6 took 0 ms. freeMemory: 10583848 totalMemory: 45129728
    child #7 took 1 ms. freeMemory: 10583848 totalMemory: 45129728
    child #8 took 0 ms. freeMemory: 10583848 totalMemory: 45129728
    child #9 took 0 ms. freeMemory: 10583848 totalMemory: 45129728
    printing some selected childs only from now on...
    child #10000 took 3 ms. freeMemory: 10980392 totalMemory: 45129728
    child #20000 took 5 ms. freeMemory: 9976808 totalMemory: 45129728
    child #30000 took 7 ms. freeMemory: 6332656 totalMemory: 45129728
    child #40000 took 9 ms. freeMemory: 5112168 totalMemory: 45129728
    child #50000 took 12 ms. freeMemory: 1373472 totalMemory: 45129728
    child #60000 took 14 ms. freeMemory: 19851264 totalMemory: 66650112
    child #70000 took 16 ms. freeMemory: 16515832 totalMemory: 66650112
    child #80000 took 19 ms. freeMemory: 15040280 totalMemory: 66650112
    child #90000 took 21 ms. freeMemory: 7435744 totalMemory: 66650112
    child #100000 took 24 ms. freeMemory: 17416944 totalMemory: 66650112


    ++++++++++++++++++++++++++++++++++++++++++++
    Each time I call xpathSupport.getDTMHandleFromNode(child), it does not
    free the memory, and so it gets slower and slower.

    How can I solve this problem?
    Some people have suggested using the DOM4J package instead of Xalan.
    However, we already have quite a lot of software using Xalan, and
    changing the code would have some cost.
    Is it possible to solve this problem without discarding xalan?

    Regards,
    DAvid
     
    David Portabella, Jul 27, 2007
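One avenue this thread does not explore is the standard javax.xml.xpath API that ships with Java 5 and later. The sketch below compiles the same expression once and evaluates it against each child element. It is only an illustration under assumptions: the class name JaxpTitles and its helper methods are made up for this example, and nothing in this thread measures whether this route avoids the slowdown described above.

```java
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.xpath.XPathExpression;
import javax.xml.xpath.XPathExpressionException;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

public class JaxpTitles {

    /** Builds the same shape as the test document: root/child/sub-child/sub-sub-child[@title]. */
    static Document buildSample(int numChildren) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            for (int i = 0; i < numChildren; i++) {
                Element child = doc.createElement("child");
                Element subChild = doc.createElement("sub-child");
                Element subSubChild = doc.createElement("sub-sub-child");
                subSubChild.setAttribute("title", "title" + i);
                subChild.appendChild(subSubChild);
                child.appendChild(subChild);
                root.appendChild(child);
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Compiles the expression once, then evaluates it with each child as the context node. */
    static List<String> titles(Document doc) {
        try {
            XPathExpression titleXpath = XPathFactory.newInstance().newXPath()
                    .compile("sub-child/sub-sub-child/@title");
            List<String> result = new ArrayList<>();
            NodeList children = doc.getDocumentElement().getChildNodes();
            for (int i = 0; i < children.getLength(); i++) {
                // evaluate(Object) returns the string value of the selected attribute
                result.add(titleXpath.evaluate(children.item(i)));
            }
            return result;
        } catch (XPathExpressionException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(titles(buildSample(3)));
    }
}
```

To compare it with the test program above, the timing and memory printout from that loop could be wrapped around the evaluate call.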

  2. David  Portabella

    Piotr Kobzda Guest

    Re: XPathAPI(node, xpathStr) & XPathContext.getDTMHandleFromNode(node) slow

    David Portabella wrote:

    > I am now using the low-level XPath API as follows:
    > XPathContext xpathSupport = new XPathContext();
    > PrefixResolverDefault prefixResolver = new PrefixResolverDefault(document);
    > XPath xpath = new XPath(xpathStr, null, prefixResolver, XPath.SELECT, null);
    >
    > and then, for each node:
    > int ctxtNode = xpathSupport.getDTMHandleFromNode(contextNode);
    > XObject object = xpath.execute(xpathSupport, ctxtNode, prefixResolver);
    >
    > It gets a bit better, but still, after using it over and over again on
    > several nodes, it gets slower and slower.
    > I think that the problem is that
    > XPathContext.getDTMHandleFromNode(child) does not free memory.


    It seems to me that Xalan holds references to all the DTM nodes it
    creates (possibly in the XPathContext or the DTMManager instance). I'm
    not very familiar with the Xalan API or its internals, but I came to that
    conclusion after some experimenting with your example code under the
    version of Xalan embedded in Java SE (I don't know which particular
    version of Xalan each Java release embeds).

    I tried to remove those DTM references from the context as follows:

    DTMManager dtmManager = xpathSupport.getDTMManager();

    DTM dtm = dtmManager.getDTM(ctxtNode);
    dtmManager.release(dtm, true);

    But it seems that all DTMs released that way are still
    referenced somewhere (possibly in a per-document context). As a result,
    your example performs even slower with that.

    However, the above seems to be handy when each child node is processed
    separately from the original DOM document, i.e. when it is applied to a
    clone of the node obtained like this:

    int ctxtNode = xpathSupport.getDTMHandleFromNode(child.cloneNode(true));

    Without releasing the clone's DTM, the example very quickly ends
    with an OutOfMemoryError. But when both of the above changes are used, the
    XPath performs equally fast for each child's clone created in the loop.

    Of course, this solution works correctly only as long as your XPath
    expression does not reference any data from the child's parent node (or
    from any other node outside its subtree). There are possibly some other
    limitations caused by this trick which I can't think of right now. But in
    your particular example, it seems to work fast and properly.
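The parent-node limitation can be seen with plain DOM calls, without involving Xalan at all. The following is a minimal sketch (the class name DetachedCloneDemo and the tiny sample document are made up for illustration): a deep clone carries its whole subtree but has no parent and no siblings, so an expression that steps outside the subtree has nothing to navigate to.

```java
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class DetachedCloneDemo {

    /** Builds <root><child><sub-child/></child><child><sub-child/></child></root>. */
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            doc.appendChild(root);
            for (int i = 0; i < 2; i++) {
                Element child = doc.createElement("child");
                child.appendChild(doc.createElement("sub-child"));
                root.appendChild(child);
            }
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Document doc = buildSample();
        Node original = doc.getDocumentElement().getFirstChild();
        Node clone = original.cloneNode(true); // deep copy, detached from the tree

        // The original can reach its parent and its sibling...
        System.out.println(original.getParentNode().getNodeName());  // root
        System.out.println(original.getNextSibling().getNodeName()); // child

        // ...the clone cannot: both are null.
        System.out.println(clone.getParentNode());
        System.out.println(clone.getNextSibling());

        // The subtree itself was copied, so relative paths into it still have data.
        System.out.println(clone.getFirstChild().getNodeName());     // sub-child
    }
}
```

A relative path such as sub-child/sub-sub-child/@title stays inside the copied subtree, which is why the example expression in this thread is unaffected.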


    For those who'd like to check that with the version of Xalan embedded
    in Sun's Java 5 and 6, it is enough to replace the following two imports:

    > import org.apache.xpath.*;
    > import org.apache.xml.utils.*;


    with:

    import com.sun.org.apache.xml.internal.utils.*;
    import com.sun.org.apache.xpath.internal.*;
    import com.sun.org.apache.xml.internal.dtm.*;


    > Is it possible to solve this problem without discarding xalan?


    Hope so. Let us know if the above solves your problem.


    piotr
     
    Piotr Kobzda, Jul 30, 2007

  3. On Jul 30, 4:04 pm, Piotr Kobzda wrote:
    > [...]
    > Hope so. Let us know if the above solves your problem.
    >
    > piotr

    Hello Piotr,

    Thanks for your help; sorry for the long delay.

    Your solution works for me also, thanks a lot!

    I'm now looking at the implications.
    For instance, it may be difficult to use the trick for selecting some
    nodes which need to be modified.
    I'll let you know.
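The difficulty with modifications is that anything selected inside a clone is a copy, so changing it leaves the original document untouched. One possible way around this, sketched below under assumptions, is to record the position of the selected node inside the clone and walk the same child indexes in the original subtree. The class CloneMapping and the helper correspondingNode are hypothetical names, not part of Xalan or the DOM API, and the helper assumes the node really lies inside the clone's subtree and is not an attribute (for an attribute, map its owner element instead).

```java
import java.util.ArrayDeque;
import java.util.Deque;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;

public class CloneMapping {

    /** Builds <root><child><sub-child><sub-sub-child title="title0"/></sub-child></child></root>. */
    static Document buildSample() {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder().newDocument();
            Element root = doc.createElement("root");
            Element child = doc.createElement("child");
            Element subChild = doc.createElement("sub-child");
            Element subSubChild = doc.createElement("sub-sub-child");
            subSubChild.setAttribute("title", "title0");
            subChild.appendChild(subSubChild);
            child.appendChild(subChild);
            root.appendChild(child);
            doc.appendChild(root);
            return doc;
        } catch (ParserConfigurationException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Zero-based index of a node among its parent's children. */
    static int indexInParent(Node node) {
        int index = 0;
        for (Node n = node.getPreviousSibling(); n != null; n = n.getPreviousSibling()) {
            index++;
        }
        return index;
    }

    /**
     * Hypothetical helper: returns the node under originalRoot that sits at the same
     * position as foundInClone does under cloneRoot.
     */
    static Node correspondingNode(Node cloneRoot, Node originalRoot, Node foundInClone) {
        Deque<Integer> path = new ArrayDeque<>();
        for (Node n = foundInClone; n != cloneRoot; n = n.getParentNode()) {
            path.push(indexInParent(n)); // the index nearest the root ends up first
        }
        Node result = originalRoot;
        for (int index : path) {
            result = result.getChildNodes().item(index);
        }
        return result;
    }

    public static void main(String[] args) {
        Document doc = buildSample();
        Element child = (Element) doc.getDocumentElement().getFirstChild();
        Node clone = child.cloneNode(true);

        // Stand-in for a node selected by XPath inside the clone.
        Element inClone = (Element) clone.getFirstChild().getFirstChild();
        inClone.setAttribute("title", "changed");

        Element inOriginal = (Element) correspondingNode(clone, child, inClone);
        System.out.println(inOriginal.getAttribute("title")); // still title0

        inOriginal.setAttribute("title", "changed");          // now the real document changes
        System.out.println(inOriginal.getAttribute("title"));
    }
}
```

This only holds while the clone and the original have the same structure, so the mapping should be done before either side is modified structurally.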

    Many thanks,
    DAvid
     
    David Portabella, Aug 27, 2007