"walk over," and XPath-based substitutions?

Discussion in 'XML' started by Ivan Shmakov, Apr 6, 2013.

  1. Ivan Shmakov

    Ivan Shmakov Guest

    [Cross-posting to yet omitting it from
    Followup-To:, for I'm primarily interested in Perl-based
    solutions.]

    Is there an easy way to invoke a particular piece of code for each
    XML node that satisfies one of a given list of XPath expressions?

    A simple-minded approach (based on XML::LibXML) could be like:

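    require XML::LibXML;

    ## $context is assumed to hold an already parsed document (or node);
    ## foo_bar () and baz () are defined elsewhere
    my %xpath_sub = (
        q {//node ()[@foo = "bar"]} => \&foo_bar,
        q {//node ()[@baz = "qux"]} => sub { baz ("qux", @_); }
    );

    ## one full findnodes () pass over the tree per expression
    foreach my $xpath (keys (%xpath_sub)) {
        my $sub
            = $xpath_sub{$xpath};
        foreach my $node ($context->findnodes ($xpath)) {
            $sub->($node);
        }
    }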

    However, AIUI, the code above implies that the XML tree is to be
    traversed multiple times. Which could probably be avoided by
    traversing the tree explicitly, as in:

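    sub traverse {
        my ($node, $xsubs) = @_;
        foreach my $xpath (keys (%$xsubs)) {
            ## NB: $xpath has to be relative to $node here (say,
            ## self::node ()[@foo = "bar"]); an expression starting
            ## with // is evaluated from the document root whatever
            ## the context node is
            next
                unless ($node->find ($xpath));
            ## FIXME: check if the result is a boolean?
            $xsubs->{$xpath}->($node);
            ## FIXME: there, one may wish for a recursion; or not
        }
        ## recurse over the children
        foreach my $child ($node->childNodes ()) {
            traverse ($child, $xsubs);
        }
        ## .
    }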

    Still, it may repeatedly traverse the children of $node while
    computing ->find () for each of the XPath expressions. (Unlike
    the way an "optimized," or "compiled," regular expression would
    be handled, IIUC.)
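
A minimal sketch of one way to cut the repeated work, not a definitive answer: parse each expression once with XML::LibXML::XPathExpression, write it relative to the visited node (self::), and walk the tree a single time. The sample document, the @seen array and the two anonymous handlers are made up for illustration; they stand in for the real input and for foo_bar () and baz ().

```perl
use strict;
use warnings;
use XML::LibXML;

## hypothetical sample input; any parsed document would do
my $doc = XML::LibXML->load_xml (string => <<'XML');
<root>
  <a foo="bar"><b baz="qux"/></a>
  <c foo="other"/>
</root>
XML

## records "handler:element" strings, for illustration only
my @seen;

## each test is relative to the node being visited (self::) and is
## compiled once up front instead of being re-parsed on every call
my @rules = map {
    [ XML::LibXML::XPathExpression->new ($_->[0]), $_->[1] ]
} ([ q {self::*[@foo = "bar"]},
     sub { push (@seen, "foo_bar:" . $_[0]->nodeName ()); } ],
   [ q {self::*[@baz = "qux"]},
     sub { push (@seen, "baz_qux:" . $_[0]->nodeName ()); } ]);

sub traverse {
    my ($node, $rules) = @_;
    foreach my $rule (@$rules) {
        my ($expr, $sub) = @$rule;
        ## findnodes () takes a precompiled expression as well as a
        ## string; with self:: it yields either $node itself or nothing
        my @hit = $node->findnodes ($expr);
        $sub->($node)
            if (@hit);
    }
    ## recurse over the children
    foreach my $child ($node->childNodes ()) {
        traverse ($child, $rules);
    }
}

traverse ($doc->documentElement (), \@rules);
print (join ("\n", @seen), "\n");
```

Since a self:: test only looks at the node in hand, no expression re-scans the subtree below it; the price is that conditions which depend on descendants have to be spelled out explicitly in the predicate.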

    The question is: does LibXML (or some other library) provide a
    way to make such a task both simpler to code and more efficient
    on execution?

    ... Or do I "optimize" all the XPath expressions themselves into
    a single one somehow?
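
For conditions as simple as the ones above, merging could look like the following sketch: the per-attribute expressions are joined with the XPath union operator (|), the tree is queried once, and each returned node is then dispatched with a cheap getAttribute () check. The %attr_sub layout, the sample document and the handlers are assumptions made for the example, and the sprintf () step assumes that the attribute values contain no double quotes.

```perl
use strict;
use warnings;
use XML::LibXML;

## hypothetical sample input
my $doc = XML::LibXML->load_xml (string => <<'XML');
<root>
  <a foo="bar"><b baz="qux"/></a>
  <c foo="other" baz="qux"/>
</root>
XML

## records "handler:element" strings, for illustration only
my @seen;

## attribute name => [ expected value, handler ]
my %attr_sub = (
    "foo" => [ "bar", sub { push (@seen, "foo_bar:" . $_[0]->nodeName ()); } ],
    "baz" => [ "qux", sub { push (@seen, "baz_qux:" . $_[0]->nodeName ()); } ],
);

## builds:  //*[@baz = "qux"] | //*[@foo = "bar"]
my $union = join (" | ",
    map { sprintf (q {//*[@%s = "%s"]}, $_, $attr_sub{$_}->[0]) }
        sort (keys (%attr_sub)));

## a single query; a union never returns the same node twice
foreach my $node ($doc->findnodes ($union)) {
    ## find out which of the conditions the node satisfied
    foreach my $attr (sort (keys (%attr_sub))) {
        my ($value, $sub) = @{$attr_sub{$attr}};
        my $got = $node->getAttribute ($attr);
        $sub->($node)
            if (defined ($got) && $got eq $value);
    }
}

print (join ("\n", @seen), "\n");
```

The union itself does not say which branch matched, hence the second, per-node check; whether the library evaluates the merged expression in fewer passes than the separate ones is an implementation matter this sketch does not settle.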

    TIA.
     
    Ivan Shmakov, Apr 6, 2013
    #1

  2. Ivan Shmakov

    Ivan Shmakov Guest

    [Given that there was little Perl-specific matter in this
    subthread, cross-posting back to and setting
    Followup-To: there.]
    I see little advantage in using XSLT for my task (and I'm not
    familiar with XQuery), as XML is not the only data source I need
    to interface with. (E.g., I'm also accessing an SQLite database.)
    The usual benefits of XSLT -- the existence of browser-based
    implementations and its "Lisp-like" nature (in that it uses the
    same syntax for both the code and data) -- do not seem to apply.

    Indeed, thanks for the clarification!

    Is it Apache Xerces [1]? It doesn't seem to include either XSLT
    or XQuery.

    [1] https://xerces.apache.org/

    Which is?

    ACK, thanks. My XMLs are rather small, so I'm more interested
    in reducing computational load than memory usage. But even that
    is not a priority right now. Rather, I'm looking for ways
    to avoid a total code rewrite at some later point.

    I guess I should check XML::Twig. Or, given that the conditions
    that I currently need to consider are rather simple, a
    straightforward ->childNodes ()-based, no-XPath implementation
    may be possible.
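
A minimal sketch of what such a no-XPath walk could look like, assuming the conditions stay of the "attribute equals value" kind: the handlers sit in a two-level hash keyed by attribute name and value, and each element is checked with getAttribute () during a single depth-first pass. The %dispatch table, the sample document and the handlers are hypothetical.

```perl
use strict;
use warnings;
use XML::LibXML;    # also exports XML_ELEMENT_NODE by default

## hypothetical sample input
my $doc = XML::LibXML->load_xml (string => <<'XML');
<root>
  <a foo="bar"><b baz="qux"/></a>
  <c foo="other" baz="qux"/>
</root>
XML

## records "handler:element" strings, for illustration only
my @seen;

## attribute name => { value => handler }
my %dispatch = (
    "foo" => { "bar" => sub { push (@seen, "foo_bar:" . $_[0]->nodeName ()); } },
    "baz" => { "qux" => sub { push (@seen, "baz_qux:" . $_[0]->nodeName ()); } },
);

sub walk {
    my ($node, $dispatch) = @_;
    ## only elements carry attributes; skip text, comments, etc.
    if ($node->nodeType () == XML_ELEMENT_NODE) {
        foreach my $attr (sort (keys (%$dispatch))) {
            my $value = $node->getAttribute ($attr);
            next
                unless (defined ($value));
            ## a plain hash lookup replaces the XPath predicate
            my $sub = $dispatch->{$attr}->{$value};
            $sub->($node)
                if ($sub);
        }
    }
    ## recurse over the children
    foreach my $child ($node->childNodes ()) {
        walk ($child, $dispatch);
    }
}

walk ($doc->documentElement (), \%dispatch);
print (join ("\n", @seen), "\n");
```

Every node is visited exactly once and each check is a hash lookup, but the table can only express equality tests on attributes; anything richer would bring XPath (or something like it) back in.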

    [...]
    I believe that I may be under a jurisdiction which has no notion
    of software patents. (Subject to the reading of TRIPS, though.)
     
    Ivan Shmakov, Apr 7, 2013
    #2

  3. Joe Kesselman

    That's a valid point.

    Those aren't. The benefit of XSLT and XQuery is that they are query
    languages specialized for constructing new documents from XML input, and
    for XSLT in particular that it's a nonprocedural language for the
    purpose. This makes writing and maintaining the transformations easier,
    and may permit optimizations under the covers that would otherwise cost
    you a lot of coding effort.

    Same reasons you use SQL rather than hand-coding your own database.

    Not always the right answer, by any means. And "may" is indeed a valid
    caveat; that's a quality-of-implementation issue, as is true any time
    you use software provided by someone else (including compiler libraries)
    rather than coding it yourself. But don't sell these short by assuming
    that they are only for browsers, and don't get too hung up on the fact
    that XSLT happens to be expressed in XML. (XQuery isn't, and the two are
    in many ways isomorphic.)

    Though in fact the ability to use XML tools -- including XSLT -- to
    manipulate XSLT itself is sometimes surprisingly useful.

    Xerces interacts with Xalan (the Apache XPath/XSLT code), but I *think*
    Xerces also had an implementation of a small subset of XPath that could
    operate in streaming mode.

    Not available independently, alas. XL-TXE 1.0 ships with IBM's JRE,
    which in turn ships only with IBM products. XL-TXE 2.0 (which adds XPath
    2.0, XSLT 2.0, and XQuery support) currently ships only as the IBM
    WebSphere Application Server's "XML feature".

    (IBM had been actively contributing to Xerces and Xalan -- in fact, we'd
    been pretty much carrying those projects -- but had to reassign that
    resource. XL-TXE is a major reimplementation.)

    Not familiar with; can't advise.

    --
    Joe Kesselman,
    http://www.love-song-productions.com/people/keshlam/index.html

    {} ASCII Ribbon Campaign | "may'ron DaroQbe'chugh vaj bIrIQbej" --
    /\ Stamp out HTML mail! | "Put down the squeezebox & nobody gets hurt."
     
    Joe Kesselman, Apr 7, 2013
    #3