C++ parsing with a mix of SAX and DOM for large files

Discussion in 'XML' started by alex masselot, Jan 10, 2007.

  1. Hello

    I'm not familiar with Xerces in C++.

    Currently, we parse XML files with Perl (typically XML::Twig) and Java
    (dom4j).
    Both APIs offer a very comfortable way to mix SAX and DOM, by
    attaching handlers to certain element paths.

    The XML file is parsed, and once a defined path is reached, the
    element is handed to a handler subroutine.
    The whole subtree can then be explored with DOM-like calls (XPath etc.) as an
    in-memory element.
    Then the tree can be purged, which releases the memory.

    It's quite a convenient merge that gets the best of both worlds.

    Is that possible with Xerces in C++?
    I cannot find any simple answer in the Apache docs.

    thanks
    Alex
     
    alex masselot, Jan 10, 2007
    #1

  2. alex masselot wrote:
    > Hello
    >
    > I'm not familiar with Xerces in C++.
    >
    > Currently, we parse XML files with Perl (typically XML::Twig) and Java
    > (dom4j).
    > Both APIs offer a very comfortable way to mix SAX and DOM, by
    > attaching handlers to certain element paths.
    >
    > The XML file is parsed, and once a defined path is reached, the
    > element is handed to a handler subroutine.
    > The whole subtree can then be explored with DOM-like calls (XPath etc.) as an
    > in-memory element.
    > Then the tree can be purged, which releases the memory.


    It's a job for Active Tags and the XML Control Language (XCL)!

    XCL pipelines work the same way in RefleX (the engine);
    however, you can also use XPath directly on SAX streams:
    you can define XPath patterns for filtering (as in XSLT), except that
    large files are supported as well.

    Additionally, you can "cast" a tree or a subtree from DOM to SAX or SAX
    to DOM at will.

    Here are some examples:
    http://reflex.gforge.inria.fr/saxPatterns.html#N802B53
    http://reflex.gforge.inria.fr/tutorial.html#N801C30

    And the slides that were shown at <XML2006> in Boston:
    http://disc.inria.fr/perso/philippe.poulard/xml/active-tags.pdf (pages 7
    and 8)

    > It's quite a convenient merge that gets the best of both worlds.

    That is my opinion too; you can achieve very complex things with
    very few active tags.

    > Is that possible with Xerces in C++?

    Sure! As you explain yourself, it's not a question of language.

    > I cannot find any simple answer in the Apache docs.
    >
    > thanks
    > Alex



    --
    Cordialement,

    ///
    (. .)
    --------ooO--(_)--Ooo--------
    | Philippe Poulard |
    -----------------------------
    http://reflex.gforge.inria.fr/
    Have the RefleX !
     
    Philippe Poulard, Jan 10, 2007
    #2

  3. The traditional technique for mixing SAX and DOM is to use a SAX parser
    together with a SAX-driven DOM-tree builder, and to write a SAX handler
    that filters the events appropriately before passing them to the builder.

    Once you've got your filtered DOM, you can of course run a compatible
    XPath implementation against it. DOM Level 3 introduced XPath support,
    though not all DOMs implement that optional feature and I'm not sure
    offhand whether Xerces-C's DOM includes it or not. If not, I presume
    Xalan-C has an XPath API, though I'm not sure how efficiently it
    interoperates with the Xerces-C DOM (Xalan prefers to manipulate its own
    data model).

    So the answer is: Yes, it's possible, though you may need to write a bit
    of code to glue it all together.
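
As a rough illustration of that glue code, here is a minimal C++ sketch of the filtering idea. It deliberately does not use the Xerces-C API: `Node`, `EventHandler`, `SubtreeFilter` and `collectTitles` are all hypothetical names invented for this example, and the "parser" is simulated by calling the callbacks by hand. The filter ignores events outside a chosen element, builds a small tree for each matching element, passes it to a callback, and frees it again. With a real SAX parser, the same logic would presumably live in the handler class the parser calls into.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical in-memory element; stands in for a real DOM node.
struct Node {
    std::string name;
    std::string text;
    std::vector<std::unique_ptr<Node>> children;
};

// Hypothetical SAX-style callback interface (an assumption, not the Xerces-C API).
struct EventHandler {
    virtual ~EventHandler() = default;
    virtual void startElement(const std::string& name) = 0;
    virtual void characters(const std::string& chars) = 0;
    virtual void endElement(const std::string& name) = 0;
};

// Builds a subtree only while inside the target element, hands each
// finished subtree to the callback, then purges it.
class SubtreeFilter : public EventHandler {
public:
    using Callback = std::function<void(const Node&)>;

    SubtreeFilter(std::string target, Callback callback)
        : target_(std::move(target)), callback_(std::move(callback)) {}

    void startElement(const std::string& name) override {
        if (open_.empty() && name != target_) return;  // outside a target: skip
        auto node = std::make_unique<Node>();
        node->name = name;
        Node* raw = node.get();
        if (open_.empty()) {
            root_ = std::move(node);                   // start of a new subtree
        } else {
            open_.back()->children.push_back(std::move(node));
        }
        open_.push_back(raw);
    }

    void characters(const std::string& chars) override {
        if (!open_.empty()) open_.back()->text += chars;
    }

    void endElement(const std::string& name) override {
        if (open_.empty()) return;                     // event was filtered out
        assert(open_.back()->name == name);            // events must be balanced
        open_.pop_back();
        if (open_.empty()) {
            callback_(*root_);                         // DOM-like access happens here
            root_.reset();                             // purge: release the subtree
        }
    }

private:
    std::string target_;
    Callback callback_;
    std::unique_ptr<Node> root_;
    std::vector<Node*> open_;                          // stack of currently open nodes
};

// Simulates parsing
// <library><book><title>A</title></book><junk>x</junk><book><title>B</title></book></library>
// and collects the text of every <title> found inside a <book>.
std::vector<std::string> collectTitles() {
    std::vector<std::string> titles;
    SubtreeFilter filter("book", [&titles](const Node& book) {
        for (const auto& child : book.children) {
            if (child->name == "title") titles.push_back(child->text);
        }
    });
    EventHandler& h = filter;
    h.startElement("library");
    h.startElement("book");
    h.startElement("title"); h.characters("A"); h.endElement("title");
    h.endElement("book");
    h.startElement("junk"); h.characters("x"); h.endElement("junk");
    h.startElement("book");
    h.startElement("title"); h.characters("B"); h.endElement("title");
    h.endElement("book");
    h.endElement("library");
    return titles;
}
```

Only one `book` subtree is held in memory at a time, which is the point of the technique for large files. XPath evaluation over the subtree is left out here; the callback simply walks the children directly.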
     
    Joseph Kesselman, Jan 10, 2007
    #3
