Tree splitting/merging

Discussion in 'XML' started by William Ahern, Nov 4, 2003.

  1. I'm looking for resources on splitting and merging XML trees. Specifically,
    on methods to pare large XML documents into smaller documents which can be
    merged later.

    Off the top of my head, I can envision unions of node sets and unions of
    node text. But I know there's much more to the subject than that: if not
    more alternatives, then at least greater technical detail.

    TIA,

    Bill
     
    William Ahern, Nov 4, 2003
    #1

  2. > I'm looking for resources on splitting and merging XML trees. Specifically,
    > on methods to pare large XML documents into smaller documents which can be
    > merged later.


    I have something for a problem (perhaps) close to yours: I need to perform
    an XSLT transformation on a very large document that doesn't fit in memory.
    I use a SAX parser with three XMLFilters (concretely, subclasses of
    org.xml.sax.helpers.XMLFilterImpl). The first filter "splits" the stream
    (i.e. it fires "start document" and "end document" events) when it
    encounters a specific startElement and endElement. So the next filter
    receives several (smaller) documents, one at a time. This second filter is a
    TransformerHandler, which performs the transformation. It then passes the
    events to a last filter, a "merger", which discards the startDocument and
    endDocument events except the very first and the very last ones.
    I was inspired by a Perl module by Barrie Slaymaker.
    (Incidentally, I noticed that there is nothing for Java as convenient as
    the XML::SAX::Pipeline Perl module.)
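
A minimal sketch of the splitter and merger stages described above, not the poster's actual code. The class names (SplitFilter, MergeFilter, SplitMergeDemo), the chunk element name "rec", and the wrapper element "merged" are all hypothetical. Two simplifying assumptions: everything outside the chunk elements is dropped, and the merger adds its own wrapper element so the merged stream has a single root. The TransformerHandler stage in the middle is left out.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.AttributesImpl;
import org.xml.sax.helpers.DefaultHandler;
import org.xml.sax.helpers.XMLFilterImpl;

/** Emits each element named splitName as its own SAX document; drops elements and text outside. */
class SplitFilter extends XMLFilterImpl {
    private final String splitName;
    private int depth = 0; // element depth inside the current chunk, 0 = outside any chunk

    SplitFilter(XMLReader parent, String splitName) {
        super(parent);
        this.splitName = splitName;
    }

    @Override public void startDocument() { /* replaced by one event per chunk */ }
    @Override public void endDocument() { /* replaced by one event per chunk */ }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (depth == 0 && !qName.equals(splitName)) return; // outside any chunk
        if (depth == 0) super.startDocument();               // a new chunk begins
        depth++;
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (depth == 0) return;                               // outside any chunk
        super.endElement(uri, local, qName);
        if (--depth == 0) super.endDocument();                // the chunk is complete
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (depth > 0) super.characters(ch, start, length);
    }
}

/** Forwards only the first startDocument and one final endDocument, adding a wrapper root. */
class MergeFilter extends XMLFilterImpl {
    private final String wrapperName; // assumption: a synthetic root for the merged output
    private boolean started = false;

    MergeFilter(XMLReader parent, String wrapperName) {
        super(parent);
        this.wrapperName = wrapperName;
    }

    @Override
    public void startDocument() throws SAXException {
        if (started) return;                                  // discard all but the first
        started = true;
        super.startDocument();
        super.startElement("", wrapperName, wrapperName, new AttributesImpl());
    }

    @Override public void endDocument() { /* deferred until parse() returns */ }

    @Override
    public void parse(InputSource input) throws SAXException, IOException {
        started = false;
        super.parse(input);
        if (started) {                                        // close the merged document once
            super.endElement("", wrapperName, wrapperName);
            super.endDocument();
        }
    }
}

/** Hypothetical driver: runs the chain and records the events the final handler sees. */
class SplitMergeDemo {
    static String run(String xml) {
        StringBuilder log = new StringBuilder();
        DefaultHandler recorder = new DefaultHandler() {
            @Override public void startDocument() { log.append("[doc"); }
            @Override public void endDocument() { log.append("doc]"); }
            @Override public void startElement(String u, String l, String q, Attributes a) {
                log.append('<').append(q).append('>');
            }
            @Override public void endElement(String u, String l, String q) {
                log.append("</").append(q).append('>');
            }
            @Override public void characters(char[] ch, int s, int n) { log.append(ch, s, n); }
        };
        try {
            XMLReader reader = SAXParserFactory.newInstance().newSAXParser().getXMLReader();
            MergeFilter merger = new MergeFilter(new SplitFilter(reader, "rec"), "merged");
            merger.setContentHandler(recorder);
            merger.parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
        return log.toString();
    }
}
```

Because only one chunk's events are in flight at a time, a per-chunk transformation placed between the two filters would only ever need to hold one chunk in memory, which is the point of the approach.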

    In fact, I came to this list with a question close to this one; it's in
    a new thread.

    > Off the top of my head, I can envision unions of node sets and unions of
    > node text. But I know there's much more to the subject than that: if not
    > more alternatives, then at least greater technical detail.


    What level of well-formedness does your merging problem involve? That is,
    do you only want to add nodes to existing nodes in DOM fashion (for which
    the standard methods of the Node interface are enough), or do you want to
    insert mixed content while checking well-formedness, tag nesting, etc.?
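
For the simpler DOM case, a merge can be sketched with standard org.w3c.dom calls alone (Document.importNode and Node.appendChild). The class DomMerge and its two methods are hypothetical names, and the sketch assumes both documents fit in memory and that "merging" means appending the children of one root element to the other.

```java
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

class DomMerge {
    /** Parses an XML string into a DOM Document (hypothetical helper for the example). */
    static Document parse(String xml) {
        try {
            return DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    /** Appends a deep copy of every child of part's root element to target's root element. */
    static void mergeInto(Document target, Document part) {
        Node targetRoot = target.getDocumentElement();
        for (Node child = part.getDocumentElement().getFirstChild();
                child != null;
                child = child.getNextSibling()) {
            // importNode copies the node into target's document; part is left unchanged
            targetRoot.appendChild(target.importNode(child, true));
        }
    }
}
```

A node belongs to the document that created it, so it has to be imported (or adopted) before it can be appended to another document; appending it directly would raise a DOMException.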

     
    sylvain.loiseau, Nov 4, 2003
    #2

  3. sylvain.loiseau wrote:
    >> I'm looking for resources on splitting and merging XML trees. Specifically,
    >> on methods to pare large XML documents into smaller documents which can be
    >> merged later.
    >
    > I have something for a problem (perhaps) close to yours: I need to perform
    > an XSLT transformation on a very large document that doesn't fit in memory.
    > I use a SAX parser with three XMLFilters (concretely, subclasses of
    > org.xml.sax.helpers.XMLFilterImpl). The first filter "splits" the stream
    > (i.e. it fires "start document" and "end document" events) when it
    > encounters a specific startElement and endElement. So the next filter
    > receives several (smaller) documents, one at a time. This second filter is a
    > TransformerHandler, which performs the transformation. It then passes the
    > events to a last filter, a "merger", which discards the startDocument and
    > endDocument events except the very first and the very last ones.
    > I was inspired by a Perl module by Barrie Slaymaker.
    > (Incidentally, I noticed that there is nothing for Java as convenient as
    > the XML::SAX::Pipeline Perl module.)

    Right after posting I tripped over the XPipe project (http://xpipe.sf.net/).
    XPipe associates this with the scatter/gather pattern, and they seem to have
    put a lot of thought into the issues. Specifically, they elaborate on the
    notion of "fulcra", or the node depth, I suppose you could call it, at which
    a document can be split. You've probably already thought this through, but
    maybe you can find more info on that site. They have code and list
    discussions you can wade through.

    - Bill
     
    William Ahern, Nov 4, 2003
    #3
  4. Thanks, it looks very interesting.

    Sylvain

     
    sylvain.loiseau, Nov 4, 2003
    #4