Problem: XSLT on a large XML using Java results in OutOfMemory error

Discussion in 'Java' started by Lenny Wintfeld, May 17, 2006.

  1. Hi

    I'm attempting additions/changes to a Java program that (among other
    things) uses XSLT to transform a large (96 MB) XML file. It runs fine on
    small XML files but generates OutOfMemory exceptions with large XML
    files. I tried a simple punt of -Xmx512m but that didn't work. In the
    future, the input XML file may become considerably bigger than 96 MB, so
    even if it did work, it would probably just be putting off the
    inevitable to some later date.
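
One thing worth ruling out is whether the heap setting actually reached the JVM that runs the transformation (for example, when the program is launched from a script or an IDE). The following is a minimal sketch, not part of the original program; the class name `HeapCheck` is made up for illustration.

```java
/** Hypothetical helper: reports the maximum heap the running JVM will try to use. */
class HeapCheck {
    static long maxHeapMegabytes() {
        // Runtime.maxMemory() returns the limit in bytes (available since Java 1.4)
        return Runtime.getRuntime().maxMemory() / (1024L * 1024L);
    }

    public static void main(String[] args) {
        System.out.println("Max heap: " + maxHeapMegabytes() + " MB");
    }
}
```

If the reported value is far below what was requested, the option is probably not being passed to the process that performs the transformation.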

    I'm using JavaSE 1.4.2_11 and the XSL/XML libraries that come with it.
    The conversion is from and to an xml file. The code I inherited looks a
    lot like most of the example code you can find on the net for doing an
    XSLT transformation. The relevant part is:

    TransformerFactory tf = TransformerFactory.newInstance();
    Transformer transformer = tf.newTransformer(xsltSource);
    transformer.transform(new StreamSource(new StringReader(x)), xsltDest);

    where xsltSource is XSLT in the form of a string, generated by code
    immediately above the snip shown, and the "x" is the input xml to be
    transformed.

    Things I tried:

    1. I modified the above code to use a file instead of a String as the
    XML to be transformed and a file for the XSLT that specifies the
    transformation. It works fine with small XML input files but not with
    large ones. I assume this code is using the DOM parser, and there is
    simply not enough room in memory to house the input XML file.

    2. Based on some old (years old) newsgroup posts I found, I tried using
    a SAX equivalent of the above code, assuming that SAX takes in, parses
    and transforms the input XML file either piecemeal (maybe element by
    element?) or that SAX uses the complete virtual memory of the computer.
    But this code also results in successful runs on small input XML files
    and OutOfMemory errors on large ones. Here is a snip of the SAX code
    (adapted from a chapter of Burke's "XSLT and Java" at the O'Reilly
    website):

    FileInputStream brXSLT = new FileInputStream(
        "C:/Documents and Settings/Lenny/Desktop/OCCxsl.xsl");

    // Set up the transformer
    TransformerFactory transFact = TransformerFactory.newInstance();
    SAXTransformerFactory saxTransFact = (SAXTransformerFactory) transFact;
    Source xsltSource = new StreamSource(brXSLT);
    TransformerHandler transHand =
        saxTransFact.newTransformerHandler(xsltSource);

    // Set up input source
    InputSource inxml = new InputSource(inXML);
    SAXSource saxSource = new SAXSource(inxml);

    // Set the destination for the XSLT transformation
    transHand.setResult(new StreamResult(outXML));

    // Create the XMLReader
    String parserClass = "org.apache.crimson.parser.XMLReaderImpl";
    XMLReader reader = XMLReaderFactory.createXMLReader(parserClass);

    // Attach the XSLT processor to the XMLReader, then parse the input
    // file to an output file
    reader.setContentHandler(transHand);
    reader.parse(inxml);
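
For comparison, attempt 1 above (file in, file out) can be sketched as follows. This is an illustration rather than the inherited code; the class name `FileTransform` and its parameters are hypothetical. Reading from a file avoids holding the whole input as a Java String, but an XSLT 1.0 processor typically still builds its own in-memory tree of the entire input, whether it is fed from a stream or from SAX events. That would be consistent with both attempts failing on large files.

```java
import java.io.File;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical file-to-file variant of the basic JAXP transformation. */
class FileTransform {
    static void transform(File xslt, File in, File out) throws TransformerException {
        TransformerFactory tf = TransformerFactory.newInstance();
        // Compile the stylesheet straight from the file
        Transformer transformer = tf.newTransformer(new StreamSource(xslt));
        // The processor reads the input file and writes the result file itself
        transformer.transform(new StreamSource(in), new StreamResult(out));
    }

    public static void main(String[] args) throws TransformerException {
        // Usage: java FileTransform stylesheet.xsl input.xml output.xml
        transform(new File(args[0]), new File(args[1]), new File(args[2]));
    }
}
```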


    I'm considering making a custom parser of the input XML file which
    basically identifies elements of the input XML file and treats each
    element as if it were a complete document, e.g. send the content handler

    ch.startDocument()
    ch.startElement(..) // pass through the original element
    ch.characters(..)   // "
    ch.endElement(..)   // "
    ch.endDocument()

    for each element in the input XML file.
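
The per-element idea described above can be sketched roughly as follows. This is a hedged illustration, not a tested solution for the real data. It assumes each "record" is a direct child of the root element, that the stylesheet's templates only need data from within one record, and that the input uses no namespaces (prefix mappings are not forwarded). The class name `RecordSplitter` and the `results` wrapper element are hypothetical.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Source;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.XMLReader;
import org.xml.sax.helpers.DefaultHandler;

/** Replays each child of the root element into the stylesheet as its own small document. */
class RecordSplitter extends DefaultHandler {
    private final SAXTransformerFactory factory;
    private final Templates templates;   // stylesheet compiled once, reused per record
    private final Writer out;
    private int depth = 0;
    private TransformerHandler current;  // non-null only while inside a record

    RecordSplitter(SAXTransformerFactory factory, Templates templates, Writer out) {
        this.factory = factory;
        this.templates = templates;
        this.out = out;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        depth++;
        if (depth == 2) { // direct child of the root: begin a fresh mini-document
            try {
                current = factory.newTransformerHandler(templates);
            } catch (TransformerConfigurationException e) {
                throw new SAXException(e);
            }
            current.getTransformer().setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
            current.setResult(new StreamResult(out));
            current.startDocument();
        }
        if (current != null) current.startElement(uri, local, qName, atts);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (current != null) current.characters(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (current != null) current.endElement(uri, local, qName);
        if (depth == 2) {
            current.endDocument(); // this record's mini-document is complete
            current = null;        // let the record's tree be garbage collected
        }
        depth--;
    }

    static void transformPerRecord(InputSource xml, Source xslt, Writer out) throws Exception {
        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
        Templates templates = stf.newTemplates(xslt);
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true); // XSLT needs namespace-aware SAX events
        XMLReader reader = spf.newSAXParser().getXMLReader();
        reader.setContentHandler(new RecordSplitter(stf, templates, out));
        out.write("<results>");      // hypothetical wrapper around the per-record output
        reader.parse(xml);
        out.write("</results>");
        out.flush();
    }

    public static void main(String[] args) throws Exception {
        String xslt = "<xsl:stylesheet version='1.0' "
            + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='item'><out><xsl:value-of select='.'/></out></xsl:template>"
            + "</xsl:stylesheet>";
        StringWriter sw = new StringWriter();
        transformPerRecord(new InputSource(new StringReader("<root><item>a</item></root>")),
                           new StreamSource(new StringReader(xslt)), sw);
        System.out.println(sw);
    }
}
```

With this arrangement only one record's tree should be live at a time, so memory use would depend on the largest record rather than on the whole file. The trade-off is that templates cannot look across records (for example, sorting or counting over the entire input).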

    But being a newbie to XSLT, I don't know if this is worth pursuing, or
    even if it would work; I'm hoping there are simpler, more straightforward
    ways of accomplishing the same thing at a higher level. It does seem
    pretty clumsy, even if it would work.

    I found a reply on the web to someone who had a similar problem, to the
    effect that a "SAX pipeline" should be used. But there was no further
    elaboration, and so far I haven't figured out what a SAX pipeline is or
    how it would help.
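
For context, "SAX pipeline" generally refers to chaining SAX stages: a parser feeds zero or more filters, which feed a final handler such as a TransformerHandler. The sketch below shows one such chain under stated assumptions. The element name passed as `dropName` and the class name `DropElementFilter` are hypothetical, and the input is assumed to use no namespaces. A pipeline by itself is unlikely to reduce the memory used by the XSLT stage; any saving would come from an upstream filter that prunes or splits the input before the stylesheet sees it.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import javax.xml.parsers.SAXParserFactory;
import javax.xml.transform.Source;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.sax.SAXTransformerFactory;
import javax.xml.transform.sax.TransformerHandler;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.XMLFilterImpl;

/** Pipeline stage that drops every element with a given name, including its content. */
class DropElementFilter extends XMLFilterImpl {
    private final String dropName;
    private int skipDepth = 0; // > 0 while inside a dropped element

    DropElementFilter(String dropName) {
        this.dropName = dropName;
    }

    @Override
    public void startElement(String uri, String local, String qName, Attributes atts)
            throws SAXException {
        if (skipDepth > 0 || local.equals(dropName)) {
            skipDepth++;   // swallow this element and anything nested in it
            return;
        }
        super.startElement(uri, local, qName, atts);
    }

    @Override
    public void characters(char[] ch, int start, int length) throws SAXException {
        if (skipDepth == 0) super.characters(ch, start, length);
    }

    @Override
    public void endElement(String uri, String local, String qName) throws SAXException {
        if (skipDepth > 0) {
            skipDepth--;
            return;
        }
        super.endElement(uri, local, qName);
    }

    /** Builds the chain parser -> filter -> XSLT -> output and runs it. */
    static void run(InputSource xml, Source xslt, Writer out, String dropName) throws Exception {
        SAXParserFactory spf = SAXParserFactory.newInstance();
        spf.setNamespaceAware(true);
        DropElementFilter filter = new DropElementFilter(dropName);
        filter.setParent(spf.newSAXParser().getXMLReader()); // stage 1: parser

        SAXTransformerFactory stf = (SAXTransformerFactory) TransformerFactory.newInstance();
        TransformerHandler xsltStage = stf.newTransformerHandler(xslt);
        xsltStage.setResult(new StreamResult(out));          // stage 3: XSLT to output
        filter.setContentHandler(xsltStage);                 // stage 2 feeds stage 3

        filter.parse(xml); // events flow parser -> filter -> transformer
    }

    public static void main(String[] args) throws Exception {
        String identity = "<xsl:stylesheet version='1.0' "
            + "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
            + "<xsl:template match='@*|node()'><xsl:copy>"
            + "<xsl:apply-templates select='@*|node()'/></xsl:copy></xsl:template>"
            + "</xsl:stylesheet>";
        StringWriter sw = new StringWriter();
        run(new InputSource(new StringReader("<root><keep>x</keep><debug>y</debug></root>")),
            new StreamSource(new StringReader(identity)), sw, "debug");
        System.out.println(sw);
    }
}
```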

    Any advice, or references to examples, or actual examples would be
    greatly appreciated.

    Also, this problem seems to fall in a crack between comp.text.xml and
    comp.lang.java.programmer. Do you think it's better addressed at the
    other group?

    Non-procedural programming is taking quite a bit of effort to
    understand!

    Thanks in advance for your help.

    Lenny Wintfeld
     
