Can these two lines be optimized (JDOM, XSLT)?

Collin VanDyck

Hey,

I've profiled part of my application, which essentially aggregates and
transforms XML in various pipelines.

I spend a lot of time in these two lines of code:

Transformer transformer = TransformerFactory.newInstance().newTransformer(
        new StreamSource(new ByteArrayInputStream(stylesheet.getBytes())));
transformer.transform(new JDOMSource(in), out);

I'd like to streamline this somewhat, but don't know how to. Is there
anything glaringly obvious here, or do these two lines of code look about
right?

I'm basically taking in an org.jdom.Document (in), transforming it
according to an XSLT string, and saving the result in a JDOMResult (out).

thanks-
 

John C. Bollinger

Well, I think you might want to look a little more deeply at your
profiling data. It is no surprise that you spend a lot of time in those
two lines, because they imply a rather massive amount of work. Analogy:
"I profiled my C program, and it spends all its time in the main()
function!"

One possible optimization would be to reuse a single TransformerFactory
instead of creating a new one every time, but I rather doubt that that
would have a noticeable impact. If one or both of the in and out
documents reside on disk and are not already buffered then you might see
improvement from inserting buffered streams into the chains.
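
For what it's worth, reusing the factory could look something like the sketch below. The class name SharedFactory is made up for illustration, and the method is synchronized because the JAXP docs do not guarantee that a TransformerFactory is thread-safe. This is a minimal sketch, not a measured improvement.

```java
import javax.xml.transform.Source;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;

// Hypothetical helper: one factory for the whole application.
class SharedFactory {
    // Created once; TransformerFactory.newInstance() performs a provider lookup on every call.
    private static final TransformerFactory FACTORY = TransformerFactory.newInstance();

    // Synchronized because factory implementations are not guaranteed to be thread-safe.
    static synchronized Transformer newTransformer(Source stylesheet)
            throws TransformerConfigurationException {
        return FACTORY.newTransformer(stylesheet);
    }
}
```

The stylesheet is still parsed on every call here; only the factory lookup is saved.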

Assuming that stylesheet is a String, a minor optimization, valuable more
for the robustness it adds to your code than for its speed, would be to
replace "new ByteArrayInputStream(stylesheet.getBytes())" in your
StreamSource's constructor with "new StringReader(stylesheet)".
getBytes() encodes with the platform's default charset, which need not
match what the XML parser assumes when it decodes those bytes again. This
is one of those cases described in the StreamSource docs where the
character encoding already has been (or should have been) resolved; if
it hasn't then you're already in trouble.
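
A minimal sketch of that change follows. The class and method names are invented for the example, and plain stream sources and results stand in for the JDOMSource and JDOMResult from the original post so that the snippet depends only on the JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
class StringReaderStylesheet {
    // Hands the stylesheet to the parser as characters, so no byte
    // encoding or decoding step is involved.
    static Transformer fromString(String stylesheet) throws TransformerException {
        return TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(stylesheet)));
    }

    // Applies the stylesheet to an XML string and returns the output as text.
    static String transform(String stylesheet, String inputXml) throws TransformerException {
        StringWriter out = new StringWriter();
        fromString(stylesheet).transform(
                new StreamSource(new StringReader(inputXml)), new StreamResult(out));
        return out.toString();
    }
}
```

In the original code, only the argument to the StreamSource constructor would change; the JDOMSource and JDOMResult arguments to transform() would stay as they are.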

Other than buffering I/O, I doubt whether you can improve performance
there very much without finding a different approach -- a
better-performing XSLT package, for instance, if any exists.


John Bollinger
 

Jon Skeet

Yes, there's definitely something better you can do here. Converting
the string into bytes isn't a good idea - there's no need for all that
encoding and decoding. Instead, create a StringReader from the string
and pass *that* in. I believe that should be more efficient.

Are you repeatedly using the same XSLT string? If so, I'd suggest
creating a Templates object rather than a Transformer directly, and
then calling newTransformer from that Templates each time - that way
the XSLT doesn't need to be parsed each time.
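
Roughly, the Templates approach might look like the sketch below. The class and method names are hypothetical, and stream sources and results stand in for JDOMSource and JDOMResult so the snippet needs nothing beyond the JDK.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Templates;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical helper class, for illustration only.
class CompiledStylesheet {
    // Parses the stylesheet once; the returned Templates object can be kept and reused.
    static Templates compile(String stylesheet) throws TransformerConfigurationException {
        return TransformerFactory.newInstance()
                .newTemplates(new StreamSource(new StringReader(stylesheet)));
    }

    // Creates a fresh Transformer from the compiled stylesheet for each transformation.
    static String apply(Templates templates, String inputXml) throws TransformerException {
        Transformer transformer = templates.newTransformer();
        StringWriter out = new StringWriter();
        transformer.transform(
                new StreamSource(new StringReader(inputXml)), new StreamResult(out));
        return out.toString();
    }
}
```

The Templates javadoc states that a Templates instance is safe to share between threads, whereas a Transformer should not be used by more than one thread at a time, which is why a new Transformer is created per call.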
 

Collin VanDyck

Thanks for the responses. Each stylesheet is used to transform over and
over, but there are about 500 different stylesheets in play. So I might
implement a Templates cache in order to reduce the parsing overhead, as
you mentioned.
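
One way such a cache might be sketched is shown below, assuming the stylesheet string itself can serve as the cache key. The TemplatesCache class is hypothetical, and the sketch has no eviction, which seems acceptable for roughly 500 entries but is an assumption.

```java
import java.io.StringReader;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import javax.xml.transform.Templates;
import javax.xml.transform.TransformerConfigurationException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamSource;

// Hypothetical cache: stylesheet text -> compiled Templates.
class TemplatesCache {
    private final TransformerFactory factory = TransformerFactory.newInstance();
    private final Map<String, Templates> cache = new ConcurrentHashMap<>();

    // Returns the compiled form of the stylesheet, parsing it only on first use.
    Templates get(String stylesheet) throws TransformerConfigurationException {
        Templates templates = cache.get(stylesheet);
        if (templates == null) {
            // The factory is not guaranteed to be thread-safe, so compile under a lock.
            synchronized (factory) {
                templates = cache.get(stylesheet);
                if (templates == null) {
                    templates = factory.newTemplates(
                            new StreamSource(new StringReader(stylesheet)));
                    cache.put(stylesheet, templates);
                }
            }
        }
        return templates;
    }

    // Number of distinct stylesheets compiled so far.
    int size() {
        return cache.size();
    }
}
```

Each transformation would then call get(stylesheet).newTransformer() and transform as before.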

I also tried the new StringReader(templateXML) approach, but it would not
work for some reason. It looks as though I have some work to do to get
that working.

Thanks for the responses :)
 
