XSLT benchmarking and performance advice ?

Andy Dingley

I've just started on a new project and inherited a huge pile of XSLT
(and I use the term "pile" advisedly !). It runs at glacial speed, and
I need to fix this.

Platform is MSXML 4 / ASP

Any advice on benchmarking tools / techniques ? I have no budget for
tool-building, so if there's anything already out there, that would be
good to know.

Any general advice on performance bottlenecks to avoid ? Particular
areas I'm interested in are:

* xsl:call-template / name vs. xsl:apply-templates / match / mode

* Large stylesheets loaded from several small stylesheets, aggregated
by use of xsl:import

* Many template rules calling each other in chains, where some are
doing little more than being function calls mapping one rule onto
another.

* Widespread string-slicing with substring-before() etc.


I'm also looking at serious automatic compilation of templates. I can
do some specific tuning by hand, but most of these stylesheets are
automatically generated and there are simply too many of them. Does
anyone have experience of either auto-merging stylesheets or trying
to auto-compile the various template rules ? (I think there are many
cases where only one rule of a group will ever be invoked, owing to
simple scoping rules.)

Another compilation option would be to "flatten the call stack" where
three or four templates call each other in turn, but do little output
or logic until the very last step.
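
To make the "flatten the call stack" idea concrete, here is a minimal sketch. All template names (render-price, format-amount, emit-amount, render-price-flat) and the output markup are hypothetical, invented for illustration; nothing here comes from the actual stylesheets. It shows a chain of pass-through templates next to a hand-flattened equivalent that an automatic tool would have to produce.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Original shape: two pass-through templates before any output -->
  <xsl:template name="render-price">
    <xsl:param name="value"/>
    <xsl:call-template name="format-amount">
      <xsl:with-param name="value" select="$value"/>
    </xsl:call-template>
  </xsl:template>

  <xsl:template name="format-amount">
    <xsl:param name="value"/>
    <xsl:call-template name="emit-amount">
      <xsl:with-param name="value" select="$value"/>
    </xsl:call-template>
  </xsl:template>

  <!-- Only the last link in the chain produces output -->
  <xsl:template name="emit-amount">
    <xsl:param name="value"/>
    <span class="amount"><xsl:value-of select="$value"/></span>
  </xsl:template>

  <!-- Flattened equivalent: same output, one call instead of three -->
  <xsl:template name="render-price-flat">
    <xsl:param name="value"/>
    <span class="amount"><xsl:value-of select="$value"/></span>
  </xsl:template>

</xsl:stylesheet>
```

Whether this saves anything measurable depends on the processor, so it would need timing before and after.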


Any advice gratefully received !
 
Dimitre Novatchev [MVP]

Andy,

You are asking tens of serious questions in a single message...

While all those factors you mentioned could have impact on performance,
there are more serious ones that must be inspected first:

1. Algorithms / data structures -- this alone may change the
transformation time by a factor of ten or more. See if there are O(N^2)
or worse algorithms and try to replace them with better ones -- e.g. O(N).

2. Deep recursion, which consumes huge stack space and can push the
process into virtual memory (thrashing). There are techniques to flatten
recursion -- e.g. DVC (Divide and Conquer) -- see for example:

http://www.topxml.com/code/default.asp?p=3&id=v20020107050418

or to eliminate recursion altogether -- see:

http://www.xml.com/pub/a/2003/08/06/exslt.html?page=2

3. Inefficient XPath expressions -- e.g. //someElement

4. Not using xsl:key and the key() function.


5. See also the recommendations of Mike Kay here:

http://aspn.activestate.com/ASPN/Mail/Message/774836


6. Google for xslt performance.

7. Try to understand the problem better and see if the application really
does what it is supposed to do. In many situations it may be more productive
to re-write the application from scratch than to drown in messy code.
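
As an illustration of points 3 and 4, here is a minimal sketch of replacing a //-style lookup with xsl:key. The element and attribute names (item, ref, id, idref, name) and the key name item-by-id are hypothetical, chosen only for the example.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical input: <item id="..."> elements, each with a <name> child,
       referenced elsewhere by <ref idref="..."/> -->

  <!-- Declare an index of all item elements by their id attribute -->
  <xsl:key name="item-by-id" match="item" use="@id"/>

  <xsl:template match="ref">
    <!-- Slow form: //item[@id = current()/@idref]/name
         searches the whole document for every ref -->
    <!-- Keyed form: look the item up through the declared key -->
    <xsl:value-of select="key('item-by-id', @idref)/name"/>
  </xsl:template>

</xsl:stylesheet>
```

With N ref elements and M item elements, the //-form is likely to cost on the order of N*M comparisons, while a processor that builds the key as an index should come out much closer to N+M. That is an assumption about the implementation, so it is worth measuring.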



Cheers,


Dimitre Novatchev [XML MVP],
FXSL developer, XML Insider,

http://fxsl.sourceforge.net/ -- the home of FXSL
Resume: http://fxsl.sf.net/DNovatchev/Resume/Res.html
 
Patrick TJ McPhee

% Any advice on benchmarking tools / techniques ? I have no budget for
% tool-building, so if there's anything already out there, that would be
% good to know.

xsltproc (from libxslt) has a --profile switch which spits out the number
of calls and execution time for each template. You can find a win32 port
through http://xmlsoft.org.
 
Andy Dingley

> Any advice on benchmarking tools / techniques ?

Thanks for all the advice (email too). I'll probably re-post a few
specific issues, as I get to them.

Today I dived into "the nasty bit" for the first serious
investigation. It is far, far, more horrible than I ever dreamed was
even possible in XSLT. I don't think this is implementable in
InterCal !

Self-referential code is always fun. This bit renders images (so it's
buried deep and frequently called), and supplies a placeholder image
if the image descriptor is empty or the wrong size.

Placeholders are picked from a list of candidates, picked from a list
of candidate stylesheets. document() is used to load the stylesheets,
using a URL that's built dynamically on each call..... I look forward
to benchmarking _that_ !

Curiously the candidates are placed in the stylesheets (which have
probably been loaded already anyway) as the values of <xsl:variable>.
Yet we go through this insane indirection of
document($foo)/xsl:stylesheet/xsl:variable[@name='bar']
to load them !
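
One possible way to take the sting out of this, sketched under assumptions: if the URL turns out to be fixed (or comes from a small known set), the document() call can be hoisted into a top-level variable, so it is written once rather than rebuilt inside a frequently called template. The file name placeholders.xsl, the variable name placeholder-sheet and the match pattern image[not(@src)] are hypothetical. The sketch also assumes the candidate value is held as the text content of the xsl:variable element, not in a select attribute.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Hypothetical file name; the stylesheet is loaded as a plain XML
       document through a single top-level variable -->
  <xsl:variable name="placeholder-sheet"
                select="document('placeholders.xsl')"/>

  <!-- Hypothetical rule: an image descriptor with no src gets a placeholder -->
  <xsl:template match="image[not(@src)]">
    <!-- Same lookup as in the post, but against the variable above
         instead of a document() call with a freshly built URL -->
    <img src="{$placeholder-sheet/xsl:stylesheet/xsl:variable[@name='bar']}"/>
  </xsl:template>

</xsl:stylesheet>
```

If the stylesheets holding the candidates are already imported, referencing the variable directly as $bar might avoid the document() call entirely, though that only works when the variable name is known at authoring time.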

Code usually drives me to coffee, or sometimes beer for the nastiest.
After this I opened a new bottle of Scotch !
 
