
Re: how to estimate speed of a transformation

2003-12-11 08:59:48
David Tolpin wrote:
XSLT is almost solely specified by expected result, not assuming any
processing model  ...

Not me, David Carlisle wrote that. But I agree.

First, some things will improve execution speed with almost any XSLT 
processor.  Anything that avoids walking large parts of the input tree 
multiple times is always good.  

The question is *what* avoids it. In past messages I've tried to show
that constructs that may look like multiple walks through the input tree
are in fact not.

Saving certain results in variables is 
one way to do this.  Keys are often helpful.  Following the 
suggestions of Michael Kay and Jeni Tennison (and others) on this list 
will improve your stylesheets.

Not necessarily. While they are often helpful in particular cases, they
don't answer my questions; besides, quite often the proposed solutions
suffer from rapid growth in execution time with some processors.

Grouping algorithms are a good example of that. Grouping algorithms
often proposed on this list exhibit quadratic complexity with a number of
processors, while being linear with more advanced ones. Even when linear,
they can be very slow with, say, jd.xslt, but fast with Saxon (a ten-fold
difference).
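
To make the difference concrete, here is a minimal sketch in Python (not
XSLT, and not any processor's actual implementation) that models the two
behaviours by counting operations. `group_by_scan` mimics a naive evaluation
of a "have I seen this value among the preceding items?" test, rescanning
earlier items for every item; `group_by_index` mimics a key-style hash
index. Both function names and the counting scheme are assumptions made
purely for illustration.

```python
def group_by_scan(values):
    """Find first occurrences by rescanning all earlier items.

    Returns (distinct values in first-seen order, number of comparisons).
    """
    distinct = []
    comparisons = 0
    for i, v in enumerate(values):
        seen_before = False
        for earlier in values[:i]:
            comparisons += 1
            if earlier == v:
                seen_before = True
                break
        if not seen_before:
            distinct.append(v)
    return distinct, comparisons


def group_by_index(values):
    """Find first occurrences with a hash index: one lookup per item.

    Returns (distinct values in first-seen order, number of lookups).
    """
    seen = set()
    distinct = []
    lookups = 0
    for v in values:
        lookups += 1
        if v not in seen:
            seen.add(v)
            distinct.append(v)
    return distinct, lookups
```

For an input of n distinct values the scan performs n*(n-1)/2 comparisons,
while the index performs n lookups; with n = 10,000 that is 49,995,000
against 10,000.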

Second, some XSLT processors (DataPower's among them) are 10 times 
faster than others.  Part of this is due to optimizations, and part is 
due to raw technology.  These processors will be faster no matter what 
stylesheets you write.

There is no raw technology that makes a*n^2 always smaller than b*n, and
b*n always smaller than c (a, b, c are constants, n is the length of the
data). A processor written in C is ten times faster than one written in
Java for an identity transform. The same C processor is ten times slower
than the Java one when it executes a grouping transformation on a sequence
of 10,000 elements.
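
The arithmetic can be sketched in a few lines of Python. The constants
a = 1 and b = 1000 below are hypothetical, chosen only so that the quadratic
cost is ten times the linear one at n = 10,000; they are not measurements
of any processor, and the function names are made up for this sketch.

```python
def quadratic_cost(n, a=1.0):
    """Cost model a*n^2 (hypothetical constant a)."""
    return a * n * n


def linear_cost(n, b=1000.0):
    """Cost model b*n (hypothetical constant b)."""
    return b * n


def crossover(a, b):
    """Input length above which a*n^2 exceeds b*n, for a, b > 0."""
    return b / a


# Show how the ratio of the two cost models changes with input length.
for n in (10, 100, 1_000, 10_000):
    ratio = quadratic_cost(n) / linear_cost(n)
    print(f"n={n:>6}: quadratic/linear cost ratio = {ratio:g}")
```

With these constants the quadratic model is a hundred times cheaper at
n = 10, equal at n = 1,000, and ten times more expensive at n = 10,000. A
smaller constant only moves the crossover point b/a; it does not remove it.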

Finally, any stylesheet complex enough to have started this discussion 
is probably too complex to analyze clearly.  Most commercial XSLT 
processors (again, DataPower's among them) allow you to profile your 
stylesheets during execution.  

No, a stylesheet that demonstrates differences between processors can be
as short as 30 lines, and very simple. The whole point is that simple
stylesheets give the most differing results, not complex ones.

It's the only way to know what's really 
going on, and it will let you focus your improvements on those parts 
of the stylesheet that need the most attention.

That is not what many great people thought. In fact, relying on debuggers
and profilers produces poor programs. A debugger or profiler should confirm
hypotheses and assumptions, not provide optimization data. The question is
what to base the hypotheses and assumptions on.
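
One way to put the hypothesis first and let measurement only confirm it:
predict the order of growth, then estimate it from costs observed at several
input lengths. The Python sketch below fits a slope on a log-log scale;
`growth_exponent` is a hypothetical helper, and any costs fed to it would
have to come from your own runs.

```python
import math


def growth_exponent(sizes, costs):
    """Least-squares slope of log(cost) against log(size).

    A slope near 1 suggests linear growth, near 2 quadratic growth.
    Requires at least two distinct sizes and strictly positive costs.
    """
    xs = [math.log(n) for n in sizes]
    ys = [math.log(c) for c in costs]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den
```

If the estimated exponent disagrees with the predicted one, it is the
hypothesis about the processor that needs revisiting, not just the
stylesheet.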

 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list