Thanks for this insight. I'm sure there was no particular reason why the
code was written the way it was - it's changed completely in Saxon 7.x.
Interesting to see how dramatic the effect of a simple thing like this
can be. The second argument to the Vector() constructor should have no
effect on anything except performance.
I would be interested to know how you pinned it down.
(As you probably realise, development on the Saxon 6.5 branch, apart
from bug fixes, stopped over two years ago).
Michael Kay
-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of
David Tolpin
Sent: 17 January 2004 14:44
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] SAXON and exsl:node-set -- problem solved
Hi,
the problem is indeed in SAXON. And while exsl:node-set is
not directly involved, it is tightly coupled with
rtf->node-set. Consider these two simple stylesheets:
it1.xsl
==========
<xsl:template match="/">
<xsl:apply-templates/>
</xsl:template>
it2.xsl
===========
<xsl:template match="/">
<xsl:variable name="x">
<xsl:apply-templates/>
</xsl:variable>
</xsl:template>
The default identity transformation is omitted for brevity. On
a 500Kb file, the first one runs in 4 seconds with SAXON, the
second one in 33 seconds. The gap only grows with larger
input, with the dependency roughly quadratic.
The cause of this dependency is in
com/icl/saxon/expr/FragmentValue.java
Its line 23 says:
private Vector events = new Vector(20, 20);
which makes the container for the result-tree linearly
expanding (that is, each time the space is exhausted, 20 new
elements are added to the vector, and the whole vector is
copied to the new Java array). The bigger the data is, the
more has to be copied for each twenty elements.
Changing this line to:
private Vector events = new Vector(20);
eliminates the problem: the vector will now occupy at most
twice the space it would before the change, but the time to
expand it is negligible, so the two stylesheets complete in the
same time.
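The difference between the two constructors can be checked with a
small standalone sketch. It is not taken from SAXON: the class
VectorGrowthDemo and the method copiedElements are hypothetical
names, and the generic Vector<Integer> is used only for convenience.
It appends n elements and, each time capacity() changes, adds the
number of elements that had to be moved to the new array.

```java
import java.util.Vector;

// Hypothetical demo class, not part of SAXON.
class VectorGrowthDemo {

    // Appends n elements to v and returns the total number of elements
    // that were copied into a new backing array during reallocations.
    public static long copiedElements(Vector<Integer> v, int n) {
        long copied = 0;
        for (int i = 0; i < n; i++) {
            int capacityBefore = v.capacity();
            int sizeBefore = v.size();
            v.add(i);
            // A capacity change means the existing elements were copied.
            if (v.capacity() != capacityBefore) {
                copied += sizeBefore;
            }
        }
        return copied;
    }

    public static void main(String[] args) {
        int n = 1000;
        // Fixed increment of 20: capacity grows 20, 40, 60, ...
        long linear = copiedElements(new Vector<Integer>(20, 20), n);
        // No increment given: capacity doubles 20, 40, 80, ...
        long doubling = copiedElements(new Vector<Integer>(20), n);
        System.out.println("linear increment: " + linear);
        System.out.println("doubling:         " + doubling);
    }
}
```

For n = 1000 the fixed increment copies 24500 elements in total
and doubling copies 1260. With a fixed increment the total grows
roughly with the square of n, while with doubling it stays
proportional to n.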
What concerns me is that a similar problem exists in jd.xslt
1.5.5. There must be a reason that led the author of SAXON, a
great program by all merits, to use a linear increment on this
vector. Using exsl:node-set for chaining is incompatible with
this decision, and just fixing this problem may lead
to other, more severe ones.
What are these problems and is this fix suitable for SAXON?
David Tolpin
http://davidashen.net/
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list