
RE: SAXON and exsl:node-set -- problem solved

2004-01-17 09:28:24
Thanks for this insight. I'm sure there was no particular reason why the
code was written the way it was - it's changed completely in Saxon 7.x.
Interesting to see how dramatic the effect of a simple thing like this
can be. The second argument to the Vector() constructor should have no
effect on anything except performance.

I would be interested to know how you pinned it down.

(As you probably realise, development on the Saxon 6.5 branch, apart
from bug fixes, stopped over two years ago).

Michael Kay 

-----Original Message-----
From: owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com 
[mailto:owner-xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com] On Behalf Of 
David Tolpin
Sent: 17 January 2004 14:44
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] SAXON and exsl:node-set -- problem solved


Hi,

the problem is indeed in SAXON. And while exsl:node-set is 
not directly involved, it is tightly coupled with 
rtf->node-set. Consider these two simple stylesheets:

it1.xsl
==========
  <xsl:template match="/">
    <xsl:apply-templates/>
  </xsl:template>

it2.xsl
===========
  <xsl:template match="/">
    <xsl:variable name="x">
      <xsl:apply-templates/>
    </xsl:variable>
  </xsl:template>

The default identity transformation is omitted for brevity. 
On a 500 KB file, the first one runs in 4 seconds with SAXON, 
the second one in 33 seconds. The picture only gets worse as 
the input grows, with the time roughly quadratic in its size.

The cause of this dependency is in 

  com/icl/saxon/expr/FragmentValue.java

Its line 23 reads:

    private Vector events = new Vector(20, 20);

which makes the container for the result tree expand 
linearly (that is, each time the space is exhausted, room for 
20 more elements is added, and the whole vector is copied to 
a new Java array). The bigger the data is, the more has to be 
copied for every twenty elements added.

Changing this line to:

    private Vector events = new Vector(20);

eliminates the problem. The vector will now occupy at most 
twice the space it did before the change, but the time to 
expand it is negligible, so the two stylesheets complete in 
the same time.
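
To make the difference concrete, here is a minimal sketch (not SAXON 
code; the class GrowthCost and its methods are made up for 
illustration) that counts how many elements get copied while appending 
n items under each growth policy. It assumes only the documented 
java.util.Vector behaviour: a positive capacityIncrement grows the 
array by that amount, and an increment of zero doubles it.

```java
import java.util.Vector;

public class GrowthCost {
    // Counts the elements copied while appending n items to an
    // array-backed container. increment > 0 grows by a fixed step
    // (like new Vector(20, 20)); increment == 0 doubles the capacity
    // (like new Vector(20)).
    static long copiedElements(int n, int initialCapacity, int increment) {
        long copied = 0;
        int capacity = initialCapacity;
        for (int size = 0; size < n; size++) {
            if (size == capacity) {
                // The array is full: all current elements move to a new array.
                copied += size;
                capacity = (increment > 0) ? capacity + increment : capacity * 2;
            }
        }
        return copied;
    }

    // Appends n items to a real Vector and reports its final capacity,
    // as a check that the model above matches java.util.Vector.
    static int capacityAfter(int n, Vector<Integer> v) {
        for (int i = 0; i < n; i++) {
            v.add(i);
        }
        return v.capacity();
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("linear (20, 20): " + copiedElements(n, 20, 20));
        System.out.println("doubling (20):   " + copiedElements(n, 20, 0));
    }
}
```

With the fixed step, the copies add up to roughly n*n/40 elements, 
which is where the quadratic behaviour comes from; with doubling, the 
total stays below 2n.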

What concerns me is that a similar problem exists in jd.xslt 
1.5.5. There must be some reason that led the author of SAXON, 
a great program by all merits, to use a linear increment for 
this vector. Using exsl:node-set for chaining is incompatible 
with this decision, and just fixing this problem may lead to 
other, more severe ones.

What are these problems and is this fix suitable for SAXON?

David Tolpin
http://davidashen.net/




   


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list


