xsl-list

[xsl] Different performance of nodesets created in different ways

2008-02-01 03:30:39
We recently hit an out-of-memory error with an XSLT 1.0 stylesheet 
which used the Xalan nodeset() function to convert an <xsl:variable> with a 
non-empty body from a result tree fragment into a nodeset.
 
My test case data looks like this -
 
<a>
       <b>g-day<c>hello</c><c>hello</c><c>hello</c></b>
       <b>g-day<c>hello</c><c>hello</c><c>hello</c></b>
       ... several thousand more <b>...</b> elements like this ...
       <b>g-day<c>hello</c><c>hello</c><c>hello</c></b>
       <b>g-day<c>hello</c><c>hello</c><c>hello</c></b>
</a>
 
My (deeply flawed) test-case stylesheet originally looked like this -
 
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xalan="http://xml.apache.org/xalan" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:output method="text"/>
 
<xsl:template match="/">
            <xsl:call-template name="bigMemoryUsage"/>
</xsl:template>
 
<xsl:template name="bigMemoryUsage">
            <xsl:variable name="big">
                        <!-- result tree fragment -->
                        <xsl:copy-of select="/a/b"/>
            </xsl:variable>
            <xsl:for-each select="/a/b">
                <xsl:variable name="i" select="position()"/>
                <xsl:value-of select="xalan:nodeset($big)/b[position()=$i]"/>
            </xsl:for-each>
</xsl:template>
 
</xsl:stylesheet>
 
When I changed the <xsl:variable> to get its value from the select="..." 
attribute, i.e. to be of type node-set, and removed the call to 
xalan:nodeset() -
 
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xalan="http://xml.apache.org/xalan" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:output method="text"/>
 
<xsl:template match="/">
            <xsl:call-template name="smallMemoryUsage"/>
</xsl:template>
 
<xsl:template name="smallMemoryUsage">
            <xsl:variable name="small" select="/a/b"/> <!-- nodeset -->
            <xsl:for-each select="/a/b">
                <xsl:variable name="i" select="position()"/>
                <xsl:value-of select="$small/b[position()=$i]"/>
            </xsl:for-each>
</xsl:template>
 
</xsl:stylesheet>
 
my test case used a quarter as much memory.
 
The two versions of the stylesheet process the same nodes (or a copy of the 
same nodes) and produce the same output. Unfortunately the "small memory" 
version of the stylesheet ran for four times as long as the "big memory" 
version.
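
One detail worth checking in the two versions above: xalan:nodeset($big) 
returns the root of the copied fragment, so the /b step reaches the copied 
<b> elements, whereas $small already holds the <b> elements themselves, so 
$small/b looks for <b> children of each <b> (the sample data has none). A 
minimal sketch, assuming the intent is for the select-based variable to 
address the same nodes, would index $small directly. The template below is 
illustrative only and is not the stylesheet that was timed:

```xml
<?xml version='1.0'?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<xsl:template match="/">
    <!-- $small holds the <b> elements themselves -->
    <xsl:variable name="small" select="/a/b"/>
    <xsl:for-each select="/a/b">
        <xsl:variable name="i" select="position()"/>
        <!-- index into $small directly; $small/b would step to <b> children of each <b> -->
        <xsl:value-of select="$small[position()=$i]"/>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```

No timings were taken for this variant in the tests described here.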
 
When I experimentally changed the axis in the <xsl:value-of> from child to 
descendant -
 
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xalan="http://xml.apache.org/xalan" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:output method="text"/>
 
<xsl:template match="/">
            <xsl:call-template name="smallMemoryUsage"/>
</xsl:template>
 
<xsl:template name="smallMemoryUsage">
            <xsl:variable name="small" select="/a/b"/>
            <xsl:for-each select="/a/b">
                <xsl:variable name="i" select="position()"/>
                <xsl:value-of select="$small//b[position()=$i]"/> <!-- descendant axis -->
            </xsl:for-each>
</xsl:template>
 
</xsl:stylesheet>
 
the "small memory" stylesheet took 5 times as long again to run. However, when 
I made the corresponding change to the "big memory" stylesheet -
 
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xalan="http://xml.apache.org/xalan" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:output method="text"/>
 
<xsl:template match="/">
            <xsl:call-template name="bigMemoryUsage"/>
</xsl:template>
 
<xsl:template name="bigMemoryUsage">
            <xsl:variable name="big">
                        <xsl:copy-of select="/a/b"/>
            </xsl:variable>
            <xsl:for-each select="/a/b">
                <xsl:variable name="i" select="position()"/>
                <xsl:value-of select="xalan:nodeset($big)//b[position()=$i]"/> <!-- descendant axis -->
            </xsl:for-each>
</xsl:template>
 
</xsl:stylesheet>
 
the "big memory" stylesheet ran in about the same time as before.
 
I then rewrote the "small memory" stylesheet like this -
 
<?xml version='1.0'?>
<xsl:stylesheet version="1.0" xmlns:xalan="http://xml.apache.org/xalan" 
xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
 
<xsl:output method="text"/>
 
<xsl:template match="/">
            <xsl:call-template name="smallMemoryUsage"/>
</xsl:template>
 
<xsl:template name="smallMemoryUsage">
            <xsl:variable name="small" select="/a/b"/>
            <xsl:for-each select="$small/b">
                <xsl:value-of select="."/>
            </xsl:for-each>
</xsl:template>
 
</xsl:stylesheet>
 
Having got rid of the silly position() predicate, the performance of the "small 
memory" stylesheet was about 100 times better. Making the same change to the 
"big memory" stylesheet improved performance by about 50 times. This presumably 
reflects the extra cost of using the Xalan nodeset() function.
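
The reworked "big memory" stylesheet is not shown above. A minimal sketch, 
assuming it mirrors the "small memory" rewrite (same template name and 
variable as the earlier versions, with xalan:nodeset() evaluated once in the 
for-each select), might look like this:

```xml
<?xml version='1.0'?>
<xsl:stylesheet version="1.0"
    xmlns:xalan="http://xml.apache.org/xalan"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

<xsl:output method="text"/>

<xsl:template match="/">
    <xsl:call-template name="bigMemoryUsage"/>
</xsl:template>

<xsl:template name="bigMemoryUsage">
    <xsl:variable name="big">
        <!-- result tree fragment holding a copy of every <b> -->
        <xsl:copy-of select="/a/b"/>
    </xsl:variable>
    <!-- convert the fragment once, then walk the copied <b> elements in order -->
    <xsl:for-each select="xalan:nodeset($big)/b">
        <xsl:value-of select="."/>
    </xsl:for-each>
</xsl:template>

</xsl:stylesheet>
```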
 
I am now happy with the performance of the "small memory" stylesheet when it is 
written sensibly, but I do not really understand why doing silly processing 
against a nodeset created from a call to xalan:nodeset() runs about 4 times 
quicker with the child axis, and about 20 times quicker with the descendant 
axis, than the same silly processing against a nodeset variable created by the 
select="..." attribute of <xsl:variable>. Is it something to do with whether my 
nodeset variable uses a NodeIterator, NodeList or NodeVector under the bonnet? 
Am I accessing the nodes sequentially in one case and positionally in the 
other? Am I missing something fundamental about how predicates work?
 
I have been running the above through Stylus Studio using Xalan 2.7.0 and 
through an IBM Java 1.5 JVM, also using Xalan 2.7.0. My version of XSLT is 1.0. 
I've allocated my JVM between 64 MB and 500 MB of memory at various stages of 
testing, and the production IBM Java 1.5 JVM which ran out of memory had 
1.5 GB and was running Java code compiled at 1.4.2.
 
Any help would be greatly appreciated!
 
Pete Taylor
 