Hi,
At 01:33 PM 7/12/2004, you wrote:
ok I have a piece of XSLT that processes a large XML file into smaller
chunks. The problem I have is that the deeper down into the XML file I am
processing the longer it takes. Is this just due to the way XSLT parsers
work or can I tweak my XSL file so it processes faster?
I get the same effect whether I process the file in one pass using
Saxon Result:document or as separate XSL files with either
Saxon or Sablotron.
This is the separate XSL file (change server[@name='Ahazi'] as
needed):
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="1.0">
<xsl:output method="xml" indent='yes' encoding="utf-8"/>
<xsl:template match="server" />
<xsl:template match="server[@name='Ahazi']">
<resources>
<xsl:for-each
select=".//resource[not(@swgcraft_id=preceding::*/@swgcraft_id)]">
... this for-each is expensive. You are traversing everything under the
matched server looking for 'resource' elements; each one you find is
examined by looking at all the elements that precede it in the document
and comparing their @swgcraft_id attributes. With n elements that is on
the order of n^2 comparisons.
Since this happens every time the template is matched (which could itself
be lots of times), it adds up -- especially for the later nodes in your set
(as you noticed).
An easy tweak to improve performance would be to use keys to de-duplicate
instead of doing it by hand on the preceding:: axis.
So:
<xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>
<xsl:variable name="resources" select="//resource"/>
(binding //resource to a variable $resources so we don't have to retrieve it
every single time)
then you can deduplicate in another variable declaration:
<xsl:variable name="unique-resources"
select="$resources[count(.|key('resource-by-id',@swgcraft_id)[1])
= 1]"/>
In English: $unique-resources is the collection of all resources which,
when counted along with the first resource with the same swgcraft_id as
themselves, amount to a single node (which is true only of the first one
with each swgcraft_id).
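Dropped into your stylesheet, the same test can go straight into the for-each. This is only a sketch: the body of your for-each was elided, so the xsl:copy-of below is a stand-in I made up for whatever you actually do with each resource. Note that xsl:key has to be a top-level element, a direct child of xsl:transform.

```xml
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               version="1.0">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- Index every resource in the document by its swgcraft_id. -->
  <xsl:key name="resource-by-id" match="resource" use="@swgcraft_id"/>

  <!-- Suppress all servers except the one we want. -->
  <xsl:template match="server"/>

  <xsl:template match="server[@name='Ahazi']">
    <resources>
      <!-- Keep a resource only if it is the first node the key returns
           for its own swgcraft_id. -->
      <xsl:for-each
          select=".//resource[count(. | key('resource-by-id', @swgcraft_id)[1]) = 1]">
        <!-- Placeholder (assumption): replace with your original
             for-each body. -->
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </resources>
  </xsl:template>
</xsl:transform>
```

The key lookup replaces the walk along preceding::, so the cost per resource no longer grows with how far down the document it sits.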
This ought to help quite a bit.
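One thing to check: the key above indexes every resource in the document, so a resource under 'Ahazi' whose swgcraft_id first turns up under an earlier server gets dropped (your preceding:: test behaved the same way). If what you actually want is uniqueness within each server, a composite key is one way to get it. A rough sketch, assuming every resource sits inside exactly one server element and that '|' never occurs in a server name; the key name and the xsl:copy-of body are my inventions:

```xml
<?xml version="1.0"?>
<xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
               version="1.0">
  <xsl:output method="xml" indent="yes" encoding="utf-8"/>

  <!-- Hypothetical key: index each resource by server name plus id,
       so lookups never cross server boundaries. -->
  <xsl:key name="resource-by-server-and-id" match="resource"
           use="concat(ancestor::server/@name, '|', @swgcraft_id)"/>

  <xsl:template match="server"/>

  <xsl:template match="server[@name='Ahazi']">
    <resources>
      <!-- Keep the first resource for each (server, swgcraft_id) pair.
           Caution: resources with no swgcraft_id all share the key
           value 'Ahazi|', so only the first of them is kept. -->
      <xsl:for-each
          select=".//resource[count(. | key('resource-by-server-and-id',
                    concat(ancestor::server/@name, '|', @swgcraft_id))[1]) = 1]">
        <!-- Placeholder (assumption): your real processing goes here. -->
        <xsl:copy-of select="."/>
      </xsl:for-each>
    </resources>
  </xsl:template>
</xsl:transform>
```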
Cheers,
Wendell
======================================================================
Wendell Piez
mailto:wapiez(_at_)mulberrytech(_dot_)com
Mulberry Technologies, Inc. http://www.mulberrytech.com
17 West Jefferson Street Direct Phone: 301/315-9635
Suite 207 Phone: 301/315-9631
Rockville, MD 20850 Fax: 301/315-8285
----------------------------------------------------------------------
Mulberry Technologies: A Consultancy Specializing in SGML and XML
======================================================================