I'll give you a quick hint to help clean up your process. I wish I had
time to look deeper into this for you, but time is not something I have a
lot of at the moment. Nonetheless, the first thing I noticed was that
you are running a for-each loop and then, inside it, handing each node
selected by the for-each's select attribute to apply-templates, which is
the built-in mechanism for recursion in XSLT. So, to simplify: you are
going through the "for-each" process twice. I am unsure of the exact
process implemented by xsltproc (it may notice the obvious double-up and
skip the for-each, applying the result of the XPath expression directly
to apply-templates), but there is a possibility that you are stepping
through your massive XML file twice, when the same thing is implied by
simply using <xsl:apply-templates select="document(@href)"/>. And now
that I see what you are using as your select attribute value, I can
definitely see why the processor may not optimize this ;)
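For what it's worth, here is roughly what I mean by dropping the
for-each. This is an untested sketch, not something I have timed against
your input: the xi:include template just hands the included document to
apply-templates. It assumes the rest of your stylesheet stays as it is,
in particular that the identity template still matches "/", so the root
node of each included file gets picked up there (xsl:copy on a root node
creates nothing itself and just processes its content, so the output
should come out the same).

```xml
<xsl:template match="xi:include"
              xmlns:xi="http://www.w3.org/2001/XInclude">
  <!-- Pass the included document straight to apply-templates,
       with no enclosing for-each. Its root node is matched by the
       "/" alternative of the identity template. -->
  <xsl:apply-templates select="document(@href)"/>
</xsl:template>
```

Whether xsltproc actually runs this any faster is something you would
have to measure on your own files.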
I would definitely pull out that for-each and see how much that
helps. Looking further down, I can see some other areas that could be
optimized. I wish I had the time to help further, but I've got to get
back to coding myself. Nonetheless, there are plenty of others here who
will be more than happy to help you further.
Best of luck!
<M:D/>
:: Saxon.NET is now available to early beta participants! Visit
http://www.x2x2x.org/x2x2x/home to sign up ::
Paul DuBois wrote:
I've been running some tests on a document that includes nested
XInclude directives. The document is complex: upwards of 1500 files,
nested to a depth of up to 4 levels. Total size of content is about
4.8MB.
For simple testing, I'm attempting only to produce a "flattened"
document that just resolves the XIncludes. Stylesheet looks like
this:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Identity transform, but "flatten" xincludes -->
  <xsl:output method="xml" indent="yes"/>
  <xsl:preserve-space elements="*"/>
  <xsl:template match="xi:include"
                xmlns:xi="http://www.w3.org/2001/XInclude">
    <xsl:for-each select="document(@href)">
      <xsl:apply-templates/>
    </xsl:for-each>
  </xsl:template>
  <!-- identity transform -->
  <xsl:template match="/ | node() | @* | comment() |
                       processing-instruction()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
My processing command is:
xsltproc --xinclude --novalid xinclude.xsl input.xml > output.xml
This takes about 12 minutes on my 900 MHz G3 iBook (Mac OS X), and about
4 minutes on my 2.8 GHz Pentium 4 Gentoo Linux box.
That seems pretty slow, particularly given that the control condition takes
mere seconds (running the flattened document through a standard identity
transform with xsltproc).
I don't want to post the input here because it's so big, so this is really
just a preliminary post to ask for advice as to how I might go about
improving the XInclude-d transform: Is this a known issue with
xsltproc/XInclude? Or is there perhaps some flag I should be using that I
am failing to use? Something bad about my stylesheet?