Running your code on Saxon 9.7, I get
XTSE3430: Template rule is declared streamable but it does not satisfy the
streamability rules.
* The xsl:for-each-group/@group-starting-with pattern is not motionless
That's because a pattern such as *[(position() mod 1000) = 0] involves counting
preceding siblings. Or, to look at it another way, the pattern can't be
evaluated simply by looking at the node in isolation; it has to examine its
position relative to other nodes in the document.
But there's an easy workaround: use
group-adjacent="(position() - 1) idiv 1000". With this formulation, position()
is counting the items being grouped, not the number of siblings they have.
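To see how the grouping key behaves, here is a minimal sketch that is not part of the original stylesheet. It assumes a chunk size of 3 instead of 1000 and groups the integers 1 to 7 rather than ROW elements, purely for illustration. The key is 0 for the first three items, 1 for the next three, and 2 for the last one, so adjacent items with equal keys fall into the same chunk.

```xml
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="3.0">
  <!-- Illustration only: chunk size 3 and the input 1 to 7 are assumptions. -->
  <xsl:output method="text"/>
  <xsl:template name="xsl:initial-template">
    <xsl:for-each-group select="1 to 7"
                        group-adjacent="(position() - 1) idiv 3">
      <!-- position() here is the item's position in the grouped sequence -->
      <xsl:value-of select="current-grouping-key(), ':', current-group()"/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
```

Because the key depends only on the position within the sequence being grouped, nothing outside the current item needs to be inspected.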
Here's the full stylesheet:
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs"
    version="3.0">
  <xsl:mode streamable="yes"/>
  <xsl:template match="ROWDATA">
    <xsl:variable name="resultURIbase" as="xs:string"
        select="concat('out', '/rowdata-')"/>
    <xsl:variable name="rootname" as="xs:string" select="name(.)"/>
    <xsl:for-each-group select="ROW"
        group-adjacent="(position() - 1) idiv 1000">
      <xsl:result-document
          href="{concat($resultURIbase, generate-id(), '.xml')}">
        <xsl:element name="{$rootname}">
          <xsl:copy-of select="current-group()"/>
        </xsl:element>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>
</xsl:stylesheet>
On 2 May 2017, at 21:55, Eliot Kimber ekimber(_at_)contrext(_dot_)com
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:
I have some very large (100s of MBs) XML database dump docs that I want to
break into smaller docs. This is an easy application of for-each-group or of
a simple tail recursion approach but I wanted to use this as an opportunity
to learn more about XSLT 3 streaming.
I’ve read through the XSLT 3 spec and I think I generally understand the
options but it’s still not clear either how or how best to do this type of
grouping so that it’s streamable. I didn’t find any examples of this specific
use case searching on “xslt streaming with grouping” (other than older items
that don’t actually work).
If my source looks like this:
<ROWDATA>
<ROW><SRVC_CAT_ID>54</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior
Lights</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body,
Cab</PARENT_NAME></ROW>
<ROW><SRVC_CAT_ID>53</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior
Body Panels</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body,
Cab</PARENT_NAME></ROW>
<ROW><SRVC_CAT_ID>51</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Entertainment
Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body,
Cab</PARENT_NAME></ROW>
<ROW><SRVC_CAT_ID>40</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Door
Locks &amp; Anti-Theft Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and
Body, Cab</PARENT_NAME></ROW>
… lots more rows …
</ROWDATA>
I’d like to generate result files containing 1000 records each, each wrapped
in the same root element.
The non-streaming for-each-group is simple enough:
<xsl:template match="ROWDATA">
  <xsl:variable name="resultURIbase" as="xs:string"
      select="concat($outdir, '/rowdata-')"/>
  <xsl:variable name="rootname" as="xs:string" select="name(.)"/>
  <xsl:for-each-group select="ROW"
      group-starting-with="*[(position() mod 1000) = 0]">
    <xsl:result-document
        href="{concat($resultURIbase, generate-id(), '.xml')}">
      <xsl:element name="{$rootname}">
        <xsl:copy-of select="current-group()"/>
      </xsl:element>
    </xsl:result-document>
  </xsl:for-each-group>
</xsl:template>
But I’m not seeing how to do this using, e.g., xsl:iterate. As is often the
case with XSLT, I feel like I’m missing the obvious.
Is it in fact possible to do what I want in a streamable way?
Thanks,
Eliot
--
Eliot Kimber
http://contrext.com