I have some very large (100s of MBs) XML database dump docs that I want to
break into smaller docs. This is an easy application of for-each-group or of a
simple tail-recursion approach, but I wanted to use this as an opportunity to
learn more about XSLT 3 streaming.
I’ve read through the XSLT 3 spec and I think I generally understand the
options, but it’s still not clear to me how to do this type of grouping, or
how best to do it, so that it’s streamable. Searching on “xslt streaming with
grouping” turned up no examples of this specific use case (other than older
items that don’t actually work).
If my source looks like this:
<ROWDATA>
<ROW><SRVC_CAT_ID>54</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior
Lights</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body,
Cab</PARENT_NAME></ROW>
<ROW><SRVC_CAT_ID>53</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior
Body Panels</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body,
Cab</PARENT_NAME></ROW>
<ROW><SRVC_CAT_ID>51</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Entertainment
Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body,
Cab</PARENT_NAME></ROW>
<ROW><SRVC_CAT_ID>40</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Door
Locks &amp; Anti-Theft Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and
Body, Cab</PARENT_NAME></ROW>
… lots more rows …
</ROWDATA>
I’d like to generate result files containing 1000 records each, each wrapped in
the same root element.
The non-streaming for-each-group version is simple enough:
<xsl:template match="ROWDATA">
  <xsl:variable name="resultURIbase" as="xs:string"
    select="concat($outdir, '/rowdata-')"/>
  <xsl:variable name="rootname" as="xs:string" select="name(.)"/>
  <!-- Start a new group at rows 1, 1001, 2001, ... so that every group,
       including the first, holds up to 1000 rows. -->
  <xsl:for-each-group select="ROW"
    group-starting-with="*[(position() mod 1000) = 1]">
    <!-- The context item is the first ROW of the group, so generate-id()
         yields a distinct file name per group. -->
    <xsl:result-document
      href="{concat($resultURIbase, generate-id(), '.xml')}">
      <xsl:element name="{$rootname}">
        <xsl:copy-of select="current-group()"/>
      </xsl:element>
    </xsl:result-document>
  </xsl:for-each-group>
</xsl:template>
But I’m not seeing how to do this using, e.g., xsl:iterate. As is often the
case with XSLT, I feel like I’m missing the obvious.
Is it in fact possible to do what I want in a streamable way?
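For concreteness, here is the kind of thing I would expect to be worth
testing. It is an untested sketch, not something I have run. It swaps
group-starting-with for group-adjacent on a computed chunk number, assumes a
processor that supports streaming, and declares $outdir with a made-up default.
Whether a given processor actually accepts this as streamable is an assumption
on my part.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Ask the processor to treat the default mode as streamable. -->
  <xsl:mode streamable="yes"/>

  <!-- Assumed declaration: the default value 'out' is hypothetical. -->
  <xsl:param name="outdir" as="xs:string" select="'out'"/>

  <xsl:template match="ROWDATA">
    <xsl:variable name="rootname" as="xs:string" select="name(.)"/>
    <!-- Rows 1-1000 get key 0, rows 1001-2000 get key 1, and so on;
         adjacent rows with the same key form one group. -->
    <xsl:for-each-group select="ROW"
        group-adjacent="(position() - 1) idiv 1000">
      <xsl:result-document
          href="{$outdir}/rowdata-{current-grouping-key()}.xml">
        <xsl:element name="{$rootname}">
          <!-- current-group() is read exactly once, in a single pass. -->
          <xsl:copy-of select="current-group()"/>
        </xsl:element>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

The idea is that the grouping key depends only on position(), so nothing
should need to look back at earlier rows or ahead to later ones.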
Thanks,
Eliot
--
Eliot Kimber
http://contrext.com