
[xsl] How To Use Streaming To Group Elements in a Flat List?

2017-05-02 15:55:09
I have some very large (hundreds of MB) XML database dump docs that I want to 
break into smaller docs. This is an easy application of for-each-group, or of a 
simple tail-recursion approach, but I wanted to use this as an opportunity to 
learn more about XSLT 3 streaming.

I’ve read through the XSLT 3 spec and I think I generally understand the 
options, but it’s still not clear to me how, or how best, to do this type of 
grouping so that it’s streamable. Searching on “xslt streaming with grouping”, 
I didn’t find any examples of this specific use case (other than older items 
that don’t actually work).

If my source looks like this:

<ROWDATA>
    <ROW><SRVC_CAT_ID>54</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior Lights</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
    <ROW><SRVC_CAT_ID>53</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Exterior Body Panels</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
    <ROW><SRVC_CAT_ID>51</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Entertainment Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
    <ROW><SRVC_CAT_ID>40</SRVC_CAT_ID><PARENT_ID>3</PARENT_ID><SRVC_CAT_NAME>Door Locks &amp; Anti-Theft Systems</SRVC_CAT_NAME><PARENT_NAME>Accessories and Body, Cab</PARENT_NAME></ROW>
… lots more rows …
</ROWDATA>

I’d like to generate result files containing 1000 records each, each wrapped in 
the same root element.

The non-streaming for-each-group version is simple enough:

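    <xsl:template match="ROWDATA">
        <xsl:variable name="resultURIbase" as="xs:string"
            select="concat($outdir, '/rowdata-')"
        />
        <xsl:variable name="rootname" as="xs:string" select="name(.)"/>
        
        <!-- Start a new group at rows 1, 1001, 2001, ... so that every group
             (except possibly the last) holds exactly 1000 rows. Testing for
             "= 0" would leave only 999 rows in the first group. -->
        <xsl:for-each-group select="ROW"
            group-starting-with="*[(position() mod 1000) = 1]">
            <!-- generate-id() is evaluated on the first ROW of each group -->
            <xsl:result-document
                href="{concat($resultURIbase, generate-id(), '.xml')}">
                <xsl:element name="{$rootname}">
                    <xsl:copy-of select="current-group()"/>
                </xsl:element>
            </xsl:result-document>
        </xsl:for-each-group>
                
    </xsl:template>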


But I’m not seeing how to do this using, e.g., xsl:iterate. As is often the 
case with XSLT, I feel like I’m missing the obvious.

Is it in fact possible to do what I want in a streamable way?
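
One candidate worth testing, offered only as an untested sketch: group on a 
key computed from position() with group-adjacent instead of using a positional 
pattern, since the streamability rules in the spec are stricter for patterns 
with positional predicates. The default value for $outdir and the result file 
naming via current-grouping-key() are assumptions made for illustration, and 
whether this passes streamability analysis depends on the processor (a 
streaming-capable one is required).

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

    <!-- Assumption: the output directory is passed in; 'out' is a placeholder -->
    <xsl:param name="outdir" as="xs:string" select="'out'"/>

    <!-- Declare the unnamed mode streamable -->
    <xsl:mode streamable="yes"/>

    <xsl:template match="ROWDATA">
        <!-- Reading the element name does not consume the stream -->
        <xsl:variable name="rootname" as="xs:string" select="name(.)"/>

        <!-- Rows 1..1000 get key 0, rows 1001..2000 get key 1, and so on -->
        <xsl:for-each-group select="ROW"
            group-adjacent="(position() - 1) idiv 1000">
            <xsl:result-document
                href="{$outdir}/rowdata-{current-grouping-key()}.xml">
                <xsl:element name="{$rootname}">
                    <!-- current-group() is referenced exactly once -->
                    <xsl:copy-of select="current-group()"/>
                </xsl:element>
            </xsl:result-document>
        </xsl:for-each-group>
    </xsl:template>

</xsl:stylesheet>
```

With this key, each result file would hold at most 1000 ROW elements, and only 
one group's worth of rows should need to be buffered at a time.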

Thanks,

Eliot

--
Eliot Kimber
http://contrext.com
 
