
Re: [xsl] Question on streaming and grouping with nested keys

2017-07-14 07:42:00
On 14.07.2017 14:05, Felix Sasaki felix(_at_)sasakiatcf(_dot_)com wrote:

I tried the example from Martin with

<xsl:template match="TRANSACTION-LIST">
  <xsl:copy>
    <xsl:for-each-group select="copy-of(TRANSACTION)" group-by="ITEM2/SUBITEM2/GROUPING-KEY">
      <xsl:copy>
        <item1-sum><xsl:value-of select="sum(current-group()/ITEM2/SUBITEM2.1)"/></item1-sum>

...

It gives me an out of memory error. The input file is 160 MB, but the individual transactions are rather small (around 20+ elements). The error also appears if I remove "<xsl:copy>".

160 MB doesn't sound like a file you need streaming for at all. Does that suggestion above cause memory problems only when using streaming (e.g. when you have <xsl:mode streamable="yes"/>) or also without streaming? Have you tried increasing the memory for Saxon/Java?

As you mention Saxon EE, let's hope Michael Kay comes across this thread; he can certainly tell you more about how to tackle that problem with his product.
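
To check whether streaming is involved at all, a plain non-streaming version of the grouping could look roughly like the sketch below. It drops both copy-of() and any xsl:mode streamable="yes" declaration, so the processor builds the whole input as an in-memory tree. The element names are taken from the snippet quoted above; the "group" wrapper element and its "key" attribute are only placeholders I made up for the output, and I have not run this against the actual input.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- No xsl:mode streamable="yes" here: the input is read into a normal tree. -->
  <xsl:template match="TRANSACTION-LIST">
    <xsl:copy>
      <!-- Group the transactions directly; copy-of() is not needed without streaming. -->
      <xsl:for-each-group select="TRANSACTION"
                          group-by="ITEM2/SUBITEM2/GROUPING-KEY">
        <!-- "group" and "key" are hypothetical output names. -->
        <group key="{current-grouping-key()}">
          <item1-sum>
            <xsl:value-of select="sum(current-group()/ITEM2/SUBITEM2.1)"/>
          </item1-sum>
        </group>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

If this version runs with a reasonable Java heap size, the memory problem is probably specific to the streamed grouping rather than to the size of the file.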

I have a working solution using an accumulator and maps, see below, but here I did not manage to use streaming. If I set the accumulator to streamable="yes", Saxon EE tells me


"The xsl:accumulator-rule/@select expression for a streaming accumulator must be motionless"


Although I am using copy-of() as in Martin's example.


<xsl:accumulator name="gather-values" as="map(xs:anyAtomicType, node())" initial-value="map{}">
     <xsl:accumulator-rule match="TRANSACTION">
       <xsl:variable name="current" select="copy-of()"/>

As far as I understand it, you can't use copy-of() in an accumulator that you want to be streamable. I think working with streaming and accumulated values requires a change from the usual XSLT coding habits. For instance, to capture the key with a streaming accumulator you would need to use something like <xsl:accumulator-rule match="TRANSACTION/ITEM2/SUBITEM2.2/GROUPING-KEY/text()" select="string()"/>, because the text node is the only place where you can read that value while streaming through the document.

So to solve that problem with accumulators and streaming, I think you need several of them: one counting ITEM1, one summing up SUBITEM2.1/text(), and the one above for the key. You then need to combine them to store the data together.
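
A rough, untested sketch of that idea follows. To keep it short it folds the pieces into a single accumulator with several rules instead of several separate accumulators, and it only does the summing; counting ITEM1 would follow the same pattern. The accumulator holds a map: the entries 'key' and 'amount' remember what has been seen inside the current TRANSACTION, and the entry 'sums' maps each grouping key to its running total. Every rule reads only the matched text node or the accumulator's own $value, which is what I would expect the "motionless" requirement to allow. The accumulator name, the "group" element, the "key" attribute and the choice of xs:decimal are my assumptions, the element paths are taken from this thread, and it needs a streaming processor such as Saxon EE.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    xmlns:map="http://www.w3.org/2005/xpath-functions/map"
    exclude-result-prefixes="xs map">

  <!-- shallow-skip: unmatched nodes produce no output, but are still read. -->
  <xsl:mode streamable="yes" on-no-match="shallow-skip"
            use-accumulators="group-sums"/>

  <!-- "group-sums" is a hypothetical name. -->
  <xsl:accumulator name="group-sums" as="map(xs:string, item()*)"
                   initial-value="map{'sums': map{}}" streamable="yes">
    <!-- Start of a transaction: forget key and amount of the previous one. -->
    <xsl:accumulator-rule match="TRANSACTION"
        select="map:remove(map:remove($value, 'key'), 'amount')"/>
    <!-- The key can only be read on its text node while streaming. -->
    <xsl:accumulator-rule
        match="TRANSACTION/ITEM2/SUBITEM2.2/GROUPING-KEY/text()"
        select="map:put($value, 'key', string())"/>
    <!-- Add up the amount(s) of the current transaction; xs:decimal is assumed. -->
    <xsl:accumulator-rule match="TRANSACTION/ITEM2/SUBITEM2.1/text()"
        select="map:put($value, 'amount',
                        ($value('amount'), 0)[1] + xs:decimal(.))"/>
    <!-- End of a transaction: add its amount to the total for its key. -->
    <xsl:accumulator-rule match="TRANSACTION" phase="end"
        select="let $k := $value('key'),
                    $a := $value('amount'),
                    $sums := $value('sums')
                return if (exists($k) and exists($a))
                       then map:put($value, 'sums',
                                    map:put($sums, $k, ($sums($k), 0)[1] + $a))
                       else $value"/>
  </xsl:accumulator>

  <xsl:template match="TRANSACTION-LIST">
    <xsl:copy>
      <!-- Read through all children first so the accumulator is complete. -->
      <xsl:apply-templates/>
      <xsl:variable name="sums"
                    select="accumulator-after('group-sums')('sums')"/>
      <!-- The order of map:keys() is implementation-dependent. -->
      <xsl:for-each select="map:keys($sums)">
        <group key="{.}">
          <item1-sum><xsl:value-of select="$sums(.)"/></item1-sum>
        </group>
      </xsl:for-each>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Only one small map entry per distinct grouping key is kept in memory, so the memory use should depend on the number of groups rather than on the number of transactions.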
