xsl-list

Re: [xsl] How to stream-process non-XML text using unparsed-text-lines( ) ?

2014-07-26 13:46:44


Is unparsed-text-lines(...) always streaming, 

It depends on the processor and on how the code is written, which I think is 
also what Michael Kay was pointing out. If you keep a reference to the result 
of fn:unparsed-text-lines and move back and forth through the sequence, there 
is little chance that it will be streamed. With the xsl:for-each approach, and 
probably with a SimpleMapExpr, you can be fairly certain that it is streamed.

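For example, this is streamable, and effectively filters the sequence so that 
you work only with the lines you are interested in ($href stands for the URI 
of your text file, which the function requires as its argument):
unparsed-text-lines($href) ! .[contains(., 'hello world')]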
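As a minimal sketch of the xsl:for-each approach, the stylesheet below reads 
each line once, in order, and never revisits it, which leaves the processor 
free to read the file lazily. The parameter name href and the file name 
input.txt are assumptions made for illustration, and whether the lines are 
actually streamed remains up to the processor. The select uses an ordinary 
predicate, which filters the same lines as the simple map expression.

```xslt
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema"
    exclude-result-prefixes="xs">

  <!-- Hypothetical parameter: URI of the text file to read. -->
  <xsl:param name="href" as="xs:string" select="'input.txt'"/>

  <xsl:output method="text"/>

  <xsl:template name="xsl:initial-template">
    <!-- The result of unparsed-text-lines is consumed directly and is not
         bound to a variable, so nothing forces the whole sequence to be
         kept in memory. -->
    <xsl:for-each
        select="unparsed-text-lines($href)[contains(., 'hello world')]">
      <xsl:value-of select="."/>
      <xsl:text>&#10;</xsl:text>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

By contrast, binding the result to a variable and then using something like 
count($lines) or $lines[last()] before iterating would most likely force the 
processor to hold all the lines at once.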

or is it streaming only when the
XSLT contains:

      <xsl:mode streamable="yes" />

Whether fn:unparsed-text or fn:unparsed-text-lines is streamed does not depend 
on the streamable="yes" mode (or, more generally, on whether you are in a 
streamable context or not).  The reason is subtle: those functions return 
strings, not nodes, and the streamability rules only apply to streamed nodes.

We considered adding rules specifically for this situation, but it proved too 
complex and/or not worth the effort (can't really recall), because the 
use-cases are already covered by the definition of fn:unparsed-text-lines and 
the freedom it gives processors in how they implement it.


The below XSLT program works great. Would it work differently if I were to
remove the xsl:mode? How do I know that the input is being streamed?

Yes, it would work the same. The only sure way that I know of to find out 
whether the input is actually being streamed is to use a large input document 
and monitor memory use. However, even if you do see a lot of memory being 
consumed, the processor may still be streaming, because it might consider a 
certain amount of buffering, potentially up to the size of available memory, 
beneficial in a particular scenario. To rule that out, the input file must be 
larger than available memory.

Note: if you want a sure-fire cross-processor way of streaming text, you can 
also create a UriResolver that returns the text document as an XML document 
with a (large) set of elements, each containing one line or record of the text 
input. Then you can use xsl:stream and write a guaranteed-streamable 
stylesheet. That turns out to be relatively simple: your input will be flat, 
and whenever you select a text node, the streamability rules take into account 
that it cannot contain children, so the rules are a bit more lenient.

Cheers,

Abel Braaksma
Exselt XSLT 3.0 streaming processor
http://exselt.net