xsl-list

Re: [xsl] Streaming and grouping in XSLT 3.0

2017-03-21 07:21:33


First question: in your grouping of properties using the rdf:about attribute, 
are the groups adjacent? The for-each-group[@group-adjacent] instruction is 
fully streamable according to the XSLT 3.0 rules, and the Saxon implementation 
is also streamable (though in some circumstances -- I would need to check -- it 
may build the content of each group in memory before moving on to the next).
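As a rough, untested sketch of the adjacent case: the stylesheet below merges 
consecutive rdf:Description elements that share the same rdf:about value. It 
assumes every description carries an rdf:about attribute (blank nodes would all 
fall under the empty key), and it is not taken from your group-sort-triples.xsl.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <!-- Ask the processor to check that the default mode is streamable -->
  <xsl:mode streamable="yes"/>

  <xsl:template match="/rdf:RDF">
    <xsl:copy>
      <!-- The grouping key only reads an attribute, so it is motionless -->
      <xsl:for-each-group select="rdf:Description"
                          group-adjacent="string(@rdf:about)">
        <rdf:Description rdf:about="{current-grouping-key()}">
          <!-- Single downward pass over the properties of the group -->
          <xsl:copy-of select="current-group()/*"/>
        </rdf:Description>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

If the processor rejects this as non-streamable, the error message normally 
names the construct that failed the analysis.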

If the groups aren't adjacent, then you can use 
xsl:fork/xsl:for-each-group[@group-by], but this isn't really fully streamed 
because it will typically collect all the groups in memory.
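A comparable untested sketch for the non-adjacent case wraps the grouping in 
xsl:fork. The same assumption applies (every description has rdf:about), and 
the memory caveat above still holds: the groups are accumulated before anything 
is written.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">

  <xsl:mode streamable="yes"/>

  <xsl:template match="/rdf:RDF">
    <xsl:copy>
      <!-- group-by is only accepted in streamed code as a child of xsl:fork -->
      <xsl:fork>
        <xsl:for-each-group select="rdf:Description"
                            group-by="string(@rdf:about)">
          <rdf:Description rdf:about="{current-grouping-key()}">
            <xsl:copy-of select="current-group()/*"/>
          </rdf:Description>
        </xsl:for-each-group>
      </xsl:fork>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```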

The next concern is about doing a two-pass transformation where both passes are 
streamed. This hasn't really received as much attention as it should in either 
the spec or in Saxon. I think it should be possible to pipe the result of the 
first transformation directly into the second provided both are written as 
free-standing stylesheets (as distinct from a "micro-pipeline" within a single 
stylesheet). But I'm not sure I've tested this.

As to the question "how do I even begin analyzing the streamability of this 
approach in XSLT 3.0?" I think the answer is suck it and see. First ask 
yourself "does it seem intuitively streamable?" in the sense that the Nth thing 
in the output can be computed from the Nth thing in the input, plus perhaps a 
little bit of memory of earlier things in the input. If the answer to that is 
yes, then try writing the code and see if Saxon complains.

Michael Kay
Saxonica

My use case is as follows -- a triplestore returns streaming RDF/XML
(omitting namespaces):

<rdf:RDF>
 <rdf:Description rdf:about="http://dbpedia.org/resource/Copenhagen">
   <dct:title>Copenhagen</dct:title>
 </rdf:Description>
 ...
 <rdf:Description rdf:about="http://dbpedia.org/resource/Copenhagen">
   <dbo:country rdf:resource="http://dbpedia.org/resource/Denmark"/>
 </rdf:Description>
 ...
</rdf:RDF>

Since it's streaming, every resource description only contains one
property. It's not convenient, so during the first pass I group the
descriptions by subject using [1]:

<rdf:RDF>
 <rdf:Description rdf:about="http://dbpedia.org/resource/Copenhagen">
   <dct:title>Copenhagen</dct:title>
   <dbo:country rdf:resource="http://dbpedia.org/resource/Denmark"/>
 </rdf:Description>
 ...
</rdf:RDF>

Then, as the second pass, I do the main transformation to XHTML using
[2][3], which could produce something like:

<fieldset>
 <label>Title</label>
 <input type="text" value="Copenhagen"/>
 <label>Country</label>
 <input type="text" value="http://dbpedia.org/resource/Denmark"/>
 ...
</fieldset>

where fieldsets represent resources, and labels/inputs represent
properties/values.
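
For illustration only, a minimal streamable sketch of such a second pass is
given below. It is not the stylesheets in [2][3]: it takes the label text
from the property's local name (so "title" rather than "Title"), and it
assumes each property holds either an rdf:resource attribute or plain text.

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    exclude-result-prefixes="rdf">

  <!-- shallow-skip descends through rdf:RDF without copying anything -->
  <xsl:mode streamable="yes" on-no-match="shallow-skip"/>

  <!-- One fieldset per resource description -->
  <xsl:template match="rdf:Description">
    <fieldset>
      <xsl:apply-templates select="*"/>
    </fieldset>
  </xsl:template>

  <!-- One label/input pair per property -->
  <xsl:template match="rdf:Description/*">
    <label><xsl:value-of select="local-name()"/></label>
    <!-- string() keeps both branches atomic; only the else branch
         reads the element content -->
    <input type="text"
           value="{if (@rdf:resource) then string(@rdf:resource) else string(.)}"/>
  </xsl:template>

</xsl:stylesheet>
```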

Now my question is, how do I even begin analyzing the streamability of
this approach in XSLT 3.0? I guess my main concern is that such
grouping would not be streamable, but maybe there are other solutions?

Thanks.

Martynas


[1] 
https://github.com/AtomGraph/Web-Client/blob/master/src/main/webapp/static/com/atomgraph/client/xsl/group-sort-triples.xsl
[2] 
https://github.com/AtomGraph/Web-Client/blob/master/src/main/webapp/static/com/atomgraph/client/xsl/bootstrap/2.3.2/layout.xsl
[3] 
https://github.com/AtomGraph/Web-Client/blob/master/src/main/webapp/static/com/atomgraph/client/xsl/bootstrap/2.3.2/imports/default.xsl
