xsl-list

Re: [xsl] Cannot write more than one result document to the same URI

2013-04-05 03:07:06


The problem is that the specification does not require the XSLT processor to 
complete the processing of the first <b> before starting or even ending the 
processing of the second <b>.  Sure, a single-process implementation "X" 
likely would.  But a parallelized (is that a word?) implementation "Y" 
running on multiple CPUs could very well fully process the second <b> before 
the first <b> if it chose to do so.  Its only obligation is to arrange the 
resulting tree with the result of processing the first <b> before the result 
of processing the second <b>.  This obligation ensures that the result of 
processing by "X" is identical to the result of processing by "Y".  But there 
is no obligation on what the processor does to get to that result.

When using <xsl:result-document> the processor is not building the result 
tree.  It is creating a completely separate result.  If the instruction 
required "re-opening" of the file for append, processor "X" likely would 
produce the expected result, but processor "Y" in the situation above would 
produce an unexpected result.  Two processors would produce two different results.

And this is also why one cannot assert that the writing to the file is even 
finished before the next attempt to write to the file starts.  The file 
handle could very well still be left open by one parallel process when the 
other is ready to open it for itself.  So the same URI can't be reused even 
if the file is opened for write and not for append.
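
For readers without the original question to hand, the kind of stylesheet 
under discussion can be sketched roughly as follows.  This is only an 
illustration: the input shape (an <a> element containing two <b> children) 
and the file name out.xml are assumptions, not taken from the original post.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <a><b>one</b><b>two</b></a> -->
  <xsl:template match="/a">
    <xsl:for-each select="b">
      <!-- Each iteration names the same URI.  With two <b> elements an
           XSLT 2.0 processor is expected to report dynamic error XTDE1490
           rather than append to, or overwrite, the first result. -->
      <xsl:result-document href="out.xml">
        <xsl:copy-of select="."/>
      </xsl:result-document>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because nothing fixes the order in which the two iterations are evaluated, 
there is no well-defined "first" write for a second one to follow.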

Indeed. Saxon-EE 9.5 will execute xsl:result-document instructions 
asynchronously, so the rule in the spec that you can't write two documents to 
the same URI turns out to be very useful. If you do something like this:

<xsl:for-each select="employee">
 <xsl:result-document href="{@ssn}">
   <xsl:copy-of select="."/>
 </xsl:result-document>
</xsl:for-each>

then you might well have a dozen threads operating at once, each copying a 
different employee element to a different result file. If the URIs were not 
unique, this would cause havoc - in fact the optimization would not really be 
possible.

I can see why you find the rule irritating - I've been in the same situation 
myself - but it's there for a very good reason.
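
One common way to stay within the rule, when several source elements belong 
in the same output file, is to group them first so that each URI is named by 
exactly one xsl:result-document instruction.  The sketch below assumes a 
hypothetical @dept attribute on each employee, a hypothetical <employees> 
wrapper element, and that the employee elements are children of the document 
element; none of these come from the original thread.

```xml
<xsl:stylesheet version="2.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: a document element whose children are
       <employee dept="..."> elements -->
  <xsl:template match="/*">
    <!-- One group per distinct @dept value, hence one result document
         per URI, however many employees share a department -->
    <xsl:for-each-group select="employee" group-by="@dept">
      <xsl:result-document href="{current-grouping-key()}.xml">
        <employees>
          <xsl:copy-of select="current-group()"/>
        </employees>
      </xsl:result-document>
    </xsl:for-each-group>
  </xsl:template>

</xsl:stylesheet>
```

Since each group writes to a distinct URI, the groups remain independent of 
one another, which is what would let a processor evaluate them in any order 
or in parallel.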

And by the way, your mental model that the result file is closed when the 
xsl:result-document end tag is encountered might be a convenient way of 
thinking about things, and perhaps not even too far from reality, but it's not 
the way the semantics of the language work, and sooner or later it will lead to 
difficulties in understanding what's going on. It's a bit like imagining that 
when you do readFile('xyz') in Java, the file is closed when the closing ')' is 
encountered.

Michael Kay
Saxonica

