xsl-list

Re: A proposal: xsl:result-document asynchronous attribute

2003-03-11 03:33:19
Hi Francis and Kurt,

Kurt wrote:
I'd second Francis's note on the idempotency issue. Consider the
canonical HTTP GET based web service - the stock market ticker. Such
a service will be returning a different result set at any given
moment in time, to an extent that you could effectively argue that
time itself becomes a parameter in any such service, regardless of
the specific implementation. My experience with web services is that
most meaningful web service results vary with time.

As designed, the document() function will only GET such a document
once, so the result of calling the document() function with a
particular URL is always the same. Of course if you run the
transformation several times you will get different answers each time,
but within a transformation, there aren't side-effects.
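
A minimal sketch of that guarantee, assuming an XSLT 1.0 processor and a
hypothetical ticker URL (www.example.com stands in for a real service).
Both calls should return the same root node, so the union of the two
variables should contain exactly one node:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- Hypothetical URL, used here only for illustration -->
    <xsl:variable name="first"
        select="document('http://www.example.com/ticker.xml')"/>
    <xsl:variable name="second"
        select="document('http://www.example.com/ticker.xml')"/>
    <same-node>
      <!-- A union removes duplicate nodes, so a count of 1 means
           both variables hold the very same node -->
      <xsl:value-of select="count($first | $second) = 1"/>
    </same-node>
  </xsl:template>
</xsl:stylesheet>
```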

Francis wrote:
How could xsl:result-document *not* cause changes on the server?

and Kurt wrote:
Likewise, <xsl:result-document> WILL cause changes on the server.
It's difficult to determine from the specification what happens when
an XML document is posted to an external location, but at the very
least the assumption is that you are creating an XML document at
that point, probably via HTTP PUT via WebDAV.

To be facetious, <xsl:result-document> doesn't necessarily create a
physical document -- it just creates a result tree, identifiable via
the URL specified in the href attribute, that the processor must make
accessible to whatever processes are managing the transformation. So
<xsl:result-document> wouldn't cause any changes on the server if you
ran the transformation and then immediately dropped all the result
trees from that transformation.

Conceptually, <xsl:result-document> only causes changes to the server
after the transformation is complete, when all the result trees are
serialised. But, as with the main result tree, with <xsl:result-document>
you might have a streaming transformation and thus you may cause
changes on the server during the transformation process. However, it's
an error for you to try to access that document, so as far as the
transformation is concerned, such changes are invisible, and thus the
transformation is side-effect free. (Or as much as they can be: if the
document were accessible via an alternative URL, I don't think that
there'd be anything the processor could do to stop you accessing it.
However, I think it's fairly clear that it's unwise to rely on this
behaviour.)
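
As a rough illustration, assuming an XSLT 2.0 processor and a
hypothetical output name report.xml, the stylesheet below creates a
secondary result tree and refers to it only by its URI. The
commented-out line shows the kind of read-back that the paragraph above
describes as an error:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- Secondary result tree, labelled by a hypothetical relative URI -->
    <xsl:result-document href="report.xml">
      <report>
        <xsl:copy-of select="/*"/>
      </report>
    </xsl:result-document>
    <!-- Do not read the tree back during the same transformation:
         <xsl:copy-of select="document('report.xml')"/> -->
    <!-- Mentioning the URI in the main result tree is fine -->
    <index href="report.xml"/>
  </xsl:template>
</xsl:stylesheet>
```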

By contrast, POST is explicitly about making changes immediately --
that's its whole purpose. It would be impossible to wait to do all the
POSTing until the end of the transformation, since that way you
wouldn't get the results of the POSTs during the transformation. The
situations are very different.

Kurt wrote:
The WSDL architecture does not in fact make any distinctions with
regard to whether an HTTP GET is being used to query as opposed to
being used to update. It is quite permissible (albeit again not
necessarily "good practice") to have a web service invocation of the
form:

http://www.myservice.com/updateValue?newValue=foo

The argument that the document() function should be the primary
interface for such web services then contradicts the fact that you
are changing state on the server; if this holds true for document(),
then it should be just as permissible for <xsl:result-document>.

I think that it's reasonable for us to allow bad practice (using
GETs for unsafe and non-idempotent requests) in order to support good
practice (using GETs for safe and idempotent requests). I don't think
that it's reasonable for us to support bad practice (using POSTs for
safe and idempotent requests).

Francis wrote:
I sometimes think that language designs can become fetishistic about
things like idempotency (in the technical sense - a fetish being
something originally associated with an aim - good language design
in this case - which ends up becoming a non-functional [no pun
intended] substitute for the original aim). Useful languages end up
having to deal with things that change state. Even a "purely"
functional language like Haskell has monads. I think at the very
least it would be useful to report a success or error on executing
the GET, and if you concede even that then idempotency has gone for
this function.

and Kurt wrote:
One of the major flaws in the XSLT 1.0 spec was that there were a
great many features that became incorporated into the spec
that were intended to prevent people from doing "dangerous" things -
the creation of XML fragments, for instance, rather than allowing
the creation of intermediate node-sets. The fact that most
implementations built workarounds for these limitations indicates to
me that far from being dangerous functionality, the attempt to
protect programmers from their own stupidity was itself pretty
misguided.

I disagree, but I suspect that's because I'm not facing the real
challenges of using XSLT with SOAP messaging day-in day-out whereas I
do deal with questions from newcomers to XSLT (often confused having
come from a procedural programming background) day-in day-out.

Basically, I don't think that it's XSLT's job to perform any actions
aside from transformations. POSTing is an action that should be taken
by the surrounding application, with the results passed into the
transformation.
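
A sketch of what I mean, assuming the calling application performs the
POST itself and hands the parsed response to the stylesheet. The
parameter name postResponse is hypothetical, and how the parameter gets
set depends on the processor's API:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <!-- Hypothetical parameter supplied by the surrounding application;
       it defaults to an empty node-set when nothing is passed in -->
  <xsl:param name="postResponse" select="/.."/>
  <xsl:template match="/">
    <!-- The value is fixed before the transformation starts, so the
         node-set tested here is the same one that gets copied -->
    <xsl:if test="$postResponse">
      <result>
        <xsl:copy-of select="$postResponse"/>
      </result>
    </xsl:if>
  </xsl:template>
</xsl:stylesheet>
```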

Francis wrote:
We *could* manage the multiple-evaluation problem in XSLT in the
same way we do for GET by saying that the result of two POST
requests with the same URL and deep-equal message bodies must be
identical. This would force implementations to cache and reuse the
results of each POST. I think that the ordering problem would be
harder to manage, and that it's likely to lead to subtle bugs due to
different processors following different evaluation orders.

Or just say that this function is not idempotent. Which, in reality, it 
isn't - you might get a time-out one time and success the next.

So you would be happy with the situation where, given the variable
declaration:

  <xsl:variable name="foo" select="post($myURI, $myMessage)" />

You could have something like:

  <xsl:if test="$foo">
    <result>
      <xsl:copy-of select="$foo" />
    </result>
  </xsl:if>

and have the $foo that was tested be different from the $foo that was
copied?

Not to mention the fact that having such a post() function return
different results each time would mean that processors couldn't
perform the optimisations they could otherwise.

Francis wrote:
There are applications that use POST in safe, idempotent ways, as a
GET-with-complex-arguments. However, SOAP 1.2 explicitly discourages
that practice, and I think that it would be a bad idea to base XSLT
functionality on bad practice. SOAP 1.2 encourages, instead, the use
of a GET request resulting in a SOAP message, and this is already
supported with the document() function in XSLT.

Except that assumes that all useful idempotent web services are
restricted to - and will in fact be implemented - using "flat"
non-XML queries that do not require any kind of nested structure. I
have never seen anyone make a convincing attempt to justify this
assumption.

I have been pretty much convinced by Paul Prescod's arguments that
this is how idempotent web services *should* be designed. It seems
that the designers of SOAP 1.2 agree. Reality may be very different,
but I'd be very concerned about adding functionality to XSLT whose
only purpose was to support bad practice, no matter how common this
bad practice might be.

Here is another example, albeit one that I'm sure will raise more
than a few hackles:

<xsl:apply-templates select="$myContext/{$anXPathExpression}"/>

This structure is illegal in both XSLT 1.0 and 2.0, no doubt because
the cost of multiple evaluations of XPath context adds overhead to
the work involved in the parser. However, such an expression has a
lot of potential usage, for instance, designing an XSLT processing
template in which the details of a given XML "record" are not known
until the time of evaluation, utilizing an external configuration
file to determine what makes up the requisite records, identity
attributes, and so forth. The fact that such a feature does exist in
the Saxon parser (and I believe in EXSLT, though obviously I may be
wrong here) indicates that it has utility that may outweigh its
"potentially" disruptive effects.

I am in absolute agreement that dynamic evaluation of strings as XPath
expressions is incredibly useful. It is something that I have argued
for in the past. The dyn:evaluate() function defined in EXSLT is
currently supported in Xalan-J, 4XSLT and libxslt.
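
For instance, assuming a processor that implements the EXSLT dynamic
module, something close to the illegal select above could be sketched
like this. The parameter names and the default expression are
hypothetical. Because dyn:evaluate() uses the current context node, the
xsl:for-each is there to move the context to $myContext first:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:dyn="http://exslt.org/dynamic"
    extension-element-prefixes="dyn">
  <!-- Hypothetical parameters: a context node-set and an XPath
       expression held as a string -->
  <xsl:param name="myContext" select="/*"/>
  <xsl:param name="anXPathExpression" select="'record[@id]'"/>
  <xsl:template match="/">
    <xsl:for-each select="$myContext">
      <!-- Evaluate the string as XPath relative to each context node -->
      <xsl:apply-templates select="dyn:evaluate($anXPathExpression)"/>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>
```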

In a related vein, there has been an implicit assumption that the
href value in the <xsl:result-document> contains either an http: or
file: protocol, but I'm not necessarily sure that this is a valid
assumption. Suppose that you had the following construct:

<xsl:result-document href="mailto:{url}?subject={subject}">
        <html>
                <body>
                        <xsl:copy-of select="body"/>
                </body>
        </html>
</xsl:result-document>

Is this construct invalid? It's possible it may be unsupported, of
course, but this is just as true of an http: protocol message.

There is nothing in the XSLT WD that says that the URI must use the
file or http protocol. The spec says:

  "There may be implementation-defined restrictions on the form of
   absolute URI that may be used, but the implementation is not
   required to enforce any restrictions. Any legal relative URI must be
   accepted."

As I pointed out above, the URI is really just an identifier that the
implementation uses to label the result tree. The application
controlling the transformation can do what it likes with the result
trees, including posting them (via HTTP or email).

If the processor is in charge of serialising the result tree, the spec
says:

  "The location to which result trees are serialized (whether in
   filestore or elsewhere) is implementation-defined (which in practice
   may mean that it is controlled using an implementation-defined API).
   However, these locations must satisfy the constraint that when two
   result trees are both created (implicitly or explicitly) using
   relative URIs in the href attribute of the xsl:result-document
   instruction, then these relative URIs may be used to construct
   references from one tree to the other, and such references must
   remain valid when both result trees are serialized."

It's really up to the implementation what it does with the result. I
think it would be perfectly reasonable for an implementation that
recognises a mailto URI to email the tree as XML, or an implementation
that recognises an ftp URI to upload the tree as an XML document.
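
To illustrate the relative-URI constraint quoted above, here is a
minimal sketch, assuming an XSLT 2.0 processor and two hypothetical file
names. Wherever the implementation decides to put the two trees, the
links between them should remain valid once both are serialised:

```xml
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/">
    <!-- Both hrefs are relative, so each tree can refer to the other -->
    <xsl:result-document href="index.html">
      <html>
        <body><a href="detail.html">Detail</a></body>
      </html>
    </xsl:result-document>
    <xsl:result-document href="detail.html">
      <html>
        <body><a href="index.html">Back to index</a></body>
      </html>
    </xsl:result-document>
  </xsl:template>
</xsl:stylesheet>
```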

By the way, I'm arguing about this here because these comments have
been posted to XSL-List. If you want to make a comment or suggestion
about the WD that will be read by the members of the XSL WG, you
should post it to public-qt-comments@w3.org, or let me know if you'd
like me to forward your message there.

Cheers,

Jeni

---
Jeni Tennison
http://www.jenitennison.com/


 XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list