Re: [xsl] ordered selection of child elements

2018-03-08 03:32:12
When people ask "why?" I'm never sure whether they mean

(a) where in the spec does it say this should happen?, or

(b) why was the spec written this way?

The second question can then be interpreted as either

(b1) as a matter of historical record, when was the decision made and what 
arguments were put forward on both sides?, or

(b2) can you think of any reason why a rational WG would have made this 
decision?

The answer to (a) can be found in §3.3.1.1 of the XPath (3.1) specification: 
https://www.w3.org/TR/xpath-31/#id-path-operator

Specifically:

<quote>
Each operation E1/E2 is evaluated as follows: Expression E1 is evaluated, and 
if the result is not a (possibly empty) sequence S of nodes, a type error 
<https://www.w3.org/TR/xpath-31/#dt-type-error> is raised [err:XPTY0019 
<https://www.w3.org/TR/xpath-31/#ERRXPTY0019>]. Each node in S then serves in 
turn to provide an inner focus (the node as the context item, its position in S 
as the context position, the length of S as the context size) for an evaluation 
of E2, as described in  2.1.2 Dynamic Context 
<https://www.w3.org/TR/xpath-31/#eval_context>. The sequences resulting from 
all the evaluations of E2 are combined as follows:

If every evaluation of E2 returns a (possibly empty) sequence of nodes, these 
sequences are combined, and duplicate nodes are eliminated based on node 
identity. The resulting node sequence is returned in document order 
<https://www.w3.org/TR/xpath-31/#dt-document-order>.
</quote>

So the sorting into document order is done by the "/" operator. (Note: it might 
be a good idea to get into the habit of using "!" rather than "/", 
especially for trivial expressions like $e/@x, to save the optimizer the 
trouble of working out that it doesn't actually need to do a sort in this 
particular instance.)
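
A minimal sketch of the difference, using the expression from the original 
question and assuming an XSLT 3.0 processor. The input document (with b 
placed before a) and the wrapper elements out, path and map are assumptions 
made for the illustration:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Assumed input: <root><b/><a/></root>, so b precedes a in document order -->
  <xsl:output method="xml" indent="yes"/>

  <xsl:template match="/">
    <out>
      <!-- "/" removes duplicates and sorts into document order: b, then a -->
      <path><xsl:sequence select="root/(a, b)"/></path>
      <!-- "!" concatenates the results without sorting: a, then b -->
      <map><xsl:sequence select="root ! (a, b)"/></map>
    </out>
  </xsl:template>

</xsl:stylesheet>
```

Two caveats: "!" does not eliminate duplicates, and if the left-hand side 
selects several nodes, the results come back grouped per left-hand node 
(a, b for the first, then a, b for the second, and so on).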

The answer to (b) is roughly as follows.

Firstly, XPath 1.0 actually defines that the expression returns a node-set, 
that is, a set of nodes with no defined order. XSLT 1.0 specifies that 
constructs like xsl:for-each and xsl:apply-templates process these nodes in 
document order. In practice all XPath 1.0 processors that I know of return 
node-sets in document order, but there is no requirement in the spec to do so. 
I don't know historically why XSLT 1.0 decided on document order (so we're in 
(b2) territory here), but interoperability (that is, having all processors 
produce identical output) was high on the WG's requirements list.

Secondly, the question came up again and was hotly debated during the XPath 2.0 
deliberations, where I was involved so I can tell you more about it (in (b1) 
terms). There was tension here between XQuery developers, who wanted to give 
optimizers the maximum freedom to optimize (which in the database world means 
using indexes), and XSLT developers, who were (a) more concerned with 
interoperability, and (b) more concerned with handling of mixed content (that 
is, documents rather than data). The XSL WG was also of course concerned with 
backwards compatibility between 1.0 and 2.0.

The rule that path expressions return results in document order is in fact 
present in the first published WD of XPath 2.0 
(https://www.w3.org/TR/2001/WD-xpath20-20011220/#id-path-expressions) and the 
minutes show intense discussion on the topic around the summer of 2001. I 
remember a particular posting of mine as being successful in swaying some 
XQuery participants: it is dated 23 July 2001 and reads:

<quote>
I was quite keen on Jonathan Marsh's proposal as a way forward on this.
Looking at the analysis we did on Friday, however, I've come to the
conclusion that for mixed content it's just not viable.

Consider the source:

<warnings>
<warning>Do <emph> not</emph> touch the mains switch, the computer will
<emph> explode</emph></warning>
</warnings>

and the stylesheet fragment:

<xsl:template match="warnings">
<p><xsl:apply-templates select=".//*/text()"/></p>
</xsl:template>

At XPath 1.0 the output is:

<p>Do not touch the mains switch, the computer will explode</p>

At XPath 2.0, with Jonathan's proposal, the output would be:

<p>Do touch the mains switch, the computer will not explode</p>

I chose a melodramatic example because I thought it would impress Dana [1], but
I think the point is clear anyway. With mixed element content, or any
document that has hierarchic structures of variable depth, users naturally
expect path expressions consisting only of "/" and "//" operators to return
results in document order, and if we redefine the semantics in
"breadth-first" terms without reordering, users will get results that are
surprising and disconcerting. As Evan pointed out, it's not just a backwards
compatibility issue, it's a usability issue: document order is the natural
order of the results.

So to move forward, I'm now convinced that we need a separate operator for
sequence-based projection, or a single polymorphic operator whose semantics
are inferred from the data type of the operands. I'm now convinced that
changing the existing semantics of "/" isn't on. (Good try, Jonathan: you
nearly persuaded me!)

Mike Kay

</quote>

[1] Dana Florescu in previous discussion had used arguments based on safety 
criteria.
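
The effect described in the quoted message can be reproduced with the "!" 
operator that XPath 3.0 later added, which maps over a sequence without 
sorting. The following is only a sketch, assuming an XSLT 3.0 processor and 
the warnings source shown above; it is not a reconstruction of the exact 
proposal discussed in 2001, and the div wrapper is an assumption made for 
the illustration:

```xml
<xsl:stylesheet version="3.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:template match="warnings">
    <div>
      <!-- ".//*/text()" is short for "./descendant-or-self::node()/*/text()";
           "/" delivers the text nodes in document order -->
      <p><xsl:apply-templates select=".//*/text()"/></p>
      <!-- The same steps joined with "!": the text children of warning come
           first, followed by the text children of each emph -->
      <p><xsl:apply-templates select="descendant-or-self::node() ! * ! text()"/></p>
    </div>
  </xsl:template>

</xsl:stylesheet>
```

Apart from whitespace, the first p reads "Do not touch the mains switch, the 
computer will explode" and the second reads "Do touch the mains switch, the 
computer will not explode".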




On 8 Mar 2018, at 06:36, Dr. Patrik Stellmann 
patrik(_dot_)stellmann(_at_)gdv-dl(_dot_)de 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hi,
 
a question more motivated by curiosity than by a real problem:
 
With
            <xsl:sequence select="a, b"/>
I will get first element a and second element b, regardless of their order 
within the input document.
 
But with
            <xsl:sequence select="root/(a, b)"/>
I will get the elements a and b in document order. So this behaves identically 
to
            <xsl:sequence select="root/(a | b)"/>
 
Why?
 
Of course I could write
            <xsl:sequence select="root/a, root/b"/>
to ensure a specific order. But sometimes the expression for "root" is much 
more complex, so I'd like to avoid writing it twice or putting it in a 
variable…
 
Thanks and regards,
Patrik
 
 


XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com