xsl-list
[Top] [All Lists]

RE: Wath is the opposite of the union operator?

2005-09-22 09:03:20
You can expensively code 

A except B =>

   A[count(.|B) != count(B)]

A intersect B =>

   A[count(.|B) = count(B)]

Hi, Mike,

If you don't mind waxing theoretical for a minute, what makes those 
comparisons expensive?


Let's stick to "except", the same applies to "intersect".

It's reasonably easy for an optimizer to do the simple rewrite

let $c := count(B)
return A[count(.|B) != $c]

Saxon certainly does this, other processors might not. If B is a complex
expression with no dependency on the context node, Saxon will also evaluate
B outside the loop, so this reduces to

let $b := B
let $c := count(B)
return A[count(.|$b) != $c]

But that still leaves count(.|$b) inside the loop, to be evaluated once for
each item in A. For Saxon, this means scanning the node-set $b once for each
item in $a, giving overall performance O(count(A)*count(B)). By contrast,
the union, except, and intersect operators in XPath 2.0 are evaluated in
Saxon using a sort-merge strategy, giving performance of the order
O(N*log(N) + M*log(M)) where N is count(A) and M is count(B). In fact, the
sort will in most cases not be necessary since the node-sequence will often
be in document order already, giving performance O(count(A)+count(B)).

The sort-merge approach is also more efficient in memory. When the
expressions A and B deliver nodes in document order, all three operators can
be evaluated without breaking the pipeline: that is, it is not necessary to
allocate memory to hold the values of their operands. Memory allocation (and
de-allocation, through garbage collection) accounts for a significant part
of XSLT/XPath execution cost.

Of course, it's always possible that a different XPath 1.0 processor might
recognize these constructs specially, and rewrite them to give performance
equivalent to the 2.0 operators.

Michael Kay
http://www.saxonica.com/



--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: <mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--



<Prev in Thread] Current Thread [Next in Thread>