xsl-list
[Top] [All Lists]

Re: [xsl] following-sibling is evil

2014-07-06 05:29:07
Any reasonably intelligent processor will do an "early exit" on such an 
expression, which means it will only scan the following-sibling axis until it 
finds the first node that matches the predicates.

However, this kind of logic is still O(n^2), and the performance is especially 
bad if there are few or no duplicates.

It's much better to use constructs such as key() or distinct-values() or 
xsl:for-each-group for this kind of processing, though it may be difficult to 
achieve that in Schematron.

Michael Kay
Saxonica
mike(_at_)saxonica(_dot_)com
+44 (0118) 946 5893



On 6 Jul 2014, at 10:34, Costello, Roger L. costello(_at_)mitre(_dot_)org 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Hi Folks,

I have a list of <Book> elements. Each Book contains Title, Author, and Genre.

For each Book I wish to determine if there is a following Book with the same 
Author and Genre.

I am using Schematron. This XSLT is generated from my Schematron rule:

   <xsl:template match="Book">
       <xsl:variable name="currentAuthor" select="Author" />
       <xsl:variable name="currentGenre" select="Genre" />
       <xsl:if test="following-sibling::Book[Title eq $currentAuthor][Genre 
eq $currentGenre][1]">
           <xsl:text>There is another Book with the same Author and 
Genre</xsl:text>
       </xsl:if>
   </xsl:template>

Notice the XPath expression:

following-sibling::Book[Title eq $currentAuthor][Genre eq $currentGenre][1]

As I understand it, an XPath processor will evaluate that XPath expression 
like so:

Step 1. Gather up all the following sibling Book elements. Let's denote the 
resulting set by S1.

Step 2. Filter S1 by eliminating those Books that don't satisfy this 
predicate: [Title eq $currentAuthor]. Let's denote the resulting set by S2.

Step 3. Filter S2 by eliminating those Books that don't satisfy this 
predicate: [Genre eq $currentGenre]. Let's denote the resulting set by S3.

Step 4. Filter S3 by eliminating all Books except the first.

Do I correctly understand how an XPath processor will evaluate the XPath 
expression?

Suppose the list contains 100,000 Books.

Consider evaluating the XPath expression for the first Book in the list. The 
XPath processor must collect 99,999 Books and apply the steps to them. 

For the second Book the XPath processor must collect 99,998 Books and apply 
the steps to them. 

For the third Book the XPath processor must collect 99,997 Books and apply 
the steps to them.

And so forth.

Yikes!

That XSLT program will take a very long time to execute!

Is there a way to modify my XPath so that the XPath processor stops as soon 
as it gets to a Book that matches all the predicates?

/Roger

--~----------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
EasyUnsubscribe: http://lists.mulberrytech.com/unsub/xsl-list/1167547
or by email: xsl-list-unsub(_at_)lists(_dot_)mulberrytech(_dot_)com
--~--

<Prev in Thread] Current Thread [Next in Thread>