RE: [xsl] xsl:for-each-group: start groups depending on number of group members?

2007-04-30 06:28:25
It's such a high-level description of the problem that it's hard to be
specific about how to tune the performance, but instinctively my reaction
would be to look for a multi-pass approach: preprocess the data to compute
properties of each node that will make the subsequent grouping operation
simpler and more efficient.
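
A minimal sketch of what such a multi-pass approach might look like, based on the demo template quoted below. It assumes XSLT 2.0 and that all group members are sibling elements. The mode name "annotate", the attribute name "starts-group" and the variable names are made up for illustration, and the simple test in the first pass only stands in for the real, more complicated conditions:

```xslt
<xsl:stylesheet version="2.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Pass 1 (hypothetical mode "annotate"): copy each group member and
       flag the ones that are allowed to start a group. The test is a
       placeholder for the real conditions. -->
  <xsl:template match="*" mode="annotate">
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:if test="self::B or self::sub[not(preceding-sibling::A)]">
        <xsl:attribute name="starts-group">yes</xsl:attribute>
      </xsl:if>
      <xsl:copy-of select="node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Invoked from inside an outer xsl:for-each-group (not shown),
       as in the demo template of the original question. -->
  <xsl:template match="B" mode="groups_at_root_level">
    <B_new>
      <!-- Copy the group into a temporary tree: inside it,
           preceding-sibling sees only members of this group. -->
      <xsl:variable name="members">
        <xsl:copy-of select="current-group()"/>
      </xsl:variable>
      <!-- Evaluate the group-start condition once per member. -->
      <xsl:variable name="annotated">
        <xsl:apply-templates select="$members/*" mode="annotate"/>
      </xsl:variable>
      <!-- Pass 2: the grouping pattern only tests the precomputed flag. -->
      <xsl:for-each-group select="$annotated/*"
          group-starting-with="*[@starts-group]">
        <xsl:apply-templates select="current-group()"/>
      </xsl:for-each-group>
    </B_new>
  </xsl:template>

</xsl:stylesheet>
```

Note that the second pass works on copies, so the templates applied there no longer see the original ancestors and will find the extra attribute on some elements. Whether this is actually faster would have to be measured on the real data.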

Michael Kay
http://www.saxonica.com/ 

-----Original Message-----
From: Yves Forkl [mailto:Y(_dot_)Forkl(_at_)srz(_dot_)de] 
Sent: 30 April 2007 14:13
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: Re: [xsl] xsl:for-each-group: start groups depending 
on number of group members?

Wendell,

you wrote:

> While you can't restrict preceding-sibling to look only at members of
> the current group, you might be able to get somewhere with either of
> these approaches:
>
> * The XPath 2.0 "intersect" operator can return those members common
> to two sequences of nodes, so (preceding-sibling::node() intersect
> current-group()) will return just those members of the current group
> that are on the preceding-sibling axis relative to the context.

Thank you very much for this hint! The intersection of the 
group members and those not having a preceding sibling of a 
specific sort is what I was looking for. This makes my demo 
template look like:

<xsl:template match="B" mode="groups_at_root_level">
   <B_new>
     <xsl:variable name="this_group" select="current-group()"/>
     <xsl:for-each-group
       select="$this_group"
       group-starting-with="
         B|sub[not($this_group intersect preceding-sibling::A)]">
       <xsl:apply-templates select="current-group()"/>
     </xsl:for-each-group>
   </B_new>
</xsl:template>


> * If, rather than using grouping constructs to select from the nodes
> in the source, you processed them into temporary trees, you could
> construct those trees exactly the way you wanted, including nesting
> elements in such a way that preceding-sibling would be useful. Such as:
>
> <xsl:variable name="intermediate">
>   <xsl:for-each-group select="*" group-by=".">
>     <group>
>       <xsl:copy-of select="current-group()"/>
>     </group>
>   </xsl:for-each-group>
> </xsl:variable>
>
> <xsl:for-each select="$intermediate/group">
>   ... inside each group element, members of the group appear as
>   siblings ...
> </xsl:for-each>

That seems to be a neat approach, too, at least from a general point
of view. However, in my case, the existence of preceding siblings
determines whether an item is allowed to start a group at all. So
"unconditionally" starting a group on every instance of an element
would yield a number of groups that would afterwards have to be
dissolved into members of other groups, because only looking at the
siblings of the group starter reveals that it should not have taken
that role. Running xsl:for-each over the "group" elements would then be
quite difficult: you can't process any group until you have examined
them all, because you need to make sure that you don't miss any "late"
member from a group that had to be dissolved...

Unless you have "unstable" groups, this approach is 
definitely very interesting.


> But I'm not sure either of these is actually necessary here. You have
> only presented your problem in fragmentary form, so it's hard to say;
> but to get the result you say you want, I'd do something much simpler:
>
> [snip]

Thank you (as well as Andrew) for proposing simple and elegant
solutions that accomplish the basic grouping task. Unfortunately, I
can't use them because the grouping I'm doing is far more complicated.
(E.g. repeated grouping based on the same element; grouping highly
depends on preceding instances; dynamic creation of multiple group
containers etc.) Trying to leave out the less relevant details, I
crafted a demo that would just show my minimal requirements, however
strange they might seem. Sorry for the confusion.

Let me elaborate on my grouping criteria. Rather than just matching an
element, I always need it to meet some condition, so instead of:

group-starting-with="
   B|sub[not($this_group intersect preceding-sibling::A)]"

I actually have something more like:

group-starting-with="
   B|sub[$condition1 and
         not($this_group intersect
             preceding-sibling::A[$condition2])]"

What I am curious about is how I could optimize my stylesheet's runtime
behaviour (I'm using Saxon 8.8) by computing some values only once,
e.g. using a variable declared before xsl:for-each-group, given that:

- the negated expression appears several times within the attribute 
value (think of it like duplicating the above code for "sub"), while 
$condition1 is rather singular

- the number of instances matching unconstrained 
preceding-sibling::A[$condition2] is rather large, whereas within the 
grouping candidates it is small or zero

- the value of preceding-sibling::A[$condition2] depends, as far as I
have understood, on the item that xsl:for-each-group currently
examines, so it can't sensibly be evaluated beforehand

Any ideas on this?

   Yves


--~------------------------------------------------------------------
XSL-List info and archive:  http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail: 
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--


