On Oct 14, 2013, at 2:58 PM, Wendell Piez <wapiez(_at_)wendellpiez(_dot_)com>
wrote:
Hi Ted,
First, a caution. Before you find yourself thinking about Muenchian
grouping in 2013 you should ask whether you really have no option but
to use XSLT 1.0. XSLT 2.0 is far superior in many respects, including
the availability of xsl:for-each-group. I acknowledge that there may
be other reasons to explore Muenchian grouping besides having no
choice, so I don't want to discourage your question. But why try to
bake bread on an open fire when stoves are readily available, etc.
Indeed, were 2.0 an option, I'd definitely be using it! Unfortunately I'm stuck
with a 1.0 implementation, but I could be mistaken. We're doing the
transformations through PHP (5.3). I've just always assumed libxslt (or
whatever it's called) was a 1.0-only implementation. Please: prove me wrong and
save me this extra work!
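For what it's worth, the processor can be asked directly which XSLT version it implements. The sketch below relies only on the standard system-property() function from XSLT 1.0 and can be run against any well-formed input document. A 1.0 processor is required by the spec to report 1.0 for xsl:version; the vendor string is whatever the implementation chooses to return.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- Print the XSLT version and vendor reported by the processor itself -->
  <xsl:template match="/">
    <xsl:text>XSLT version: </xsl:text>
    <xsl:value-of select="system-property('xsl:version')"/>
    <xsl:text>&#10;Vendor: </xsl:text>
    <xsl:value-of select="system-property('xsl:vendor')"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```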
So:
<xsl:key
    name="ports-by-ship"
    match="td[position() = (count(.) - 1)]"
    use="tr[count(td) = 4]/td[position() = 1]"
/>
is legal, but wrong. It won't work because
a. count(.) will always return '1' so your key will never match,
because position() will never be 0.
That should have been obvious. Thanks for pointing that out!
b. use='tr[etc]' will use the values of (certain) 'tr' children of
your matched 'td' elements as key values, but 'td' never has 'tr'
children, so even if the key matched, you'd have empty string key
values.
So, to clarify, the USE expression must select a child element or attribute of
the node matched by the MATCH pattern. Is this correct?
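A small sketch of how the two attributes relate, under the XSLT 1.0 rule that the USE expression is evaluated once per matched node, with that node as the context node. It is not limited to children: any XPath expression relative to the matched node is allowed, including other axes. The key name rows-by-first-cell is made up for illustration, and the stylesheet assumes the sample table from this thread as input.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- The context for 'use' is each matched tr, so td[1] means
       "this row's own first cell". Writing tr/td[1] here would look
       for tr children of a tr and select nothing. -->
  <xsl:key name="rows-by-first-cell" match="tr" use="td[1]"/>

  <xsl:template match="/">
    <!-- Count the rows whose own first cell reads 'Titanic' -->
    <xsl:value-of select="count(key('rows-by-first-cell', 'Titanic'))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

With the sample table this should report 1, since only the first body row carries Titanic as its own first cell.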
To devise a correct solution, I suggest
1. Considering whether you can't use XSLT 2.0 for-each-group.
Can't. :-(
2. If not, consider whether doing this in two passes would simplify
the problem. (In the first pass you would label the td elements with
their information types, simplifying the declaration of the key for
the second pass.)
Ideal, but in this particular case I'm doing this for a client, and such an
approach *might* imply a change to their base processing system. I do think,
though, that I could probably create a variable (via EXSLT extensions)
holding a fragment marked up as suggested and then operate on that
fragment.
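A rough sketch of that variable-based idea, assuming the processor supports the EXSLT exsl:node-set() function (libxslt generally does, but the PHP build should be checked). The element names labelled, row, ship, port and date are invented for illustration, and the rule that the port and date are the last two cells of each row is taken from the sample markup only.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:exsl="http://exslt.org/common"
    exclude-result-prefixes="exsl">
  <xsl:output method="xml" indent="yes"/>

  <!-- Pass 1: rebuild each body row with cells labelled by type -->
  <xsl:variable name="labelled-rtf">
    <xsl:for-each select="//tbody/tr">
      <row>
        <!-- Only a 4-cell row carries the ship (group) name -->
        <xsl:if test="count(td) = 4">
          <ship><xsl:value-of select="td[1]"/></ship>
        </xsl:if>
        <port><xsl:value-of select="td[last() - 1]"/></port>
        <date><xsl:value-of select="td[last()]"/></date>
      </row>
    </xsl:for-each>
  </xsl:variable>

  <!-- Convert the result tree fragment into a navigable node-set -->
  <xsl:variable name="labelled" select="exsl:node-set($labelled-rtf)"/>

  <!-- Pass 2 would work on $labelled; here it is simply copied out.
       Note: key() searches the document of the context node, so a
       second-pass key lookup needs the context switched first, e.g.
       with xsl:for-each select="$labelled". -->
  <xsl:template match="/">
    <labelled>
      <xsl:copy-of select="$labelled/row"/>
    </labelled>
  </xsl:template>
</xsl:stylesheet>
```

Everything happens inside one stylesheet, so the client's processing pipeline would not need a second transformation step.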
3. If neither of these, please clarify the logic whereby you know
which td is of which type.
It may not have been clear in my sample markup so let me put it this way: we
are processing HTML tables of data. Each table contains "sections". The start
of each section is indicated by the presence of 4 TD elements in the first row.
Other rows only have 2 or 3 TD elements. The first TD element in the first row
has a ROWSPAN attribute running the length of the rows for that section. This
TD element has a value that represents the group name (this is what we'd like
to group by).
Given your clarification about how keys work, it sounds like I need something
like this:
<xsl:key
    name="ports-by-ship"
    match="tr"
    use="tr[count(td) = 4]/td[position() = 1]"
/>
but I have to think that this would only give me the first row of TD elements,
and not all of those that follow. I suspect that I might need something like
"following-sibling::td" in the MATCH attribute or maybe…
<xsl:key
    name="ports-by-ship"
    match="td"
    use="td[ancestor::tr[count(td) = 4]][1]"
/>
and even then I suspect I'll get ALL the ancestors instead of just the most
recent…
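One possible shape for such a key, offered as a sketch rather than a tested solution: match every body row, and let USE walk back to the nearest row (the row itself or a preceding sibling) that has four TD elements. Taking the union in parentheses puts the candidates in document order, so [last()] picks only the most recent section header instead of all of them. The key name rows-by-ship is hypothetical.

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="text"/>

  <!-- For each body row: collect this row and its earlier siblings,
       keep those with four cells, take the nearest one, and use the
       text of its first cell as the key value. -->
  <xsl:key name="rows-by-ship" match="tbody/tr"
      use="(self::tr | preceding-sibling::tr)[count(td) = 4][last()]/td[1]"/>

  <xsl:template match="/">
    <!-- Count every row filed under the 'Titanic' section -->
    <xsl:value-of select="count(key('rows-by-ship', 'Titanic'))"/>
    <xsl:text>&#10;</xsl:text>
  </xsl:template>
</xsl:stylesheet>
```

Against the sample table this should report 6: the header row plus the five rows that follow it.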
At a higher level, I think the essence of the problem here is that you
aren't accounting for evaluation context properly in devising your
XPaths. Some review of the design and functionality of keys (apart
from how to do Muenchian grouping) would be effort well spent. Keys
are extremely useful in XSLT 2.0 as well, though no longer so
necessary for grouping.
I've read and re-read everything I could find about keys but it's just one of
those things that for me, takes a while to sink in, especially if I haven't
seen it in a while. I fully understand their power and have used them
successfully in the past, but I'm just a bit lost on this one.
A sample of required output for your input might be helpful in
presenting your problem to us.
Correct. I had forgotten to provide the desired output. Let me try this again.
Input:
<table>
  <thead>
    <tr>
      <th>Ship</th>
      <th>Route</th>
      <th>Port</th>
      <th>Date</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td rowspan="3">Titanic</td>
      <td rowspan="2">Pacific South</td>
      <td>San Francisco</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td>San Diego</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td>Acapulco</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td rowspan="2">Pacific Central</td>
      <td>Acapulco</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td>Punteras Cantón</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td>Panamá</td>
      <td>dd/mm/yyyy</td>
    </tr>
  </tbody>
</table>
Output:
<Titanic>
  <port name="San Francisco">dd/mm/yyyy</port>
  <port name="San Diego">dd/mm/yyyy</port>
  <port name="Acapulco">dd/mm/yyyy</port>
  <port name="Acapulco">dd/mm/yyyy</port>
  <port name="Punteras Cantón">dd/mm/yyyy</port>
  <port name="Panamá">dd/mm/yyyy</port>
</Titanic>
The actual input is a bit more complex than what I've shown here, but I think
I've presented the problem faithfully. It is important to note, though, that I
can't key off of the ROWSPAN attribute as there are other TD elements with this
attribute set that are NOT part of the section header and the value of the
ROWSPAN attribute can vary significantly (including being only 1 row). The only
thing that makes the section header unique is the number of TD elements in the
row.
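Putting that rule (a section starts at a row with four TD elements) together with Muenchian grouping, a key-based stylesheet might look roughly like the sketch below. Several things in it are assumptions: the key name rows-by-ship and the wrapper element ships are invented, the port and date are taken to be the last two cells of every row, and the ship name is assumed to be usable as an XML element name (a name containing a space would fail).

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:output method="xml" indent="yes"/>

  <!-- File every body row under the ship named in the nearest
       4-cell row at or above it -->
  <xsl:key name="rows-by-ship" match="tbody/tr"
      use="(self::tr | preceding-sibling::tr)[count(td) = 4][last()]/td[1]"/>

  <xsl:template match="/">
    <!-- Hypothetical wrapper so several ships still give one root -->
    <ships>
      <!-- Muenchian step: of the section header rows, keep only the
           one that is first in its key group -->
      <xsl:for-each select="//tbody/tr[count(td) = 4]
          [generate-id() = generate-id(key('rows-by-ship', td[1])[1])]">
        <!-- Assumes the ship name is a valid element name -->
        <xsl:element name="{td[1]}">
          <!-- Emit one port per row in this ship's group -->
          <xsl:for-each select="key('rows-by-ship', td[1])">
            <port name="{td[last() - 1]}">
              <xsl:value-of select="td[last()]"/>
            </port>
          </xsl:for-each>
        </xsl:element>
      </xsl:for-each>
    </ships>
  </xsl:template>
</xsl:stylesheet>
```

Because the key depends only on the number of TD elements per row, the ROWSPAN values never enter into it. Keys are document-wide, though, so two sections (or two tables) naming the same ship would be merged into one group.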
And to be clear, I've managed to get the output I want, but not by using keys.
Fortunately my input is tiny so processing times really aren't an issue, but as
we all know, that could always change (and it's better to do things right to
begin with).
Thanks so much for your considered response.
Sincerely,
Ted Stresen-Reuter