Re: [xsl] How to properly use Key elements

2013-10-15 18:35:23
On Oct 14, 2013, at 2:58 PM, Wendell Piez <wapiez(_at_)wendellpiez(_dot_)com> wrote:

Hi Ted,

First, a caution. Before you find yourself thinking about Muenchian
grouping in 2013 you should ask whether you really have no option but
to use XSLT 1.0. XSLT 2.0 is far superior in many respects, including
the availability of xsl:for-each-group. I acknowledge that there may
be other reasons to explore Muenchian grouping besides having no
choice, so I don't want to discourage your question. But why try to
bake bread on an open fire when stoves are readily available, etc.

Indeed, were 2.0 an option, I'd definitely be using it! As far as I know I'm 
stuck with a 1.0 implementation, though I could be mistaken. We're doing the 
transformations through PHP (5.3), and I've always assumed libxslt (or 
whatever it's called) was a 1.0-only implementation. Please: prove me wrong and 
save me this extra work!

So:

<xsl:key
       name="ports-by-ship"
       match="td[position() = (count(.) - 1)]"
       use="tr[count(td) = 4]/td[position() = 1]"
/>

is legal, but wrong. It won't work because

a. count(.) will always return '1' so your key will never match,
because position() will never be 0.

That should have been obvious. Thanks for pointing that out!
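
For reference, a minimal sketch of what the pattern was presumably meant to 
express, namely "the second-to-last td in its row". In XPath 1.0, count(.) 
counts only the context node, whereas last() gives the size of the current 
node list. The key name and the use value here are hypothetical and not part 
of the original stylesheet:

```xml
<!-- Hypothetical key: indexes the second-to-last td of each row by its own text. -->
<!-- last() is the number of td children in the row, so last() - 1 is the penultimate one. -->
<xsl:key
       name="penultimate-cells"
       match="td[position() = last() - 1]"
       use="."
/>
```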

b. use='tr[etc]' will use the values of (certain) 'tr' children of
your matched 'td' elements as key values, but 'td' never has 'tr'
children, so even if the key matched, you'd have empty string key
values.

So, to clarify, the expression in the USE attribute must select a child 
element or attribute of the node matched by the MATCH attribute. Is this correct?
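
A small sketch of how the two attributes relate, assuming the table layout 
described in this thread: MATCH is a pattern that decides which nodes get 
indexed, and USE is an expression evaluated with each matched node as its 
context node. The expression may walk any axis from there, not only the child 
axis. The key name is made up for illustration:

```xml
<!-- Hypothetical key: indexes every td by the text of the first td in its own row. -->
<!-- "../td[1]" steps from the matched td up to its parent tr, then down to the first td. -->
<xsl:key
       name="cells-by-first-cell"
       match="td"
       use="../td[1]"
/>
```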

To devise a correct solution, I suggest

1. Considering whether you can't use XSLT 2.0 for-each-group.

Can't. :-(

2. If not, consider whether doing this in two passes would simplify
the problem. (In the first pass you would label the td elements with
their information types, simplifying the declaration of the key for
the second pass.)

Ideal but in this particular case I'm doing this for a client and such an 
approach *might* imply a change to their base processing system. I do think, 
though, that I could probably create a variable (via exslt extensions) 
consisting of a fragment marked up as suggested and then operate on the 
fragment.

3. If neither of these, please clarify the logic whereby you know
which td is of which type.

It may not have been clear in my sample markup so let me put it this way: we 
are processing HTML tables of data. Each table contains "sections". The start 
of each section is indicated by the presence of 4 TD elements in the first row. 
Other rows only have 2 or 3 TD elements. The first TD element in the first row 
has a ROWSPAN attribute running the length of the rows for that section. This 
TD element has a value that represents the group name (is what we'd like to 
group by).

Given your clarification about how keys work, it sounds like I need something 
like this:

<xsl:key
       name="ports-by-ship"
       match="tr"
       use="tr[count(td) = 4]/td[position() = 1]"
/>

but I have to think that this would only give me the first row of TD elements, 
and not all of those that follow. I suspect that I might need something like 
"following-sibling::td" in the MATCH attribute or maybe…

<xsl:key
       name="ports-by-ship"
       match="td"
       use="td[ancestor::tr[count(td) = 4]][1]"
/>

and even then I suspect I'll get ALL the ancestors instead of just the most 
recent…
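
One possible way to pick only the nearest section header, sketched under the 
assumption that all rows of a section are siblings inside the same tbody: take 
the union of the row itself (when it has 4 td children) and its preceding 
sibling rows with 4 td children, then keep the last node in document order. 
The key name rows-by-ship is hypothetical:

```xml
<!-- Hypothetical key: indexes every row by the first td of its section header row. -->
<!-- A parenthesized union is in document order, so [last()] is the nearest header, -->
<!-- which is the row itself when the row is a section header. -->
<xsl:key
       name="rows-by-ship"
       match="tbody/tr"
       use="(self::tr[count(td) = 4] | preceding-sibling::tr[count(td) = 4])[last()]/td[1]"
/>
```

A row that appears before the first 4-cell row gets no key value and is simply 
left out of the index.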

At a higher level, I think the essence of the problem here is that you
aren't accounting for evaluation context properly in devising your
XPaths. Some review of the design and functionality of keys (apart
from how to do Muenchian grouping) would be effort well spent. Keys
are extremely useful in XSLT 2.0 as well, though no longer so
necessary for grouping.

I've read and re-read everything I could find about keys but it's just one of 
those things that for me, takes a while to sink in, especially if I haven't 
seen it in a while. I fully understand their power and have used them 
successfully in the past, but I'm just a bit lost on this one.

A sample of required output for your input might be helpful in
presenting your problem to us.

Correct. I had forgotten to provide the desired output. Let me try this again.

Input:
<table>
  <thead>
    <tr>
      <th>Ship</th>
      <th>Route</th>
      <th>Port</th>
      <th>Date</th>
    </tr>
  </thead>
  <tbody>
    <tr>
      <td rowspan="3">Titanic</td>
      <td rowspan="2">Pacific South</td>
      <td>San Francisco</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td>San Diego</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td>Acapulco</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td rowspan="2">Pacific Central</td>
      <td>Acapulco</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td>Punteras Cantón</td>
      <td>dd/mm/yyyy</td>
    </tr>
    <tr>
      <td>Panamá</td>
      <td>dd/mm/yyyy</td>
    </tr>
  </tbody>
</table>

Output:
<Titanic>
    <port name="San Francisco">dd/mm/yyyy</port>
    <port name="San Diego">dd/mm/yyyy</port>
    <port name="Acapulco">dd/mm/yyyy</port>
    <port name="Acapulco">dd/mm/yyyy</port>
    <port name="Punteras Cantón">dd/mm/yyyy</port>
    <port name="Panamá">dd/mm/yyyy</port>
</Titanic>

The actual input is a bit more complex than what I've shown here, but I think 
I've presented the problem faithfully. It is important to note, though, that I 
can't key off of the ROWSPAN attribute as there are other TD elements with this 
attribute set that are NOT part of the section header and the value of the 
ROWSPAN attribute can vary significantly (including being only 1 row). The only 
thing that makes the section header unique is the number of TD elements in the 
row.
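
A sketch of a key-based (Muenchian) XSLT 1.0 stylesheet for the sample above, 
not a tested solution for the real data. It assumes that a section starts at 
any row with exactly 4 td children, that the port is always the second-to-last 
td and the date the last td, and that the ship name is a valid XML element 
name (a name containing a space would make xsl:element fail). The key name 
rows-by-ship is hypothetical:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="xml" indent="yes"/>

  <!-- Index every body row by the first td of the nearest 4-cell row at or before it. -->
  <xsl:key
         name="rows-by-ship"
         match="tbody/tr"
         use="(self::tr[count(td) = 4] | preceding-sibling::tr[count(td) = 4])[last()]/td[1]"
  />

  <xsl:template match="/">
    <!-- Muenchian step: visit only the first row recorded under each ship name. -->
    <xsl:for-each select="//tbody/tr[count(td) = 4]
                          [generate-id() = generate-id(key('rows-by-ship', td[1])[1])]">
      <xsl:element name="{normalize-space(td[1])}">
        <!-- All rows of the section, header row included, in document order. -->
        <xsl:for-each select="key('rows-by-ship', td[1])">
          <port name="{td[last() - 1]}">
            <xsl:value-of select="td[last()]"/>
          </port>
        </xsl:for-each>
      </xsl:element>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

Because the key value is the ship name, two separate sections that carry the 
same name would be merged into one element. With more than one ship the result 
has several top-level elements, so a wrapper element would be needed to get a 
well-formed document.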

And to be clear, I've managed to get the output I want, but not by using keys. 
Fortunately my input is tiny, so processing time really isn't an issue, but as 
we all know, that could always change (and it's better to do things right to 
begin with).

Thanks so much for your considered response.

Sincerely,

Ted Stresen-Reuter