Re: [xsl] Joining list fragments

2020-05-09 14:07:22
Folks,

That piece of code by Gerrit is beautiful: 
https://github.com/gimsieke/join-list-fragments

My personal original strategy was to pick the beginning of a broken list and 
then walk the following-sibling axis to collect all "joinable" list fragments. 
As Gerrit mentioned early on:

"This looks like a nested group-starting-with / group-adjacent to me at first 
glance."


I was willing to listen. But I could never have come up with this beauty.

In a private mail Gerrit confessed his mantra: »Everything that looks remotely 
like grouping has to be grouped!« What a fitting motto for the King of 
Grouping! (It should be in Latin, though. Or perhaps a native speaker can come 
up with better phrasing.)

The main template uses group-starting-with to catch the broken lists, and a 
nested group-adjacent to select the "joinable" parts from each current-group(). 
BTW, has anyone seen innermost() or outermost() used before?

The main part is in template "collect", which uses group-starting-with, a 
nested group-adjacent, and another group-starting-with with recursion for more 
list levels. And this is the most beautiful solution, which I could never have 
imagined: There is exactly one XPath expression using element and attribute 
names from the source document. The rest is logic and evaluating list item 
levels. Wow!

Thank you very much,

- Michael MH


On 06.05.2020 at 08:26, Imsieke, Gerrit, le-tex 
gerrit(_dot_)imsieke(_at_)le-tex(_dot_)de 
<xsl-list-service(_at_)lists(_dot_)mulberrytech(_dot_)com> wrote:

Ok, it turned out that some recursion is necessary.

Michael (Müller-Hillebrand) sent me an updated test file and the expected 
results. As one can expect, the problem is even more complex than Michael's 
initial sample input suggests, due to the merging on multiple levels that is 
required.

But today I take pride in saying that the self-declared king of grouping (I) 
was able to solve it!

https://github.com/gimsieke/join-list-fragments

The solution is remotely similar to what I presented about "upward 
projection" at XML Prague 2019 
(https://subversion.le-tex.de/common/presentations/2019-02-09_xmlprague_xslt-upward-projection/slides/)
 in that leaf nodes are grouped and the surrounding subtree is later 
reconstructed.

If you run the example (apply xsl/join-list-fragments to test/sample_html.xml 
in #default mode), you will notice that a file debug1_atomic-items.xml is 
created. This is a somewhat flattened input that I looked at intensely and 
that I gradually modified when I set up the grouping. I can't stress enough 
how much looking at this semi-flattened file and the ad-hoc attributes that I 
created informed the evolution of the grouping. Without this debugging 
output, it would have been too complex to understand what is going on and 
what should happen in the recursive grouping.

The debugging output has the following additional attributes:

list-level: 0 for uninteresting elements; the attribute is absent for elements 
that need to be collected with the preceding list item; any positive value 
indicates the nesting depth at which a new list item will be created for the 
group starting at that element

start: 'true' for an element that will become the first item of a (re-) 
created top-level ol element

start-level: the depth at which a re-created ol element will be created (2 
indicates an ol/li/ol). This attribute is not used for top-level lists, where 
@start is used.

It may be that an additional recursion is necessary if there is more 
variation than start-level="2". Maybe MMH can create more input that also 
contains such a case, but it might well be that it isn't relevant for their 
problem.

I might eventually add more documentation to the XSLT. At this stage, even 
with what I wrote above, it's a bit obscure -- write-only code -- which often 
is the case for recursive grouping. Running it in oXygen debugger with 
appropriate breakpoints and with inspecting current-group() might further 
illustrate how it works.

Gerrit



We want to join list fragments and some content in between them. An HTML-ish 
version of the input looks like this:

<div>
  <h2 id="E2">Item with content to be joined follows div to collect</h2>
  <div>
    <ol data-meta="listlevel=start">
      <li>
        <p>1st item</p>
      </li>
    </ol>
    <div class="box" data-meta="collect">
      <p>Hint</p>
    </div>
    <ol data-meta="listlevel=continue">
      <li data-meta="listitem=continue">
        <p>Para ff</p>
      </li>
      <li>
        <p>2nd item</p>
      </li>
    </ol>
    <p>Other arbitrary content</p>
  </div>
</div>

Every broken list sequence starts with data-meta="listlevel=start". A list 
or a list item that is supposed to be joined with the start list is marked 
using data-meta="listlevel=continue" or data-meta="listitem=continue", 
respectively. There can be any number of collect items between lists, and 
there can be multiple continue lists, but it is guaranteed that whatever 
needs to be collected will end with a list. In DTD content model notation: 
startList, (collectItem*, continueList)+
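
The content model above can be turned into a quick sanity check on the input. The following is only a sketch in XSLT 2.0, not part of anyone's actual solution. It assumes that the markers appear exactly as in the sample (ol elements carrying data-meta="listlevel=start" or "listlevel=continue", and any element carrying data-meta="collect"). The one-letter tokens S, L, C and x are made up for this illustration.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:output method="text"/>

  <xsl:template match="/">
    <!-- Look at every element that has at least one marked child. -->
    <xsl:for-each select="//*[*/@data-meta]">
      <!-- One letter per child element (hypothetical token names):
           S = start list, L = continue list,
           C = collect item, x = anything else. -->
      <xsl:variable name="tokens" select="string-join(
          for $e in * return
            if ($e/self::ol[@data-meta = 'listlevel=start']) then 'S'
            else if ($e/self::ol[@data-meta = 'listlevel=continue']) then 'L'
            else if ($e/@data-meta = 'collect') then 'C'
            else 'x', '')"/>
      <!-- Remove every sequence that fits startList, (collectItem*, continueList)+.
           A C or L that is left over does not belong to any such sequence. -->
      <xsl:if test="matches(replace($tokens, 'S(C*L)+', ''), '[CL]')">
        <xsl:value-of
          select="concat('Stray fragment in ', name(), ': ', $tokens, '&#10;')"/>
      </xsl:if>
    </xsl:for-each>
  </xsl:template>

</xsl:stylesheet>
```

For the sample input above the children of the inner div map to SCLx, so this check should report nothing. A collect item that is not followed by a continue list, or a continue list without a preceding start list, would be reported.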

The lists are not limited to a single level. Fortunately, if there is a 
"listitem=continue" in a continue list, it is guaranteed to be at the same 
level at which the previous list ends.

The task is to add to the last item of the previous list:
* all content marked "collect" between the lists; other content would break 
the process
* content of the next list’s first list item if marked "listitem=continue"
The remaining content of each continue list would be added as additional 
items to the start list.

The desired result for the input data above would look like this:

<div>
  <h2 id="E2">Item with content to be joined follows div to collect</h2>
  <div>
    <ol data-meta="listlevel=start">
      <li>
        <p>1st item</p>
        <div class="box" data-meta="collect">
          <p>Hint</p>
        </div>
        <p>Para ff</p>
      </li>
      <li>
        <p>2nd item</p>
      </li>
    </ol>
    <p>Other arbitrary content</p>
  </div>
</div>
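
For the single-level case shown in this sample, the join could be sketched roughly as follows. This is not the stylesheet from the join-list-fragments repository, only a minimal XSLT 2.0 illustration of the group-starting-with / group-adjacent idea. It assumes that the container of the list fragments has element-only content, that the elements are in no namespace, and that the markers appear exactly as above. It does not handle continue items on nested list levels, which is where the recursion mentioned in this thread comes in.

```xml
<xsl:stylesheet version="2.0"
  xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <!-- Identity template: copy whatever is not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@* | node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- Any element with a start list among its children.
       Assumption: element-only content, so selecting * drops
       nothing but whitespace. -->
  <xsl:template match="*[ol[@data-meta = 'listlevel=start']]">
    <xsl:copy>
      <xsl:apply-templates select="@*"/>
      <!-- 1. Every start list opens a new group of siblings. -->
      <xsl:for-each-group select="*"
          group-starting-with="ol[@data-meta = 'listlevel=start']">
        <!-- 2. Split each group into runs of joinable and other elements. -->
        <xsl:for-each-group select="current-group()"
            group-adjacent="exists(
              self::ol[@data-meta = ('listlevel=start', 'listlevel=continue')]
              | self::*[@data-meta = 'collect'])">
          <xsl:choose>
            <!-- A run that begins with a start list is rebuilt as one list. -->
            <xsl:when test="self::ol[@data-meta = 'listlevel=start']">
              <xsl:copy>
                <xsl:apply-templates select="@*"/>
                <!-- 3. Flatten the run to list items and collect items
                     (document order). Each regular li opens a new item;
                     collect items and continue items join the item
                     before them. -->
                <xsl:for-each-group
                    select="current-group()/(self::ol/li
                            | self::*[@data-meta = 'collect'])"
                    group-starting-with="li[not(@data-meta = 'listitem=continue')]">
                  <xsl:copy>
                    <xsl:apply-templates select="@*"/>
                    <!-- Content of each li, and each collect item as a whole. -->
                    <xsl:apply-templates
                        select="current-group()/(self::li/node()
                                | self::*[not(self::li)])"/>
                  </xsl:copy>
                </xsl:for-each-group>
              </xsl:copy>
            </xsl:when>
            <!-- Everything else is passed through unchanged. -->
            <xsl:otherwise>
              <xsl:apply-templates select="current-group()"/>
            </xsl:otherwise>
          </xsl:choose>
        </xsl:for-each-group>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

The outer group-starting-with is what keeps two broken lists that directly follow each other apart; group-adjacent alone would merge them into one run. Apart from whitespace between the children of the container, the result for the sample input should match the desired output shown above.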