Continuing my grouping issues:
XSLT 2.0 handles grouping on a node level quite conveniently. However,
adding structure to legacy, rather flat content (i.e., character runs)
still poses challenges in grouping. The following applies mainly to
document-centric (as opposed to data-centric) XML.
__ EXAMPLE 1 __
<p>Note #4: Don't tumble dry your pet.</p>
TASK:
Group the leading paragraph text "Note #4:" using <marker> so that the
result looks like (indented for readability):
<p><marker>Note #4:</marker>
Don't tumble dry your pet.</p>
SOLUTION:
The solution is easy, as we can just work on the text without having to
worry about markup:
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                version="2.0">
  <xsl:template match="p">
    <xsl:copy>
      <xsl:analyze-string select="." regex="^Note\s#\d+:">
        <xsl:matching-substring>
          <marker>
            <xsl:value-of select="." />
          </marker>
        </xsl:matching-substring>
        <xsl:non-matching-substring>
          <xsl:value-of select="normalize-space(.)" />
        </xsl:non-matching-substring>
      </xsl:analyze-string>
    </xsl:copy>
  </xsl:template>
</xsl:stylesheet>
However, in "real" documents, you will likely have something like this:
__ EXAMPLE 2 __
<p><b>Note</b> <i>#4</i>: Don't tumble dry your pet.</p>
TASK:
Group the leading paragraph text "Note #4:" including any contained
markup using <marker> so that the result looks like:
<p><marker><b>Note</b> <i>#4</i>:</marker>
Don't tumble dry your pet.</p>
SOLUTION:
Here it starts to get really complicated. The text now contains markup
that we need to retain, but the text run must still be matched at the
<p> level (so that we can test for "starts with pattern" using '^').
<xsl:analyze-string/> works on a string and discards the markup, so it
does not seem to do the trick in this case.
A worst-case scenario of course would be:
__ EXAMPLE 3 __
<p><ul><b>Note</b> <i>#4</i>: Don't tumble dry your pet</ul>.</p>
TASK:
Group the leading paragraph text "Note #4:" including any contained
markup using <marker> to a child of <p> so that the result looks like:
<p><marker><ul><b>Note</b> <i>#4</i>:</ul></marker>
<ul>Don't tumble dry your pet</ul>.</p>
SOLUTION:
Same problems as in EXAMPLE 2, but additionally note that the <ul>
element must be split/duplicated so that <marker> can be a child of <p>,
yet retain the full formatting info in the form of the contained element
structure.
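A rough sketch of one possible direction for EXAMPLES 2 and 3, not a
definitive answer: match the pattern against the string value of <p> only
to learn how many characters the marker covers, then walk the children
twice, once keeping everything before that offset and once keeping
everything after it. Elements that straddle the offset, such as the <ul>
in EXAMPLE 3, are emitted in both passes and so end up split. The
namespace urn:x-example:split, the function f:start, the modes "head" and
"tail" and the tunnel parameter "cut" are all made-up names for this
illustration. The sketch assumes that the marker is always a prefix of the
paragraph, that <p> elements are not nested, and that elements without any
text (which it simply keeps in the tail) do not occur inside the marker.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xs="http://www.w3.org/2001/XMLSchema"
                xmlns:f="urn:x-example:split"
                exclude-result-prefixes="xs f"
                version="2.0">

  <!-- Hypothetical helper: number of characters that precede text node $t
       inside its nearest <p> ancestor. -->
  <xsl:function name="f:start" as="xs:integer">
    <xsl:param name="t" as="text()"/>
    <xsl:variable name="p" select="$t/ancestor::p[1]"/>
    <xsl:sequence select="sum($p//text()[. &lt;&lt; $t]/string-length(.))"/>
  </xsl:function>

  <!-- Identity template for everything outside the special handling. -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="p">
    <!-- Length of the marker text, or empty if the pattern does not match. -->
    <xsl:variable name="len" as="xs:integer*">
      <xsl:analyze-string select="string(.)" regex="^Note\s#\d+:">
        <xsl:matching-substring>
          <xsl:sequence select="string-length(.)"/>
        </xsl:matching-substring>
      </xsl:analyze-string>
    </xsl:variable>
    <xsl:copy>
      <xsl:copy-of select="@*"/>
      <xsl:choose>
        <xsl:when test="exists($len)">
          <marker>
            <xsl:apply-templates mode="head">
              <xsl:with-param name="cut" select="$len[1]" tunnel="yes"/>
            </xsl:apply-templates>
          </marker>
          <xsl:apply-templates mode="tail">
            <xsl:with-param name="cut" select="$len[1]" tunnel="yes"/>
          </xsl:apply-templates>
        </xsl:when>
        <xsl:otherwise>
          <xsl:copy-of select="node()"/>
        </xsl:otherwise>
      </xsl:choose>
    </xsl:copy>
  </xsl:template>

  <!-- Head pass: keep an element only if some of its text starts before the cut. -->
  <xsl:template match="*" mode="head">
    <xsl:param name="cut" as="xs:integer" tunnel="yes"/>
    <xsl:if test="some $t in .//text() satisfies f:start($t) lt $cut">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates mode="head"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <!-- Head pass: keep the part of the text node that lies before the cut. -->
  <xsl:template match="text()" mode="head">
    <xsl:param name="cut" as="xs:integer" tunnel="yes"/>
    <xsl:value-of select="substring(., 1, $cut - f:start(.))"/>
  </xsl:template>

  <!-- Tail pass: keep an element if it has no text at all (simplification)
       or if some of its text ends after the cut. -->
  <xsl:template match="*" mode="tail">
    <xsl:param name="cut" as="xs:integer" tunnel="yes"/>
    <xsl:if test="empty(.//text()) or
                  (some $t in .//text()
                   satisfies f:start($t) + string-length($t) gt $cut)">
      <xsl:copy>
        <xsl:copy-of select="@*"/>
        <xsl:apply-templates mode="tail"/>
      </xsl:copy>
    </xsl:if>
  </xsl:template>

  <!-- Tail pass: keep the part of the text node that lies after the cut. -->
  <xsl:template match="text()" mode="tail">
    <xsl:param name="cut" as="xs:integer" tunnel="yes"/>
    <xsl:value-of select="substring(., $cut - f:start(.) + 1)"/>
  </xsl:template>

</xsl:stylesheet>
```

Computing the offset of every text node from scratch is quadratic in the
number of text nodes per paragraph, which is probably tolerable for short
paragraphs but not elegant. Unlike the EXAMPLE 1 stylesheet, this sketch
does not normalize whitespace, so the space after the colon stays at the
start of the remaining text.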
Is there a certain pattern for tackling this kind of problem in
XSLT, or is the language just not the tool of choice for this kind of
transformation?
-Christian