In general the problem is insoluble, because given the sequence
g - first level
h - first level
1 - second level
2 - second level
A - third level
B - third level
i - first level? or fourth level?
the (i) could be either first level or fourth level.
Your input data structure is badly designed, because it gives no way of
distinguishing these two cases.
A reasonable way to proceed if you're stuck with this input would be to
assume that (i) is first level if and only if the previous first level
number is (h). But to do this you'll need to use sibling recursion rather
than pattern-based grouping.
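
To make that rule concrete, here is a minimal sketch of the level-assignment step alone. It is written in Python rather than XSLT purely to show the decision logic, and it is not a drop-in replacement for the stylesheet. The names `LEVEL_PATTERNS` and `assign_levels` are hypothetical, and extending the rule from (i) to the other single-letter roman numerals (v) and (x) is an assumption. The `last_first` variable carried through the loop plays the role of the parameter that a sibling-recursive template would pass from one section to the next.

```python
import re

# Level patterns copied from $grouping_keys in the original stylesheet.
LEVEL_PATTERNS = [r"\([a-z]\)", r"\([1-9]\)", r"\([A-Z]\)", r"\([ivx]{1,4}\)"]


def assign_levels(pnums):
    """Return the nesting level (1-4) for each section number, in order.

    Assumption: a number that matches both the first-level and the
    fourth-level pattern, such as "(i)", is first level only when the most
    recent first-level number is its alphabetic predecessor, such as "(h)".
    """
    levels = []
    last_first = None  # most recent first-level number seen so far
    for pnum in pnums:
        candidates = [
            i + 1
            for i, pattern in enumerate(LEVEL_PATTERNS)
            if re.fullmatch(pattern, pnum)
        ]
        if not candidates:
            raise ValueError(f"unrecognised section number: {pnum!r}")
        if candidates == [1, 4]:
            # Ambiguous single letter: compare it with the previous
            # first-level letter. With no previous first-level number,
            # this sketch falls back to the fourth level.
            follows = (
                last_first is not None
                and ord(pnum[1]) == ord(last_first[1]) + 1
            )
            level = 1 if follows else 4
        else:
            level = candidates[0]
        if level == 1:
            last_first = pnum
        levels.append(level)
    return levels


# The sequence from the original question: (i) follows (a), so it is level 4.
print(assign_levels(["(a)", "(1)", "(A)", "(i)", "(ii)", "(B)", "(2)", "(A)"]))
# The sequence above: (i) follows (h), so it is level 1.
print(assign_levels(["(g)", "(h)", "(1)", "(2)", "(A)", "(B)", "(i)"]))
```

Once each section has a level, the nesting itself is mechanical: a section becomes a child of the nearest preceding section whose level is lower.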
Michael Kay
http://www.saxonica.com/
-----Original Message-----
From: James Sulak [mailto:jsulak(_at_)jonesmcclure(_dot_)com]
Sent: 10 October 2008 15:04
To: xsl-list(_at_)lists(_dot_)mulberrytech(_dot_)com
Subject: [xsl] Tricky XSLT 2.0 grouping problem
I have a tricky grouping problem that has me running into a
wall. I thought it might be a fun challenge to throw out
there. I'm attempting to group a flat list of <section>
elements into a hierarchy by matching each section's number
against different regular expressions. The list is assumed
to be in the correct order. I have it working (the code is
below) with one exception: roman numerals.
For example, the XML:
<body>
  <section><pnum>(a)</pnum><p>First-level section</p></section>
  <section><pnum>(1)</pnum><p>Second-level section</p></section>
  <section><pnum>(A)</pnum><p>Third-level section</p></section>
  <section><pnum>(i)</pnum><p>Fourth-level section</p></section>
  <section><pnum>(ii)</pnum><p>Fourth-level section</p></section>
  <section><pnum>(B)</pnum><p>Third-level section</p></section>
  <section><pnum>(2)</pnum><p>Second-level section</p></section>
  <section><pnum>(A)</pnum><p>Third-level section</p></section>
</body>
Should give the result:
<body>
  <section>
    <pnum>(a)</pnum><p>First-level section</p>
    <section>
      <pnum>(1)</pnum><p>Second-level section</p>
      <section>
        <pnum>(A)</pnum><p>Third-level section</p>
        <section><pnum>(i)</pnum><p>Fourth-level section</p></section>
        <section><pnum>(ii)</pnum><p>Fourth-level section</p></section>
      </section>
      <section>
        <pnum>(B)</pnum><p>Third-level section</p>
      </section>
    </section>
    <section>
      <pnum>(2)</pnum><p>Second-level section</p>
      <section><pnum>(A)</pnum><p>Third-level section</p></section>
    </section>
  </section>
</body>
The problem is that the number "(i)", which is supposed to be
a fourth-level section, is ambiguous with an "(i)" that would
be a first-level section. My transform ends up treating it
as a first-level section, and so gives the following
incorrect output:
<body>
  <section>
    <pnum>(a)</pnum><p>First-level section</p>
    <section>
      <pnum>(1)</pnum><p>Second-level section</p>
      <section>
        <pnum>(A)</pnum><p>Third-level section</p>
      </section>
    </section>
  </section>
  <section>
    <pnum>(i)</pnum><p>Fourth-level section</p>
    <section>
      <pnum>(ii)</pnum><p>Fourth-level section</p>
      <section>
        <pnum>(B)</pnum><p>Third-level section</p>
      </section>
    </section>
    <section>
      <pnum>(2)</pnum><p>Second-level section</p>
      <section>
        <pnum>(A)</pnum><p>Third-level section</p>
      </section>
    </section>
  </section>
</body>
I've included my current transform below. The grouping_keys
variable is a sequence of regex strings that match each
successive level of section nesting. Does anybody have an
alternate way of tackling this?
Thanks,
-James
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0">

  <xsl:variable name="grouping_keys" as="xs:string+"
      select="('\([a-z]\)', '\([1-9]\)',
               '\([A-Z]\)', '\([ivx]{1,4}\)')"/>

  <!-- Start the grouping here -->
  <xsl:template match="codebody">
    <codebody>
      <xsl:copy-of select="@*"/>
      <xsl:for-each-group select="*"
          group-starting-with="section[matches(pnum,
                               string($grouping_keys[1]))]">
        <xsl:apply-templates select="." mode="group">
          <xsl:with-param name="level" select="1"
              as="xs:integer"/>
        </xsl:apply-templates>
      </xsl:for-each-group>
    </codebody>
  </xsl:template>

  <!-- This template copies the current section and groups
       any "nested" sections -->
  <xsl:template match="section" mode="group">
    <xsl:param name="level" as="xs:integer"/>
    <section>
      <xsl:copy-of select="@*, *"/>
      <xsl:if test="$level &lt; count($grouping_keys)">
        <xsl:for-each-group select="current-group() except ."
            group-starting-with="section[matches(pnum,
                                 string($grouping_keys[$level + 1]))]">
          <xsl:apply-templates select="." mode="group">
            <xsl:with-param name="level"
                select="$level + 1"
                as="xs:integer"/>
          </xsl:apply-templates>
        </xsl:for-each-group>
      </xsl:if>
    </section>
  </xsl:template>

  <xsl:template match="element()" mode="#all">
    <xsl:copy>
      <xsl:apply-templates select="@*,node()" mode="#current"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template
      match="attribute()|text()|comment()|processing-instruction()"
      mode="#all">
    <xsl:copy/>
  </xsl:template>

</xsl:stylesheet>
--~------------------------------------------------------------------
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list
To unsubscribe, go to: http://lists.mulberrytech.com/xsl-list/
or e-mail:
<mailto:xsl-list-unsubscribe(_at_)lists(_dot_)mulberrytech(_dot_)com>
--~--