Evan Leibovitch wrote:
I am working with an HTML input file, and I'd like to group things
better by sections (ultimately, with the intent of using
xsl:result-document to create a new file for each section).
What I have is not uncommon:
<h1 class="section">Section Name</h1>
<h1 class="headline">Headline name</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 2</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 3</h1>
[... assorted HTML marked up text ...]
<h1 class="section">Section 2</h1>
<h1 class="headline">Headline 4</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 5</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 6</h1>
[... assorted HTML marked up text ...]
and so on.
What I'd like to end up with, if possible, is:
<section id="Section Name">
<headline id="Headline name">
[...marked up text...]
</headline>
<headline id="Headline 2">
[...marked up text...]
</headline>
<headline id="Headline 3">
[...marked up text...]
</headline>
</section>
XSLT 2.0's xsl:for-each-group with group-starting-with can do that, for example:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
version="2.0">
<xsl:output method="xml" indent="yes" version="1.0"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@*, node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="body">
<xsl:copy>
<xsl:for-each-group select="node()"
group-starting-with="h1[@class = 'section']">
<xsl:if test="self::h1[@class = 'section']">
<section id="{.}">
<xsl:for-each-group select="current-group() except ."
group-starting-with="h1[@class = 'headline']">
<xsl:if test="self::h1[@class = 'headline']">
<headline id="{.}">
<xsl:apply-templates select="current-group() except ."/>
</headline>
</xsl:if>
</xsl:for-each-group>
</section>
</xsl:if>
</xsl:for-each-group>
</xsl:copy>
</xsl:template>
</xsl:stylesheet>
will turn
<body>
<h1 class="section">Section Name</h1>
<h1 class="headline">Headline name</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 2</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 3</h1>
[... assorted HTML marked up text ...]
<h1 class="section">Section 2</h1>
<h1 class="headline">Headline 4</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 5</h1>
[... assorted HTML marked up text ...]
<h1 class="headline">Headline 6</h1>
[... assorted HTML marked up text ...]
</body>
into
<body>
<section id="Section Name">
<headline id="Headline name">
[... assorted HTML marked up text ...]
</headline>
<headline id="Headline 2">
[... assorted HTML marked up text ...]
</headline>
<headline id="Headline 3">
[... assorted HTML marked up text ...]
</headline>
</section>
<section id="Section 2">
<headline id="Headline 4">
[... assorted HTML marked up text ...]
</headline>
<headline id="Headline 5">
[... assorted HTML marked up text ...]
</headline>
<headline id="Headline 6">
[... assorted HTML marked up text ...]
</headline>
</section>
</body>
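If the end goal is one file per section, the same grouping can wrap each section in xsl:result-document. The following is only a sketch under stated assumptions: the file name pattern section-N.xml is made up for illustration and does not come from the original question, and it needs an XSLT 2.0 processor. The number is taken from the count of preceding section headings rather than from position(), because a leading whitespace text node in body would otherwise form a group of its own and shift the numbering.

```xml
<xsl:stylesheet
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    version="2.0">

  <xsl:output method="xml" indent="yes" version="1.0"/>

  <!-- Identity template: copy everything not handled below. -->
  <xsl:template match="@* | node()">
    <xsl:copy>
      <xsl:apply-templates select="@*, node()"/>
    </xsl:copy>
  </xsl:template>

  <xsl:template match="body">
    <xsl:copy>
      <xsl:for-each-group select="node()"
          group-starting-with="h1[@class = 'section']">
        <!-- Skip any group that does not begin with a section heading,
             e.g. whitespace before the first h1. -->
        <xsl:if test="self::h1[@class = 'section']">
          <!-- Assumed file name pattern: section-1.xml, section-2.xml, ... -->
          <xsl:result-document
              href="section-{count(preceding-sibling::h1[@class = 'section']) + 1}.xml">
            <section id="{.}">
              <xsl:for-each-group select="current-group() except ."
                  group-starting-with="h1[@class = 'headline']">
                <xsl:if test="self::h1[@class = 'headline']">
                  <headline id="{.}">
                    <xsl:apply-templates select="current-group() except ."/>
                  </headline>
                </xsl:if>
              </xsl:for-each-group>
            </section>
          </xsl:result-document>
        </xsl:if>
      </xsl:for-each-group>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

With this variant the primary result keeps only an empty body element, and each section element ends up as the root of its own result document.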
--
Martin Honnen
http://msmvps.com/blogs/martin_honnen/