I'm cleaning up HTML exported from Word documents. First, I run Tidy against
the exported HTML to convert it to well-formed XHTML, then I transform it
with my own XSLT stylesheet.
I need to convert (this is slightly simplified):
<p>Some paragraph that does not begin with a span element whose first
character is a hyphen.</p>
<p><span>-</span>List item</p>
<p><span>-</span>List item</p>
<p><span>-</span>List item</p>
<p><span>-</span>List item</p>
<p>Some paragraph that does not begin with a span element whose first
character is a hyphen.</p>
<p><span>-</span>List item</p>
<p><span>-</span>List item</p>
<p><span>-</span>List item</p>
<p><span>-</span>List item</p>
<p>Some paragraph that does not begin with a span element whose first
character is a hyphen.</p>
into this:
<p>Some paragraph that does not begin with a span element whose first
character is a hyphen.</p>
<ul>
<li>List item</li>
<li>List item</li>
<li>List item</li>
<li>List item</li>
</ul>
<p>Some paragraph that does not begin with a span element whose first
character is a hyphen.</p>
<ul>
<li>List item</li>
<li>List item</li>
<li>List item</li>
<li>List item</li>
</ul>
<p>Some paragraph that does not begin with a span element whose first
character is a hyphen.</p>
I'm afraid my attempts at this have been truly pitiful:
<xsl:template match="p[starts-with(span[1], '-')]">
  <xsl:if
      test="not(starts-with(preceding-sibling::p[1]/span[1], '-'))">
    <xsl:text>&lt;ul&gt;</xsl:text>
  </xsl:if>
  <li>
    <xsl:apply-templates />
  </li>
  <xsl:if
      test="not(starts-with(following-sibling::p[1]/span[1], '-'))">
    <xsl:text>&lt;/ul&gt;</xsl:text>
  </xsl:if>
</xsl:template>
I know, I know: those <xsl:text> bits wrapping the <ul> tags are a joke; they
don't produce the <ul> parent element.
I've tried writing a template that uses <xsl:for-each>, but that ends up
grabbing all following-sibling <p> elements that begin with <span>-</span>.
I don't know how to stop the <xsl:for-each> at the first non-matching <p>
element.
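The direction I suspect is needed is sibling recursion rather than
<xsl:for-each>: open a <ul> at the first hyphen paragraph of each run, then
step to the immediately following sibling only while it still matches. The
sketch below is unverified against real Word output and rests on assumptions:
XSLT 1.0, elements in no namespace (as in the simplified sample above), a mode
name "item" that is my own invention, and other templates (or the built-in
rules) handling whatever follows the span.

  <!-- Default mode: only the FIRST hyphen paragraph of a run emits output.
       Later paragraphs in the run match here too but produce nothing,
       because they are pulled in by the "item" mode below. -->
  <xsl:template match="p[starts-with(span[1], '-')]">
    <xsl:if test="not(preceding-sibling::*[1][self::p]
                      [starts-with(span[1], '-')])">
      <ul>
        <xsl:apply-templates select="." mode="item"/>
      </ul>
    </xsl:if>
  </xsl:template>

  <!-- "item" mode (hypothetical name): emit one <li>, then recurse into the
       immediately following sibling element if it is also a hyphen
       paragraph. The recursion stops at the first non-matching sibling. -->
  <xsl:template match="p" mode="item">
    <li>
      <!-- Everything after the first span, so the hyphen is dropped. -->
      <xsl:apply-templates select="span[1]/following-sibling::node()"/>
    </li>
    <xsl:apply-templates mode="item"
        select="following-sibling::*[1][self::p]
                [starts-with(span[1], '-')]"/>
  </xsl:template>

Note that this tests preceding-sibling::*[1] rather than
preceding-sibling::p[1], so any intervening non-<p> element would end the
run. Whether that is the behavior I want for real documents, I'm not sure.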
Can anyone throw me an XSLT pearl?
XSL-List info and archive: http://www.mulberrytech.com/xsl/xsl-list