Israel Viente wrote:
My input is something like the following:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p dir="rtl">
<span class="chapter">line1</span>
</p>
<p dir="rtl"> <br />
<span class="regular">line3.</span>
<span class="italic">line4</span>
<span class="regular">line5."</span>
</p>
<p dir="rtl"> <br />
<span class="regular">line6.</span>
<br />
<span class="regular">line7</span>
</p>
<p dir="rtl"> <br />
<span class="regular">line8.</span>
<span class="regular">line9.</span>
</p>
</body>
</html>
The resulting output should be:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN"
"http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
<html xmlns="http://www.w3.org/1999/xhtml">
<body>
<p dir="rtl">
<span class="chapter">line1</span>
</p>
<p dir="rtl"> <br />
<span class="regular">line3.</span>
<span class="italic">line4</span>
<span class="regular">line5."</span>
</p>
<p dir="rtl"> <br />
<span class="regular">line6.</span>
<br />
<span class="regular">line7</span>
<span class="regular">line8.</span>
<span class="regular">line9.</span>
</p>
</body>
</html>
For every p that contains span elements whose class is not 'chapter',
check that the text of the last such span ends with one of the
characters . ? " ! (a paragraph-ending character).
If it does, copy the p to the output as is.
Otherwise, move the span elements from the next p into the current one
and remove the next p completely.
Here is an attempt at solving that with XSLT 2.0:
<xsl:stylesheet
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xpath-default-namespace="http://www.w3.org/1999/xhtml"
version="2.0">
<xsl:output method="xhtml"/>
<xsl:template match="@* | node()">
<xsl:copy>
<xsl:apply-templates select="@* | node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="p[span[@class ne 'chapter'] and
not(matches(span[@class ne 'chapter'][last()], '[.?&quot;!]$'))]">
<xsl:copy>
<xsl:apply-templates select="@* | node() |
following-sibling::p[1]/node()"/>
</xsl:copy>
</xsl:template>
<xsl:template match="p[preceding-sibling::p[1][span[@class ne
'chapter'] and not(matches(span[@class ne 'chapter'][last()],
'[.?&quot;!]$'))]]"/>
</xsl:stylesheet>
For the posted input using Saxon 9 it produces the described output but
I have not tested with other inputs.
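Since the stylesheet has only been run against the posted input, a
procedural restatement of the rule may help when cross-checking other
inputs. The following is a minimal sketch in Python using only the
standard library. The names is_complete and merge_paragraphs are made up
for illustration. It assumes that all p elements are direct children of
body and that a paragraph is checked again after each merge; the original
description states neither. It is not a port of the stylesheet: it moves
only the span elements, as the description asks.

```python
import xml.etree.ElementTree as ET

XHTML = "http://www.w3.org/1999/xhtml"
P = f"{{{XHTML}}}p"
SPAN = f"{{{XHTML}}}span"
ENDINGS = (".", "?", '"', "!")  # paragraph-ending characters from the description


def is_complete(p):
    """True if p has no non-chapter span, or its last one ends a paragraph."""
    spans = [s for s in p.findall(SPAN) if s.get("class") != "chapter"]
    if not spans:
        return True
    return "".join(spans[-1].itertext()).endswith(ENDINGS)


def merge_paragraphs(body):
    """Merge each unfinished p with the p that follows it (modifies body)."""
    paragraphs = body.findall(P)
    i = 0
    while i < len(paragraphs) - 1:
        current, nxt = paragraphs[i], paragraphs[i + 1]
        if is_complete(current):
            i += 1
            continue
        # Assumption: only span children are moved; br elements are dropped.
        for span in nxt.findall(SPAN):
            current.append(span)
        body.remove(nxt)
        del paragraphs[i + 1]  # stay on i so the merged p is checked again
    return body


sample = """<html xmlns="http://www.w3.org/1999/xhtml"><body>
<p dir="rtl"><span class="chapter">line1</span></p>
<p dir="rtl"><br /><span class="regular">line6.</span><br /><span class="regular">line7</span></p>
<p dir="rtl"><br /><span class="regular">line8.</span><span class="regular">line9.</span></p>
</body></html>"""

root = ET.fromstring(sample)
body = root.find(f"{{{XHTML}}}body")
merge_paragraphs(body)
for p in body.findall(P):
    print([s.text for s in p.findall(SPAN)])
```

Running the stylesheet and this sketch over the same file and comparing
the span text per paragraph is one way to spot inputs where the two
disagree, for example several unfinished paragraphs in a row.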
--
Martin Honnen
http://msmvps.com/blogs/martin_honnen/